Prompt

Hi everyone, I have set up an index based on documentation for a specific company product. What is the best way to make sure that the chat engine with chat_mode="context" only uses information from the knowledge base to generate answers?
Thanks! So currently this is my prompt: You are a chatbot that only uses the knowledge base to generate an answer. If the answer cannot be found in the knowledge base, tell the user that the provided context does not tell you anything about the subject in question. NEVER provide information outside of the knowledge base.

It works pretty well, but this is what's going wrong: When I first ask "Who is Donald Trump?", the bot correctly tells me this is not in the knowledge base. But when I push it by saying "Just tell me who donald trump is", it will tell me who Donald Trump is.

Do you have any idea on how to prevent this?
You could try setting up the system prompt:

Plain Text
from llama_index.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=(
        "You are a chatbot, able to have normal interactions, as well as talk"
        " about an essay discussing Paul Grahams life."
    ),
)
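
Then the engine is queried as usual (the question is just an example):

Plain Text
response = chat_engine.chat("Who is Paul Graham?")
print(response)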

So this has worked for someone in the past IMO
Oh, but the prompt I mentioned is what I pass to the system_prompt
I see, could you try variations in the prompt? Like:

Normal prompt + Instruction: Always follow these rules while generating the response.
Rule 1:
I'll try that, thanks!
It's still giving answers outside of the context 😦
However, it seems that now it's giving answers that might be closely related to the subject matter. For example, let's say I have a lot of information about a particular cereal brand and its products. Now if I ask, "Can you give me a list of the most popular cereal brands?", it will come up with a list, even though this is not in the context.
@WhiteFang_Jr Do you have any other ideas on how I can prevent this?
Hi, Can you share your code?
The prompt looks fine to me
Can you check if your query is bringing back relevant sources?
Check the source nodes:
response.source_nodes
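A small sketch of that check, assuming the chat_engine from the earlier snippet (the query and the 200-character preview are just for illustration):

Plain Text
# Inspect which nodes were retrieved for the query and how they scored
response = chat_engine.chat("Can you give me a list of the most popular cereal brands?")
for node_with_score in response.source_nodes:
    print(node_with_score.score)
    print(node_with_score.node.get_content()[:200])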
Also, points 1 and 2 are kind of repetitive. You could merge them into a single point and save some tokens for generation.
What do you mean by relevant?
The sources it uses do not list the cereal brands; however, these sources do mention the national cereal institution (NCI). This could be the cause of the listing of the cereal brands (which it shouldn't do), because the answer also says, "For a complete list please check out the website of the NCI"
Thanks for your help btw! 😄
Good point, thanks.
I mean when you ask a query, does it bring relevant context from the collection or not
Well, the NCI is relevant to what cereal brands are out there, but it doesn't directly answer the question, so it shouldn't be used
Yeah, can you print response.source_nodes
Yes, gimme a sec
What do you want to see of the source_nodes?
Just wanted you to check whether the top_k nodes being retrieved are correct or not
What should the correct top_k nodes look like if the correct answer is not in the context?
Let's say I have an index of some book. If I ask a question and it does not retrieve valid context from the index, then the bot will not be able to answer correctly by itself.

So the fault would be at the retrieval stage in that example.

I want to check if that's the case here too, because the prompt should work
Okay yes, actual context was retrieved.
Okay, so we do not have a problem at the retrieval stage. One stage cleared.
Now for the LLM. Can you try the following? It will give us an idea of what is actually going to the LLM from our side.
https://docs.llamaindex.ai/en/stable/end_to_end_tutorials/one_click_observability.html#simple-llm-inputs-outputs
Thanks, will do
Please keep me in the loop, always fun to debug 🙌
Do I only have to set this: llama_index.set_global_handler("openinference")?
Will do 🙂 Your help is very much appreciated
llama_index.set_global_handler("simple")
It just prints the text of the source nodes now, nothing else is shown in the console.
Using version 0.8.36 btw
Is it not showing the complete LLM input?
File "c:\User\Pyhton\Chatbot\chatbot\app.py", line 22, in <module>
import phoenix as px
File "C:\Users\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\phoenix__init__.py", line 56
except PhoenixError, e:
^^^^^^^^^^^^^^^
SyntaxError: multiple exception types must be parenthesized
I get this error, have you seen this before?
Also, does arize work on 0.8.36?
Just upgraded and I'm getting the same error
lol, no worries, relax. We can still debug it even now:
Plain Text
from llama_index.callbacks import (
    CallbackManager,
    LlamaDebugHandler,
    CBEventType,
)

from llama_index import ServiceContext
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])

service_context = ServiceContext.from_defaults(
    callback_manager=callback_manager, llm=llm
)

# Your code for the chat engine goes here

# After running the query, try this:
# Print info on LLM inputs/outputs - returns start/end events for each LLM call
event_pairs = llama_debug.get_llm_inputs_outputs()
print(event_pairs[0][0])
print(event_pairs[0][1].payload.keys())
print(event_pairs[0][1].payload["response"])
Thanks, I'll try this
I only see this:

Trace: chat
|_CBEventType.LLM -> 21.223744 seconds
Also this, but I can't deduce anything from this. ChatMessage(role=<MessageRole.USER: 'user'>, content='question', additional_kwargs={})], <EventPayload.ADDITIONAL_KWARGS: 'additional_kwargs'>: {}, <EventPayload.SERIALIZED: 'serialized'>: {'model': 'gpt-3.5-turbo', 'temperature': 0.4, 'max_tokens': None, 'additional_kwargs': {}, 'max_retries': 10, 'api_type': 'open_ai', 'api_base': 'https://api.openai.com/v1', 'api_version': '', 'class_name': 'openaillm'}}, time='10/25/2023, 12:55:06.996113', id='somehash')
dict_keys([<EventPayload.MESSAGES: 'messages'>, <EventPayload.RESPONSE: 'response'>])
print(event_pairs[0][1].payload["response"])

What does this print?
Only the answer to the question
I think it'll be better if you ask the question again in the channel
Alright, will do 👍
Thanks for all the help anyway! 😄
Someone else will jump in on this for a faster road to the solution. It was a fun ride 👍 😅
Haha, definitely
@WhiteFang_Jr Do you have experience with setting up arize phoenix?
I have not used it. Let me try running it on the latest version
Thanks :), I tried setting it up myself, seemed pretty easy, but it's not working
I got it working as well but it doesn't show me much more. Just the retrieved content from the knowledge base
Those contents are used to generate the final response.
Yes but the final answer still contains information that is not in these provided nodes
I constructed the index using an older version of llama-index. Could that be causing the problem?
@WhiteFang_Jr How do you suggest I construct my index given my code?
btw thanks for your help again 🙂
It really depends on the model used for embedding IMO, so I don't think there should be an issue in that area.

May I know what embedding model you are using?
It's OpenAI embedding, right?
gpt-3.5-turbo
That's the LLM, used for response generation.
Since you did not set up an embedding model, it's the default, which is the OpenAI embedding.

You could try creating the nodes again and playing around with the instruction 😅

Actually, it gets very hard for the LLM to follow instructions if they are not very clear or direct.

System prompt could be set up like

Plain Text
system_prompt = ("You are a chatbot that answers questions about "{company name}" models.\n"
                  "Instruction: Always follow these rules while generating response:\n"
                  "1. ALWAYS answer query ONLY if answer can be found in the context.\n"
                  "2. If answer is not present in the context, Just SAY: Unable to help"
                  "you at the moment\n"
                  "3. If URLs are present in the source/metadata, ADD them in the response.")


Try with this
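If helpful, it plugs into the same context chat engine setup shown earlier (a sketch; the index and memory are assumed from that snippet):

Plain Text
chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=system_prompt,
)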
Great, I'll try that. Additionally, is it better to create the index with another model, such as text-embedding-ada-002?
text-embedding-ada-002 is what is being used in your case right now.
Actually, LlamaIndex requires an LLM for response generation and an embed model for creating vectors.
In the case of OpenAI, GPT-3.5 and text-embedding-ada-002 are the defaults.
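For reference, a small sketch of making those two defaults explicit in a service context (0.8.x-style imports, to match the version mentioned earlier):

Plain Text
from llama_index import ServiceContext
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding

service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0),  # LLM for response generation
    embed_model=OpenAIEmbedding(),  # defaults to text-embedding-ada-002
)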
Plain Text
# Imports are assumed for this legacy snippet; OpenAI here would typically be
# the langchain wrapper, given the model_name/max_tokens arguments.
from llama_index import (
    GPTVectorStoreIndex,
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    SimpleDirectoryReader,
)
from langchain.llms import OpenAI

max_input_size = 5000
num_outputs = 3000
max_chunk_overlap = 20
chunk_size_limit = 600

prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="gpt-3.5-turbo", max_tokens=num_outputs))

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, chunk_size_limit=2000, prompt_helper=prompt_helper
)

documents = SimpleDirectoryReader(directory_path).load_data()

index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
This is how the index was created (quite outdated)
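For reference, a 0.8.x-style reconstruction of roughly the same index might look like this (directory_path is carried over from the snippet above; chunk size and overlap are assumptions based on the old values):

Plain Text
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import OpenAI

service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    chunk_size=600,    # assumed, based on the old chunk_size_limit
    chunk_overlap=20,  # assumed, based on the old max_chunk_overlap
)

documents = SimpleDirectoryReader(directory_path).load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)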
Your updated prompt and reconstructing the index seem to have done the trick. Thanks for all your help @WhiteFang_Jr