Hey!
Since you're not passing a top_k value while creating your engine, it picks the default, which is 2.
That means it will only use 2 nodes to form the response. To increase this, set similarity_top_k to a higher value, like this:
chat_engine = index.as_chat_engine(..., similarity_top_k=10)
Now it will use 10 nodes to form the answer.
Sure, that helps, but when I set the similarity_top_k value to more than 3, I get the error below:
Traceback (most recent call last):
File "C:\Users\ajinkya\Desktop\ISP_Qual_Pro\isp_qual_pro_api\src\nlp\gpt.py", line 58, in ask_engine_que
response = engine.chat(question)
File "C:\Users\ajinkya\Desktop\ISP_Qual_Pro\isp_qual_pro_api\env\lib\site-packages\llama_index\callbacks\utils.py", line 41, in wrapper
return func(self, *args, **kwargs)
File "C:\Users\ajinkya\Desktop\ISP_Qual_Pro\isp_qual_pro_api\env\lib\site-packages\llama_index\chat_engine\context.py", line 162, in chat
all_messages = prefix_messages + self._memory.get(
File "C:\Users\ajinkya\Desktop\ISP_Qual_Pro\isp_qual_pro_api\env\lib\site-packages\llama_index\memory\chat_memory_buffer.py", line 110, in get
raise ValueError("Initial token count exceeds token limit")
ValueError: Initial token count exceeds token limit
That's because the default token limit is being breached here.
Try doing this:
from llama_index.core.memory import ChatMemoryBuffer

# This will increase the default memory limit
chat_memory = ChatMemoryBuffer.from_defaults(token_limit=20000)
chat_engine = index.as_chat_engine(..., similarity_top_k=10, memory=chat_memory)
Hello @WhiteFang_Jr
Thank you for your kind assistance earlier. I attempted to increase the similarity_top_k value up to 100; however, the response remains unchanged: partial, brief, and not considering all the available information.
For context, the dataset size I am working with is approximately 1.5 million characters. Below is the sample code I am using:
def create_engine_from_index(index, logger=log, req_id: str = "", sys_prompt=system_prompt):
    try:
        # engine = index.as_chat_engine('context', system_prompt=sys_prompt, similarity_top_k=3)
        chat_memory = ChatMemoryBuffer.from_defaults(token_limit=20000)
        engine = index.as_chat_engine(similarity_top_k=100, system_prompt=sys_prompt, memory=chat_memory)
        return engine
    except Exception as ex:
        logger.exception(f"Exception occurred during creating chat engine from index: {ex} {req_id}")
        return None
Hey!
You need to check the source nodes that the LLM is using to create the final response.
See if the required nodes are being retrieved in the first place.
You can find the source nodes in the response object: print(response.source_nodes)
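As a minimal sketch of what inspecting the retrieved nodes can look like: in llama_index, response.source_nodes is a list of scored node objects, but the SimpleNode class and sample data below are stand-ins so the snippet runs without an index.

```python
from dataclasses import dataclass

@dataclass
class SimpleNode:
    """Stand-in for a retrieved node (real ones come from llama_index)."""
    text: str
    score: float

def summarize_sources(source_nodes):
    """Return one summary line per retrieved node: score plus a text preview."""
    lines = []
    for i, node in enumerate(source_nodes):
        lines.append(f"[{i}] score={node.score:.2f} text={node.text[:40]!r}")
    return lines

# Illustrative data; in practice you would pass response.source_nodes here.
nodes = [SimpleNode("Participant 1 bought a watch last year...", 0.91),
         SimpleNode("Participant 7 described the store visit...", 0.84)]
for line in summarize_sources(nodes):
    print(line)
```

Scanning this output tells you whether the chunks mentioning the participants you care about were retrieved at all, before you blame the LLM's summarization.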
OK, but is it possible to include all the nodes created for 1.5 million characters of data?
No, the data size is 1.5 million characters, and I don't know exactly how many nodes it will form. But whatever the number, will it be possible to include all of them? It might go beyond 500-1000 nodes, and as I increase the similarity_top_k value, the processing time increases.
Nodes are formed based on chunk size; the default is 1024 tokens. You can check the total number of formed nodes like this:
print(len(index.docstore.docs))
Yeah, time will definitely increase as the top_k value increases.
How do I set the chunk size of a node to the max while creating the index?
There is no max size; it's up to you what you want to set.
But I wouldn't recommend setting it very large.
You need to factor in the model's context size when setting the chunk size.
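The trade-off can be sketched as a simple token budget. This is an illustrative check, not a llama_index function; the prompt, memory, and response token numbers below are assumptions you would replace with your own.

```python
def fits_in_context(context_window: int, chunk_size: int, top_k: int,
                    system_prompt_tokens: int, memory_tokens: int,
                    response_tokens: int = 512) -> bool:
    """Check that retrieved chunks plus prompt, memory, and the expected
    response all fit inside the model's context window."""
    needed = (top_k * chunk_size + system_prompt_tokens
              + memory_tokens + response_tokens)
    return needed <= context_window

# Example budget for a 16,385-token window (e.g. gpt-3.5-turbo-16k),
# assuming a 200-token system prompt and 2,000 tokens of chat memory:
print(fits_in_context(16_385, 1024, 3, 200, 2_000))   # True: 3 chunks fit
print(fits_in_context(16_385, 1024, 14, 200, 2_000))  # False: 14 chunks overflow
```

This is why cranking similarity_top_k to 100 with 1024-token chunks can't work: 100 chunks alone would need over 100k tokens of context before the prompt and memory are even counted.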
Hello @WhiteFang_Jr, I have included all the nodes created from the 1.5-million-character data, which was around 430-450 nodes, but I still get a partial response. For example, I have 20-24 respondents in the data and I'm getting a response for only 3-4 of them.
Hey!
Not sure what you mean by "i have got response partially example i have 20-24 respondents in the data and I'm getting response for only 3-4 responded".
Are you getting 3-4 nodes in the response even after setting 20 as the limit?
By "I have got a partial response; for example, I have 20-24 respondents in the data and I'm getting a response for only 3-4 of them",
I meant that I have some conversational data (a transcription of meeting audio). In that conversation, the participants discuss their watch purchasing experience; almost all of them talked about their experience with purchasing a watch.
When I create a chat engine on that data and ask the query "tell me about the watch purchasing experience for all the participants in the conversation", it should respond for all the participants, but the response I get covers only 2-3 of them (the actual count is 23-24; the data for each participant can be spread across all the created nodes).
Hello, any suggestions please?