
Updated 4 months ago

Why Refining?

At a glance

The community member has set the input size to 3500 and the chunk size to 750. When querying the index with top_k=4, the system always performs refining. The community members discuss possible reasons for this behavior, including the use of top_k>1 and the chunk size settings. One community member suggests removing the chunk_size_limit from the prompt helper to avoid the refining behavior, which the original poster confirms resolves the issue.

I have a case in which I have set the input size to 3500, and the chunk size is 750. When I query my index with top_k=4, it always does refining. Any idea why this happens?
Note: the QA prompt is 93 tokens and the query is around 50 tokens
12 comments
If you are using top_k > 1, it uses the create-and-refine synthesis approach to generate the answer/response. So refining is expected, right?
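For example (a minimal sketch, assuming the llama_index ~0.6 API used later in this thread; the question string is a placeholder):

# With similarity_top_k > 1, the query engine retrieves several nodes and
# synthesizes a response by answering from the first chunk, then refining
# that answer with any chunk that does not fit in the same LLM call.
query_engine = index.as_query_engine(similarity_top_k=4)
response = query_engine.query("your question here")  # placeholder query
print(response)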
no, as the total prompt size didn't exceed 3500 (QA prompt 93 + query ~50 + 4 × 750-token chunks ≈ 3143 tokens)
Where did you set the chunk size?

It might help if you are able to package a minimal example 🙂
I set the chunk size in the prompt helper, the service context, and the node parser. All three have the same chunk size value.
Do you have an example you can share to reproduce this? I would like to step through the code with a debugger to investigate 🙂 Just need some sample docs+code, if possible?
I cannot share the documents as they're for our clients, but I can share the way I set up the index. Note that the LLM model and embedding model are custom.

# imports for llama_index ~0.6 (assumed from the APIs used below)
from llama_index import GPTVectorStoreIndex, PromptHelper, ServiceContext, StorageContext
from llama_index.langchain_helpers.text_splitter import TokenTextSplitter
from llama_index.node_parser import SimpleNodeParser
from llama_index.vector_stores import ChromaVectorStore

prompt_helper = PromptHelper(
    max_input_size=3500,
    chunk_size_limit=750,
    num_output=256,
    max_chunk_overlap=75,
)
node_parser = SimpleNodeParser(
    text_splitter=TokenTextSplitter(chunk_size=750, chunk_overlap=75)
)
service_context = ServiceContext.from_defaults(
    llm_predictor=self.llm_predictor,   # custom LLM
    prompt_helper=prompt_helper,
    embed_model=self.embedding_model,   # custom embedding model
    chunk_size_limit=75,
    node_parser=node_parser,
)
nodes = node_parser.get_nodes_from_documents(documents)
storage_context = StorageContext.from_defaults(
    vector_store=ChromaVectorStore(chroma_collection=chroma_collection)
)
storage_context.docstore.add_documents(nodes)
self.index = GPTVectorStoreIndex(
    nodes=nodes,
    storage_context=storage_context,
    service_context=service_context,
)
I will run with those settings and see if I can reproduce.

I will step over the code line by line to confirm what's happening lol
okay looking forward to your response
@zainab to avoid refine here, try removing chunk_size_limit from the prompt helper 🙂
This limits how big each chunk can be when calling the LLM. By setting it to the same value as the chunk size in the node parser, you make at least one LLM call per node, so with top_k=4 you get one answer call plus refine calls for the remaining chunks.
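Concretely, something like this (a sketch of the suggested fix, keeping the other values from your snippet unchanged):

# Without chunk_size_limit, the prompt helper can pack all top_k chunks
# into a single LLM call when they fit under max_input_size, so no refine
# step is needed.
prompt_helper = PromptHelper(
    max_input_size=3500,
    num_output=256,
    max_chunk_overlap=75,
)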
Many thanks, it's now working.