
Length

Hi, may I know what the maximum prompt size is between the text_qa_template and refine_template mentioned for the compact response mode here? https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/#:~:text=Details%3A%20stuff,between%20text%20chunks).

I am retrieving a long context, and it looks like my prompt is cut at around 4000 tokens, with the remaining context handled in the refine step. May I know why?
Plain Text
Details: stuff as many text (concatenated/packed from the retrieved chunks) that can fit within the context window (considering the maximum prompt size between text_qa_template and refine_template). If the text is too long to fit in one prompt, it is split in as many parts as needed (using a TokenTextSplitter and thus allowing some overlap between text chunks).
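
For reference, a minimal sketch of inspecting the two numbers that set this packing budget, assuming the current llama_index.core API and an OpenAI model (the model name is just an example):

Python
from llama_index.llms.openai import OpenAI

# Requires OPENAI_API_KEY in the environment.
llm = OpenAI(model="gpt-3.5-turbo")
print(llm.metadata.context_window)  # total token budget per LLM call, e.g. 4096
print(llm.metadata.num_output)      # tokens reserved for the generated answer

Roughly, compact mode packs retrieved chunks until context_window minus num_output minus the template tokens is filled, then refines with whatever is left.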
It depends on the LLM you are using. Compact mode packs text up to that model's context window (minus the tokens reserved for the prompt templates and the answer), so a model with a ~4k window would produce the ~4000-token cutoff you are seeing.
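
If you need a larger budget, a minimal sketch of pointing compact mode at a larger-window model (the model name is an example; use any model you have access to):

Python
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.llms.openai import OpenAI

# A bigger context window raises the ~4000-token packing cutoff.
Settings.llm = OpenAI(model="gpt-4o")  # example: ~128k-token context window

index = VectorStoreIndex.from_documents([Document(text="...your long context...")])
query_engine = index.as_query_engine(response_mode="compact")
print(query_engine.query("your question"))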