Hi, may I know what the maximum prompt size between text_qa_template and refine_template is, as mentioned for the compact response mode here?
https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/#:~:text=Details%3A%20stuff,between%20text%20chunks).
I am retrieving a long context, and it looks like my prompts are being cut at around 4000 tokens, with the remaining context handled by the refine step. May I know why?
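For context, here is roughly how I am building the query engine. This is a minimal sketch, assuming llama-index >= 0.10 with the OpenAI integration installed; the model name, data path, and query are placeholders, not my exact values.

```python
# Minimal sketch of my setup (placeholder model, path, and query).
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-3.5-turbo")  # placeholder model choice

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(
    response_mode="compact",  # pack retrieved chunks, refine only if they overflow
    similarity_top_k=10,
)
response = query_engine.query("placeholder question about my documents")
print(response)
```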
For reference, the passage from the docs I am referring to is: "Details: stuff as many text (concatenated/packed from the retrieved chunks) that can fit within the context window (considering the maximum prompt size between text_qa_template and refine_template). If the text is too long to fit in one prompt, it is split in as many parts as needed (using a TokenTextSplitter and thus allowing some overlap between text chunks)."
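If I read that correctly, the room left for packed chunks per LLM call would be roughly the model's context window minus the tokens reserved for the output and the larger of the two templates. A back-of-the-envelope sketch of my understanding, where every number is an illustrative assumption on my part rather than a value from the docs:

```python
# My rough understanding of the packing limit (all numbers are assumptions).
context_window = 4096        # the LLM's total context window
num_output = 256             # tokens reserved for the model's answer
max_template_tokens = 200    # the larger of text_qa_template and refine_template

# Tokens left for packed retrieved chunks in a single LLM call:
available_for_chunks = context_window - num_output - max_template_tokens
print(available_for_chunks)  # ~3640 here, which would explain a cut near 4k
```

Is that the right way to think about it, i.e. is the ~4000-token cut coming from my LLM's context window rather than from the response mode itself?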