The community members are discussing the impact of using a model with a higher context length (128k) versus a lower context length (16k) for Retrieval Augmented Generation (RAG). They are wondering whether a higher-context-length model calls for a larger chunk size or node size, and are seeking an explanation or links to relevant articles.
The comments suggest that the community members have done some research and found that they can configure the similarity top-k to retrieve more chunks. However, they are unsure whether a higher similarity top-k is always better than a lower one, or whether there are scenarios where a lower similarity top-k is preferable. The comments also note that retrieving too much data can make it harder for the Language Model (LLM) to synthesize an accurate response, and that it also increases latency and cost.
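For reference, the two knobs being discussed are set at different stages: chunk (node) size at indexing time and similarity top-k at query time. Below is a minimal sketch, assuming a LlamaIndex-style setup; the folder name, chunk size, top-k value, and query text are illustrative, and the default embedding/LLM configuration is assumed.

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex

# Chunk (node) size is set at indexing time. A larger context window makes
# bigger chunks an option, but does not require them.
Settings.chunk_size = 1024

documents = SimpleDirectoryReader("data").load_data()  # hypothetical "data" folder
index = VectorStoreIndex.from_documents(documents)

# similarity_top_k controls how many chunks are retrieved per query.
# A higher value puts more context into the prompt; a lower value keeps the
# context focused and reduces latency and cost.
query_engine = index.as_query_engine(similarity_top_k=10)

response = query_engine.query("What does the report say about revenue?")
print(response)
```

With a 128k-context model, a higher similarity_top_k simply fits without truncation; whether it helps depends on how well the LLM synthesizes across the extra retrieved chunks versus the added latency and cost noted above.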
Let's say I use Claude 3 models, which have a much higher context length. Is using a high similarity top-k always better than using a lower one, or are there scenarios where I would want to use a lower similarity top-k?