Llama Index Secret Sauce

Guys, a question: why do I immediately hit the model's maximum context length when I use LangChain for QA over a Pinecone vector store, but not when I query the same vector store with gpt_index? What kind of magic does gpt_index use? ahahah
LlamaIndex ensures that every call to the LLM stays within the model's maximum context length.

The magic is in how LlamaIndex breaks a long input into multiple chunks and refines an answer across the chunks.
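For anyone curious, the refine pattern boils down to something like the sketch below. This is framework-agnostic illustration, not LlamaIndex's actual internals; `complete` is a hypothetical stand-in for whatever LLM call you use. The key point is that each prompt contains only one chunk plus the running answer, so no single call can overflow the context window:

```python
def complete(prompt: str) -> str:
    """Placeholder for an LLM completion call (OpenAI, local model, etc.)."""
    raise NotImplementedError


def refine_answer(question: str, chunks: list[str]) -> str:
    """Answer a question over many chunks without ever stuffing them all
    into one prompt: ask once per chunk, carrying the answer forward."""
    answer = None
    for chunk in chunks:
        if answer is None:
            # First chunk: draft an answer from scratch.
            prompt = (
                f"Context:\n{chunk}\n\n"
                "Answer the question using only the context above.\n"
                f"Question: {question}"
            )
        else:
            # Later chunks: refine the existing answer with new context.
            prompt = (
                f"Existing answer: {answer}\n\n"
                f"New context:\n{chunk}\n\n"
                "Refine the existing answer if the new context is relevant; "
                "otherwise repeat it unchanged.\n"
                f"Question: {question}"
            )
        answer = complete(prompt)
    return answer
```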
YEP. I can't understand how people use LangChain without this?!
Then what do they use it for?
@Logan M thanks a lot for such clarity. We are building a QA bot where users can select GPT-3.5, a LlamaIndex Pinecone query, a LlamaIndex direct vector query, or a LangChain vector DB chain direct query.

We end up hitting the token limit on the LangChain calls.

We are settling on LangChain for the chatbot and agent work, and LlamaIndex for search and retrieval, since that way we don't need to worry about summarization or handling bigger chunks.
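For reference, the "LlamaIndex Pinecone query" path looks roughly like the sketch below in recent LlamaIndex releases (~0.10.x). Import paths and the Pinecone client API have changed across versions, and the API key, index name, and question here are made up, so treat this as illustrative rather than canonical:

```python
from pinecone import Pinecone
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.pinecone import PineconeVectorStore

# Connect to an existing Pinecone index (hypothetical name/key).
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
pinecone_index = pc.Index("my-index")

# Wrap the existing store in a LlamaIndex index; no re-embedding happens here.
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
index = VectorStoreIndex.from_vector_store(vector_store)

# response_mode="refine" is what keeps each LLM call under the context
# limit: one call per retrieved chunk, refining the answer as it goes.
query_engine = index.as_query_engine(similarity_top_k=5, response_mode="refine")
print(query_engine.query("What does the report say about Q3 revenue?"))
```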