Updated 2 months ago

Claude

Claude 2 is very impressive with its 100k-token input compared to GPT-4's 32k, and its API cost is 5x cheaper. It can also take in 50 MB of data. Does anyone know how it works when it takes in data files? You can type prompt text of up to 100k tokens, but if you input a file, you can put in an entire book. When you input a file, is it making summaries of sections/chapters, then discarding the original data and answering your questions from those summaries (this would be less detailed and prone to hallucination)? Or is it interpreting all of the data at the same time (same performance as inputting a small text into a prompt)? Or is it making summaries while keeping access to the original text for reference (like how vector indexes work)?
As far as I know, it's interpreting the raw 100k tokens

In a demo of Claude 2, they modified one sentence of The Great Gatsby, and it was able to find it 🤯
Obviously using all 100k tokens in the input leads to slower responses though
that's awesome
what about when you put in a 10 MB PDF file? Is it able to interpret all the raw tokens even when they exceed 100k?
If it could interpret more than 100k tokens from a PDF file all at once, then why would they cap its text prompt at 100k?
It's capped off at 100k tokens due to how the model was constructed/trained.

In llama-index, if the retrieved context goes over the model's max token limit, it will refine the answer across multiple LLM calls
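The refine pattern described above can be sketched roughly like this. This is a simplified illustration, not llama-index's actual implementation, and `call_llm` is a hypothetical stand-in for a real model call:

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real model call (e.g. an API request).
    # Here it just returns a dummy string so the sketch is runnable.
    return f"answer({len(prompt)} chars of prompt)"

def refine_answer(question: str, chunks: list[str]) -> str:
    # First chunk: produce an initial answer from scratch.
    answer = call_llm(f"Context: {chunks[0]}\nQuestion: {question}")
    # Each remaining chunk gets its own LLM call that is asked to
    # improve the existing answer using the new context.
    for chunk in chunks[1:]:
        answer = call_llm(
            f"Existing answer: {answer}\n"
            f"New context: {chunk}\n"
            f"Refine the answer to: {question}"
        )
    return answer
```

The point is that no single call ever sees more than one chunk plus the running answer, so the total context can exceed the model's window.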
With llamaindex, how do we increase the amount of information retrieved by a type of vector store index to take full advantage of the 100k tokens? Did you just say that it is done automatically? Or does it need to be done manually, or is it in the works?
You can either increase top k, or increase the chunk size in the service context, or both!

Personally increasing the top k is probably the best method

index.as_query_engine(similarity_top_k=6)
With the default chunk size of 1024 tokens, this would retrieve 1024*6 = 6144 tokens of context

Could even push this much higher
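As a back-of-the-envelope illustration (not llama-index's internals), top-k retrieval picks the k highest-scoring chunks, so the context handed to the LLM grows as chunk_size × k. The term-overlap scoring below is a toy stand-in for real embedding similarity:

```python
def top_k_chunks(chunks: list[str], query_terms: list[str], k: int) -> list[str]:
    # Score each chunk by naive term overlap; real vector stores
    # rank by embedding similarity instead.
    scored = sorted(
        chunks,
        key=lambda c: sum(term in c for term in query_terms),
        reverse=True,
    )
    return scored[:k]

chunk_size = 1024  # tokens per chunk (llama-index's default)
top_k = 6          # similarity_top_k
context_budget = chunk_size * top_k  # 6144 tokens sent to the LLM
```

Raising either knob increases the budget, up to the model's 100k window; past that, the refine step kicks in.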