Claude 2 is very impressive with its 100k-token input compared to GPT-4's 32k, and its API is 5x cheaper. It can also take in 50 MB of data. Does anyone know how it works when it takes in data files? You can type prompt text of up to 100k tokens, but if you input a file, you can put in an entire book. When you input a file, is it (a) making summaries of sections/chapters, deleting the original data, and answering your questions from those summaries (this would be less detailed and prone to hallucination); (b) interpreting all of the data at the same time (same performance as if you pasted a small text into the prompt); or (c) making summaries while keeping access to the original text for reference (like how vector indexes work)?
With LlamaIndex, how do we increase the amount of information retrieved by a vector store index to take full advantage of the 100k tokens? Did you just say that this is done automatically? Or does it need to be done manually, or is it in the works?
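To make the question concrete, here's the manual route I know of. This is just a sketch, assuming a recent llama_index release (import paths differ across versions; `./book` is a placeholder directory): as far as I can tell, `similarity_top_k` on the query engine is the knob that controls how many retrieved chunks get packed into the prompt, and I'm asking whether LlamaIndex can size this automatically for a 100k-token window.

```python
# Sketch, not an official recipe. Assumes llama_index 0.10+ style imports
# (older versions use `from llama_index import ...`) and a local ./book
# directory of documents, both of which are placeholders here.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load the book and build a vector index over its chunks.
documents = SimpleDirectoryReader("./book").load_data()
index = VectorStoreIndex.from_documents(documents)

# The default retriever only pulls a couple of chunks per query
# (similarity_top_k=2 in many versions). Raising it packs far more of
# the original text into the prompt -- my manual attempt at filling a
# 100k-token context window.
query_engine = index.as_query_engine(similarity_top_k=50)

response = query_engine.query("What happens in chapter 3?")
print(response)
```

Is manually bumping `similarity_top_k` like this the intended approach, or does LlamaIndex have (or plan) something that adapts the retrieval size to the model's context window on its own?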