I have a question about the principle behind GPT-index. An LLM has an upper limit on how much data it can take in a prompt (a 4096-token limit), and GPT-index compresses the external data into an efficient data structure. Does the accuracy of each piece of information decrease as more external data is indexed and the compression ratio increases?
For example, if I build ExternalDataA and ExternalDataB into one index and run a query about ExternalDataA, will the accuracy be lower than when only ExternalDataA is indexed?
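Roughly this is what I mean (a sketch using the gpt_index API as I understand it; the reader class and the paths are just placeholders):

```python
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load the two external data sets (paths are placeholders).
docs_a = SimpleDirectoryReader("./external_data_a").load_data()
docs_b = SimpleDirectoryReader("./external_data_b").load_data()

# Case 1: index ExternalDataA on its own.
index_a = GPTSimpleVectorIndex(docs_a)
response_a = index_a.query("A question about ExternalDataA")

# Case 2: index ExternalDataA and ExternalDataB together.
index_ab = GPTSimpleVectorIndex(docs_a + docs_b)
response_ab = index_ab.query("A question about ExternalDataA")

# Is response_ab expected to be less accurate than response_a?
```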
If you're using a vector store, doubling the index size naturally increases the search space gpt_index has to look through. You'd have to start increasing similarity_top_k and/or improving your queries so that the most relevant document is returned first.
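Something like this (a sketch against the older gpt_index vector store API; the defaults and parameter names may differ in your version):

```python
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./external_data").load_data()
index = GPTSimpleVectorIndex(documents)

# The vector index only pulls a small number of the most similar chunks;
# raising similarity_top_k widens retrieval so the relevant chunk is less
# likely to be crowded out as the index grows.
response = index.query(
    "A question about ExternalDataA",
    similarity_top_k=5,
)
print(response)
```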
@sho4360 @Koh to answer your question, we don't do "compression" of the data in the traditional sense of the word. Rather, we just store it in a format that we can pass to the LLM while obeying prompt size limitations. The simplest example is the vector store index as @yourbuddyconner mentioned: we split your text into chunks under the hood, and when you call "query" on the data, we fetch the top-k text chunks and put them into the prompt, which we then feed into GPT. Note that even if the top-k text chunks don't completely fit into one prompt, GPT Index can handle that for you by calling the LLM repeatedly over sequential prompts.
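To make the flow concrete, here is a rough, library-agnostic sketch of "fetch the top-k chunks and feed them to the LLM over sequential prompts". It is not the actual gpt_index implementation; `embed` and `complete` stand in for an embedding model and an LLM call.

```python
from typing import Callable, List, Tuple
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def build_index(text: str, embed: Callable[[str], np.ndarray],
                chunk_size: int = 1024) -> List[Tuple[str, np.ndarray]]:
    # Split the external data into chunks and embed each chunk.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return [(chunk, embed(chunk)) for chunk in chunks]

def query(index: List[Tuple[str, np.ndarray]], question: str,
          embed: Callable[[str], np.ndarray],
          complete: Callable[[str], str], top_k: int = 2) -> str:
    # Rank chunks by similarity to the question and keep the top k.
    q_vec = embed(question)
    ranked = sorted(index, key=lambda c: cosine(q_vec, c[1]), reverse=True)
    answer = ""
    # If the top-k chunks don't all fit in one prompt, refine the answer
    # over sequential prompts, one chunk at a time.
    for chunk, _ in ranked[:top_k]:
        prompt = (f"Context:\n{chunk}\n\n"
                  f"Existing answer: {answer or '(none)'}\n"
                  f"Refine the answer to: {question}")
        answer = complete(prompt)
    return answer
```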
Thank you all for your answers. I had misunderstood how GPT-index builds the prompt it passes to the LLM, but now I understand. I will try adjusting query parameters such as similarity_top_k.