- It would take many GBs of text before the local JSON approach becomes unmanageable. Most use cases allow for individual indexes per user or use case, which keeps the size of each JSON file in check.
But! We recently introduced a Mongo-based doc store (https://github.com/jerryjliu/llama_index/blob/main/examples/docstore/MongoDocstoreDemo.ipynb), with other docstores planned for the future. If you have many documents, I'd also recommend using a 3rd-party vector store to store the vectors more efficiently, rather than saving them to disk and loading them into memory.
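For reference, here's a rough sketch of the Mongo docstore setup, loosely following the linked notebook. Import paths and constructor arguments shift between llama_index releases, and `./data` / `MONGO_URI` are placeholders, so check the notebook for the exact API in your version:

```python
from llama_index import SimpleDirectoryReader
from llama_index.docstore import MongoDocumentStore  # import path varies across versions
from llama_index.node_parser import SimpleNodeParser

# Load raw documents and chunk them into nodes
documents = SimpleDirectoryReader("./data").load_data()
nodes = SimpleNodeParser().get_nodes_from_documents(documents)

# Store the nodes in MongoDB instead of a local JSON file
MONGO_URI = "mongodb://localhost:27017"  # placeholder connection string
docstore = MongoDocumentStore.from_uri(uri=MONGO_URI)
docstore.add_documents(nodes)

# From here you can build one or more indexes over the same docstore --
# see the linked notebook for the exact wiring in your version.
```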
- In most cases, using a 3rd-party vector store (and now perhaps the Mongo doc store) should alleviate any concerns about scalability.
- When you create a vector index (using GPTSimpleVectorIndex or any other vector store integration), embeddings are generated for each chunk during index construction and saved. At query time, only your query text is embedded, and it is compared against the stored embeddings to fetch the most relevant chunks (there's a short sketch at the end of this answer).
- I have yet to see a case where it's slow to fetch relevant documents. 3rd-party vector stores are hyper-optimized for this, and even the local vector index is very fast (I have a 1 GB index.json that is fast to query).
Overall, the node lookup is usually very fast; it's the LLM response that can take time (either the OpenAI servers are busy, or you're running an LLM locally and it's slow).
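To make the construction/query split concrete, here's a minimal sketch assuming the pre-0.6 llama_index API this answer refers to (GPTSimpleVectorIndex, a local index.json); newer releases rename these methods, and the data path and query string are just placeholders:

```python
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()

# Embeddings for every chunk are computed once here, at construction time
index = GPTSimpleVectorIndex.from_documents(documents)

# Default local persistence -- this is the index.json that can grow large
index.save_to_disk("index.json")

# At query time, only the query string is embedded; the top-k matching chunks
# are fetched (fast) and sent to the LLM for the response (the slow part)
response = index.query("What does the author say about scaling?")
print(response)
```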