I still have some questions about

I still have some questions about LlamaIndex, as I am unsure about the appropriate use cases for it. Please allow me to list my concerns:

  1. I understand that LlamaIndex involves splitting text into chunks, obtaining embedding information, and then storing that along with metadata in a JSON file. As this process accumulates a significant amount of data, would it be difficult to use this method for services that require extensive data storage?
  2. If scalability is indeed a challenge due to reason 1, in what situations would LlamaIndex be most effectively utilized?
  3. When loading text data from a database, I presume that no vector-based index is created. How, then, does LlamaIndex efficiently search for relevant information in response to a query?
  4. If searching external sources proves to be slow due to reason 3, when would it be advantageous to use LlamaIndex?
I do see potential in LlamaIndex, but these questions have come up as I've been researching it. I appreciate your assistance in addressing them.
3 comments
  1. It would take many, many GBs of text before the local JSON approach becomes too much. Most use cases allow for individual indexes per user/use case, which helps keep each JSON file manageable.
But! Recently we introduced a Mongo-based doc store (https://github.com/jerryjliu/llama_index/blob/main/examples/docstore/MongoDocstoreDemo.ipynb), with other docstores planned for the future. If you have many documents, I also recommend using a 3rd party vector store to store the vectors more efficiently (rather than saving them to disk and loading them into memory); see the vector store sketch after this list.

  2. In most cases, using a 3rd party vector store (and perhaps now the Mongo doc store) should alleviate any concerns about scalability.
  3. When you create a vector index (using GPTSimpleVectorIndex or any other vector store integration), embeddings are created during index construction and saved. At query time, your query text is embedded to fetch the most relevant chunks (see the GPTSimpleVectorIndex sketch after this list).
  4. I have yet to see a case where it's slow to fetch relevant documents. 3rd party vector stores are hyper-optimized for this, and even the local vector index is very fast (I have a 1 GB index.json that is fast to query).
Overall the node lookup is usually very fast; it's the LLM response that can take time (either the OpenAI servers are busy, or maybe you are running an LLM locally and it is slow too lol).
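
To make point 3 concrete, here is a minimal sketch of that build-then-query flow with GPTSimpleVectorIndex. It assumes the pre-0.6 llama_index API (`from_documents`, `save_to_disk`, `index.query`); exact method names differ between versions, so treat it as illustrative rather than the one official recipe.

```python
from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load and chunk the source documents
documents = SimpleDirectoryReader("data").load_data()

# Embeddings are computed here, once, during index construction
index = GPTSimpleVectorIndex.from_documents(documents)

# Persist nodes + embeddings locally (the index.json discussed above)
index.save_to_disk("index.json")

# Later: reload the saved index; only the query string is embedded now,
# then the most similar chunks are retrieved and passed to the LLM
index = GPTSimpleVectorIndex.load_from_disk("index.json")
response = index.query("What does the source data say about X?")
print(response)
```

The retrieval step (embedding the query and finding the nearest chunks) is the fast part; the LLM call that synthesizes the final answer is what dominates latency, as noted above.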
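
For the scalability points (1 and 2), a 3rd party vector store keeps the embeddings out of local JSON entirely. Below is a hedged sketch using the Pinecone integration as one example; the index name, dimension, and constructor arguments are assumptions, and the wiring differs between llama_index and pinecone-client versions, so check the docs for your install.

```python
import pinecone
from llama_index import GPTPineconeIndex, SimpleDirectoryReader

# Assumed Pinecone setup; the API key, environment, and index name are placeholders
pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")
pinecone.create_index("llama-demo", dimension=1536, metric="cosine")

documents = SimpleDirectoryReader("data").load_data()

# Embeddings are written to Pinecone instead of a local index.json,
# so local disk/memory usage stays small as the corpus grows
index = GPTPineconeIndex.from_documents(
    documents,
    pinecone_index=pinecone.Index("llama-demo"),
)

response = index.query("What does the source data say about X?")
print(response)
```

The same general pattern applies to the other vector store integrations of that era (Weaviate, Qdrant, Chroma, etc.): the index class and connection arguments change, but construction and querying look much the same.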
@Logan M Thank you very much for your explanation. It has been incredibly informative. It seems that I might have misunderstood the concept initially. The process involves passing the existing data source through LlamaIndex for embedding and indexing first, and then performing searches on it (I thought that the search would be performed directly on the existing data source).

I understand the flow as described below. Thank you for your assistance.
Attachment: diagram of the indexing and query flow (image)
Exactly! By "indexing" the data first, queries can run as fast as possible.

The diagram looks correct to me! 💪