
Updated 4 months ago

I am trying to implement a recommendation from https://gpt-index.readthedocs.io/en/latest/end_to_end_tutorials/dev_practices/production_rag.html to "decouple chunks used for retrieval from chunks used for synthesis" by

1.) generating a summary for each node
2.) storing an embedding of a summary along with the original text corresponding to the summary
3.) using the summary embedding during the retrieval step
4.) using the original text during the synthesis step
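The four steps above can be sketched without any particular framework. This is a toy illustration only: `embed` is a stand-in for a real embedding model (here it just counts keyword overlap so the example is deterministic and self-contained), and the records mimic what would live in a vector store.

```python
# Toy illustration of decoupling retrieval chunks from synthesis chunks.
# `embed` is a hypothetical stand-in for a real embedding model.

def embed(text, vocab=("mongo", "vector", "index")):
    # One dimension per vocabulary word: counts of words starting with it.
    words = text.lower().split()
    return [sum(w.startswith(v) for w in words) for v in vocab]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# 1.) a summary per node (in practice generated by an LLM)
nodes = [
    {"text": "Long passage about storing chunks in MongoDB ...",
     "summary": "mongo storage"},
    {"text": "Long passage about building a vector index ...",
     "summary": "vector index"},
]

# 2.) store the *summary* embedding alongside the *original* text
records = [
    {"embedding": embed(n["summary"]), "original_text": n["text"]}
    for n in nodes
]

def retrieve(query):
    # 3.) rank by similarity between the query and the summary embedding ...
    best = max(records, key=lambda r: dot(r["embedding"], embed(query)))
    # 4.) ... but hand the original text to the synthesis step
    return best["original_text"]

print(retrieve("vector index lookup"))  # returns the vector-index passage
```

The key property is that the embedding used for similarity search and the text returned to the synthesizer come from different fields of the same record.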

I was hoping that using DocumentSummaryIndex, as recommended in the above-linked documentation, would be the simplest way to do that. However, I noticed that this index persists summaries and their embeddings in the DocStore, not the VectorStore (in my case, MongoDB). I am wondering about the performance of this in production scenarios (with tens of thousands of chunks). I'd like to find a llama-index solution where the embeddings are stored in a vector database.

So far, what I've come up with is to use a plain old VectorStoreIndex with a custom subclass of BaseEmbedding: I call an LLM to generate a summary of each node and store the embedding of the summary instead of the embedding of the node. This feels hackish to me; can anyone think of a better approach? Ideally, I am looking for something that would also preserve the summaries in textual form, not only as embeddings.
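One less hackish route covered in the llama-index docs is `IndexNode` plus `RecursiveRetriever`: you embed an `IndexNode` containing the summary whose `index_id` points back at the original node, so retrieval matches against summaries while synthesis sees the originals, and the summaries survive as text. A hedged sketch, assuming llama-index APIs of roughly the 0.8/0.9 era and a hypothetical `summarize` helper that calls your LLM; it needs a configured LLM and embedding model (and optionally a MongoDB-backed vector store), so it is not runnable as-is:

```python
from llama_index import VectorStoreIndex
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.retrievers import RecursiveRetriever
from llama_index.schema import IndexNode

# `base_nodes` are the parsed original chunks; `summarize` is a
# hypothetical helper that asks an LLM for a summary of a node.
summary_nodes = [
    IndexNode(text=summarize(n.get_content()), index_id=n.node_id)
    for n in base_nodes
]

# Only the summaries get embedded and stored in the vector store.
index = VectorStoreIndex(summary_nodes)

# At query time: match summaries, then resolve index_id -> original node.
retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": index.as_retriever(similarity_top_k=2)},
    node_dict={n.node_id: n for n in base_nodes},
)
query_engine = RetrieverQueryEngine.from_args(retriever)
```

Because the summary text lives on the `IndexNode` itself, it is preserved in readable form alongside its embedding.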
5 comments
@Logan M can you please chime in here? hate to bother you directly, but anything you can add will be helpful
Actually we are just about to fix the embeddings for the document summary index πŸ™
that's very opportune πŸ™‚ looking forward to seeing the improvements. can you share how it will work, at least at a high level?
It should be storing the embeddings for each summary in the vector store πŸ‘
So query the vector store, find the closest summaries, then use those summary nodes to fetch the proper documents