Efficient Indexing Strategies for Large-Scale RAG Syste...

At a glance

The post asks how community members can load indexes fast and add documents to multiple indexes (100+) when building a RAG (Retrieval Augmented Generation) system. A community member suggests using a vector store like Qdrant or Postgres with a single index and metadata filters to create separation, as this allows for instant loading and filtering of data.

The comments then discuss the community member's attempts to filter vector search results by metadata in Qdrant using LlamaIndex, which do not seem to work as expected. They try various approaches, including using MetadataFilters with ExactMatchFilter and MetadataFilter, as well as direct Qdrant payload filters, but none of these work as the results still include data from multiple space_id values instead of just the one they want.

The community members discuss whether the data type of space_access.space_id (string or integer) could be the cause, and one community member shares a working example. The original poster then suspects the problem lies in the llama-index Qdrant integration rather than in llama-index itself. After further investigation, a community member finds the issue in the LlamaIndex Qdrant vector store integration, posts a workaround, and links a pull request.

theoxd

Hi, I have a few questions for people building RAG with multiple indexes (100+). How do you guys load the indexes fast and add documents to them?
Do you just have an efficient cache system, or are there 'better' ways of doing this?

19 comments

Logan M

Usually, if you use a proper vector store (Qdrant, Postgres, etc.), I would set it up as a single index and use metadata filters to create separation.

Then loading is instant (because it's on a server connection) and filters control what data you have access to.
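
To make the single-index idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model and does not use the Qdrant or LlamaIndex APIs; the POINTS list, the search helper, and the file names other than report.pdf are made up for illustration. It only shows the pattern: every document lives in one collection, and a space_id filter decides which points a query can see.

```python
import math

# Toy stand-in for one shared collection: each point has a vector and a
# metadata payload. This is NOT the Qdrant API, just an illustration.
POINTS = [
    {"vector": [1.0, 0.0], "metadata": {"space_id": 2, "name": "a.pdf"}},
    {"vector": [0.9, 0.1], "metadata": {"space_id": 6, "name": "report.pdf"}},
    {"vector": [0.0, 1.0], "metadata": {"space_id": 6, "name": "notes.pdf"}},
    {"vector": [0.8, 0.2], "metadata": {"space_id": 7, "name": "b.pdf"}},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, space_id, top_k=5):
    """Keep only points from one space, then rank them by similarity."""
    allowed = [p for p in POINTS if p["metadata"]["space_id"] == space_id]
    ranked = sorted(
        allowed,
        key=lambda p: cosine(query_vector, p["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


hits = search([1.0, 0.0], space_id=6)
print([h["metadata"]["name"] for h in hits])
```

Adding a document to a "space" is then just appending a point with the right space_id, so nothing has to be loaded or cached per index.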

theoxd

Thank you, however I'm confused about why what I'm doing doesn't work:

I'm trying to filter vector search results by metadata in Qdrant using LlamaIndex, but nothing seems to work. Here's what I've tried:

Using MetadataFilters with ExactMatchFilter:

from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

filters = MetadataFilters(filters=[
    ExactMatchFilter(key="space_id", value=space_access.space_id)
])

Using MetadataFilters with MetadataFilter and FilterOperator:

from llama_index.core.vector_stores import FilterOperator, MetadataFilter, MetadataFilters

filters = MetadataFilters(filters=[
    MetadataFilter(key="space_id", operator=FilterOperator.EQ, value=space_access.space_id)
])

Using direct Qdrant payload filters:

filter_condition = {
    "must": [
        {
            "key": "space_id",
            "match": {"value": space_access.space_id}
        }
    ]
}

None of these work - I keep getting results from multiple different space_ids (2, 6, 7, 8) instead of just the one I want.

Looking at my data in Qdrant (from the response), I can see the metadata is structured like this:

"metadata": {
    "id": 14,
    "name": "report.pdf",
    "space_id": 6,  # This is what I want to filter by
    ...
}

What's the correct way to filter by a nested metadata field in Qdrant using the latest LlamaIndex version? Can you show me a complete example that actually works with nested metadata fields?

Logan M

Is space_access.space_id a string or an int? It looks like it's stored as an int, so it will need to be filtered using an int
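
The type question can be illustrated with a small plain-Python sketch. It assumes the store compares filter values type-strictly, so an integer payload only matches an integer filter value; exact_match and coerce_like_stored are hypothetical helpers written for this example and are not part of Qdrant or LlamaIndex.

```python
def exact_match(payload, key, value):
    """Strict match: the value must equal the stored one AND have the same type."""
    stored = payload.get(key)
    return type(stored) is type(value) and stored == value


def coerce_like_stored(payload, key, value):
    """Cast the filter value to the type of the stored value before comparing."""
    stored = payload.get(key)
    return value if stored is None else type(stored)(value)


# Payload shaped like the metadata shown in the thread.
payload = {"id": 14, "name": "report.pdf", "space_id": 6}

# A string filter value against an int payload does not match.
print(exact_match(payload, "space_id", "6"))
# An int filter value against an int payload matches.
print(exact_match(payload, "space_id", 6))
# Coercing the filter value to the stored type makes the match succeed.
print(exact_match(payload, "space_id", coerce_like_stored(payload, "space_id", "6")))
```

In practice this means checking what type space_access.space_id has at query time and casting it to whatever type was written at ingestion time.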

Logan M

Works fine for me here @theoxd https://colab.research.google.com/drive/1DgkBdiMWOmNGMZ888gpf1OHbhYzrWL3Q?usp=sharing

theoxd

Well, I think my error came from the fact that I wasn't using the retriever but rather VectorStoreQuery directly, which doesn't seem to work:

from llama_index.core.vector_stores.types import VectorStoreQuery

query = VectorStoreQuery(query_embedding=query_embedding, filters=filters, similarity_top_k=5)
result = vector_store.query(query)

theoxd

It was an int, but I tried converting it to a str, and that didn't work either

Logan M

Hmm it should be the same. I'll try that way in a bit

theoxd

https://colab.research.google.com/drive/1DgkBdiMWOmNGMZ888gpf1OHbhYzrWL3Q?usp=sharing#scrollTo=GsxmUKMSMzKF

theoxd

Anyway, that issue seems related to the implementation of llama-index-qdrant rather than llama-index

Logan M

Hmm, no, I found the issue after digging a little deeper:
https://github.com/run-llama/llama_index/blob/2c85e1c27f5d39740cc9dc0a4cd71eb5777886a4/llama-index-integrations/vector_stores/llama-index-vector-stores-qdrant/llama_index/vector_stores/qdrant/base.py#L1144

The workaround would be to pass a dummy query_str:
query = VectorStoreQuery(query_embedding=query_embedding, filters=filters, similarity_top_k=5, query_str="unused")

Easy PR, not sure why that if statement is there
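
For readers who want to see why a dummy query_str helps, here is a toy model of the kind of guard the workaround implies. This is a guess at the shape of the problem and is not copied from base.py: ToyQuery, build_filter_with_guard, and build_filter_fixed are hypothetical names, and the exact condition in the real code may differ. The sketch only assumes that the filter builder returns early when the query carries no query_str, which would drop the metadata filters for embedding-only queries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyQuery:
    """Hypothetical stand-in for VectorStoreQuery with only the relevant fields."""
    filters: Optional[dict] = None
    doc_ids: Optional[list] = None
    query_str: Optional[str] = None


def build_filter_with_guard(query):
    """Assumed shape of the bug: bail out early when there is no query_str."""
    if not query.doc_ids and not query.query_str:
        return None  # metadata filters are silently dropped here
    return {"must": [query.filters]} if query.filters else None


def build_filter_fixed(query):
    """Same builder without the early return, so filters always apply."""
    return {"must": [query.filters]} if query.filters else None


space_filter = {"key": "space_id", "match": {"value": 6}}

embedding_only = ToyQuery(filters=space_filter)
with_workaround = ToyQuery(filters=space_filter, query_str="unused")

print(build_filter_with_guard(embedding_only))   # filter lost
print(build_filter_with_guard(with_workaround))  # filter kept thanks to query_str
print(build_filter_fixed(embedding_only))        # filter kept without the guard
```

Under that assumption, a retriever works because it passes the query text along, while a hand-built VectorStoreQuery with only an embedding hits the early return.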

Logan M

https://github.com/run-llama/llama_index/pull/17377