At present I have built a vector search

At present, I have built a vector-search database over my existing text using the Chroma vector database, which lets me quickly search content that has been converted into vectors.
But I also have 600 MB of plain-text content stored in an Elasticsearch database. I would like to combine the Chroma and Elasticsearch data to answer questions through llama_index.
My idea is as follows:
  1. Search the plain text directly in Elasticsearch and return semantically relevant content.
  2. Run a vector search against the Chroma database, and also vectorize the text results returned by Elasticsearch.
  3. Combine the results of the two searches.
How should this be done?
Probably with a custom retriever + re-ranker

You'll want to retrieve from both Chroma and Elasticsearch

Then use a re-ranker to filter down to the true top k
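The retrieve-from-both step can be sketched in plain Python. In llama_index you would subclass `BaseRetriever` and implement `_retrieve()` over two real retrievers; here `ChromaStub` and `ElasticStub` are hypothetical stand-ins (their ids and scores are made up) so the merging logic is visible and runnable on its own.

```python
# Sketch of a custom retriever that unions results from two back ends.
# ChromaStub / ElasticStub are hypothetical stand-ins for a Chroma
# vector retriever and an Elasticsearch keyword retriever.

class ChromaStub:
    """Stand-in for a Chroma vector-store retriever (cosine scores)."""
    def retrieve(self, query):
        return [("doc-1", 0.91), ("doc-2", 0.84)]  # (node_id, score)

class ElasticStub:
    """Stand-in for an Elasticsearch keyword retriever (BM25 scores)."""
    def retrieve(self, query):
        return [("doc-2", 12.3), ("doc-3", 9.8)]

class HybridRetriever:
    """Union the two result sets, de-duplicating by node id."""
    def __init__(self, vector_retriever, keyword_retriever):
        self.vector_retriever = vector_retriever
        self.keyword_retriever = keyword_retriever

    def retrieve(self, query):
        seen, merged = set(), []
        for node_id, score in (self.vector_retriever.retrieve(query)
                               + self.keyword_retriever.retrieve(query)):
            if node_id not in seen:  # keep first occurrence of each node
                seen.add(node_id)
                merged.append((node_id, score))
        return merged

retriever = HybridRetriever(ChromaStub(), ElasticStub())
print(retriever.retrieve("example query"))
```

Note the raw scores from the two back ends are not comparable (cosine similarity vs. BM25), which is exactly why the merged set is handed to a re-ranker rather than sorted by score directly.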

Example of using a custom retriever
https://gpt-index.readthedocs.io/en/stable/examples/query_engine/CustomRetrievers.html

Example of re-ranking
https://gpt-index.readthedocs.io/en/stable/examples/node_postprocessor/SentenceTransformerRerank.html

These days I would use this model as the re-ranker: https://huggingface.co/BAAI/bge-reranker-base
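The re-rank step itself can be sketched like this: re-score every merged candidate against the query and keep the true top k, ignoring the incomparable retriever scores. In practice you would plug in llama_index's `SentenceTransformerRerank` with the bge-reranker-base model; `cross_encoder_score` below is a toy word-overlap stand-in so the example runs without model downloads.

```python
# Hedged sketch of the re-rank step. cross_encoder_score is a toy
# stand-in for a real cross-encoder such as BAAI/bge-reranker-base.

def cross_encoder_score(query, text):
    """Toy relevance score: fraction of query words present in the text."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / max(len(q_words), 1)

def rerank(query, candidates, top_k=2):
    """Re-score all candidates against the query (discarding the original
    retriever scores) and return the top_k most relevant."""
    scored = [(text, cross_encoder_score(query, text)) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

candidates = [
    "chroma stores embeddings for vector search",
    "elasticsearch indexes plain text with bm25",
    "unrelated note about deployment schedules",
]
print(rerank("vector search with chroma", candidates, top_k=2))
```

The key design point is that the cross-encoder sees the query and each candidate together, so it produces one consistent relevance scale across results that came from different back ends.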