Hi guys, I am in the process of building a RAG application using LlamaIndex. I already have a reference for the same RAG application done by peers using LangChain; in both cases we use ChromaDB as the vector store. In LangChain there is a function VECTOR_STORE.search_by_vector(question_embeddings, k=3) to search by vector, where question_embeddings are the embedded values of the input, and the similarity search within the vector store returns the closest 3 matches. How do I achieve this using LlamaIndex? #❓py-issues-and-help
The basic structure for finding relevant text follows the same principle. You don't need to convert the text into an embedding yourself in LlamaIndex; LlamaIndex will do all of that on its own. You just need to pass in the query.
https://docs.llamaindex.ai/en/stable/examples/vector_stores/ChromaIndexDemo/?h=chroma
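For example, here's a minimal sketch based on that demo (assuming a recent llama-index install with the llama-index-vector-stores-chroma package, plus a default LLM/embedding setup such as OpenAI):

```python
import chromadb
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# wrap a Chroma collection as a LlamaIndex vector store
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# LlamaIndex embeds the documents and the query for you
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# pass plain text; no manual embedding step needed
query_engine = index.as_query_engine()
print(query_engine.query("How many customers do we have?"))
```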
Thanks @WhiteFang_Jr for your response. I see your point, but what about when I want to get the list of closest matches related to my question? How do I retrieve that information, say for instance with k=3, to see the top 3 matches related to the question in the vector store?
For the top k, you can define this value (similarity_top_k) in your query_engine or retriever to fetch as many nodes as you want.
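For instance (a sketch, reusing the index built above):

```python
# fetch the top 3 matching nodes along with their similarity scores
retriever = index.as_retriever(similarity_top_k=3)
for node_with_score in retriever.retrieve("How many customers do we have?"):
    print(node_with_score.score, node_with_score.node.get_content()[:80])
```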

For fetching the vectors of those nodes, the Chroma implementation does not return embeddings as of now.

You can make some changes in this to get the embeddings as well, along with the text. https://github.com/run-llama/llama_index/blob/17f23014953e07eb8f8e7690d4cca7fb26c2109c/llama-index-integrations/vector_stores/llama-index-vector-stores-chroma/llama_index/vector_stores/chroma/base.py#L378
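As a workaround that avoids patching the integration, you could also query the underlying Chroma collection directly and ask Chroma to include the embeddings. A sketch; OpenAIEmbedding here is an assumption, use whatever embed model the index was built with:

```python
from llama_index.embeddings.openai import OpenAIEmbedding

# embed the question with the same model used at index time
embed_model = OpenAIEmbedding()
q_emb = embed_model.get_query_embedding("How many customers do we have?")

# ask Chroma itself to return embeddings along with text and distances
results = chroma_collection.query(
    query_embeddings=[q_emb],
    n_results=3,
    include=["embeddings", "documents", "distances"],
)
print(results["embeddings"][0])  # embeddings of the 3 closest matches
```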
@WhiteFang_Jr oh okay. In my use case, incoming questions are related to information in tables of the database. To identify the right table, we process the question to break it into multiple words (meaning table candidates) and then convert them to vectors, so that the LLM will look for the top 3 tables and decide which one to go to and execute the SQL. From your description it seems this is not possible to achieve in LlamaIndex at this moment? Is that right?
This looks like a specific use case. Let me know if I'm understanding it correctly.

  1. Based on your query, you retrieve related nodes from the vector store.
  2. Then check with the LLM which node suits your query better, and then execute it?
@WhiteFang_Jr, partially correct. Let me tell you what is happening in LangChain. Say the question is "How many customers do we have?". This input goes through our custom function, which generates a JSON response like ['many customers', 'How many customers do we have?']. That JSON is converted to embeddings, and then those embeddings are searched in the vector index with k=3 to get the closest results, in this case tables and their metadata.
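For completeness: the closest LlamaIndex analogue to LangChain's search-by-vector is querying the vector store directly with a precomputed embedding. A sketch, reusing the vector_store from above and again assuming an OpenAIEmbedding model:

```python
from llama_index.core.vector_stores import VectorStoreQuery
from llama_index.embeddings.openai import OpenAIEmbedding

# embed the pre-processed phrases yourself, as in the LangChain flow
embed_model = OpenAIEmbedding()
question_embedding = embed_model.get_query_embedding("How many customers do we have?")

# search the vector store by vector with k=3
query = VectorStoreQuery(query_embedding=question_embedding, similarity_top_k=3)
result = vector_store.query(query)

# result holds the matched nodes (e.g. tables plus their metadata) and scores
for node, score in zip(result.nodes, result.similarities):
    print(score, node.metadata, node.get_content()[:80])
```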