Updated 2 months ago

Several Questions (sorry):

  1. I need to bring my index retrieval time down to less than 10 seconds consistently. Is this even possible?
  2. I switched from the in-memory vector store to the Redis Vector Store, which helped a lot with speed, but it sometimes still takes upwards of 20 seconds. Is this just a feature of using Redis, or is there a good chance I am doing something wrong? My 3 indexes have ~2000, 1, and 1 documents in them respectively, but even a single-document index can sometimes time out.
  3. If not Redis, is there another fast, free external vector DB I could try?
5 comments
Are you sure you are talking about retrieval time, and not full synthesis time?

Usually retrieval takes 1s or less, but generating a response with that retrieved text can take 3s+
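
One way to check which step is slow is to time retrieval and synthesis separately. The snippet below is only a rough sketch: `timed`, `fake_retrieve`, and `fake_synthesize` are hypothetical names made up for illustration, and the two fake functions just sleep so the example runs on its own. In a real setup you would pass your retriever call and your LLM call to `timed` instead.

```python
import time
from typing import Any, Callable, List, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical stand-ins so the sketch is runnable without any external service.
# Swap these for the real retrieval call and the real LLM call.
def fake_retrieve(query: str) -> List[str]:
    time.sleep(0.01)  # pretend the vector store lookup takes a moment
    return ["chunk about " + query]


def fake_synthesize(chunks: List[str]) -> str:
    time.sleep(0.03)  # pretend the LLM call takes longer
    return "answer built from %d chunk(s)" % len(chunks)


chunks, retrieval_s = timed(fake_retrieve, "query")
answer, synthesis_s = timed(fake_synthesize, chunks)
print(f"retrieval: {retrieval_s:.3f}s, synthesis: {synthesis_s:.3f}s")
```

If the retrieval number alone is already in the 20 second range, the problem is likely in the vector store connection or the embedding call rather than in response generation.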
I am using the response from the index as part of my prompt for a GPT response. Is there a way to cut out the synthesis step?
If you just want retrieval, you can do

retriever = index.as_retriever(similarity_top_k=2, ...)
nodes = retriever.retrieve("query")


This returns the same nodes you would get from response.source_nodes if you queried with index.as_query_engine(similarity_top_k=2, ...)
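
Since the goal is to feed the retrieved text into your own GPT prompt, something like the following could stitch the nodes together. This is a sketch, not a built-in LlamaIndex feature: `build_prompt` and `StubNode` are hypothetical names, and it assumes each retrieved item exposes a `get_content()` method (recent LlamaIndex versions provide this on the objects returned by `retriever.retrieve`; older versions may need `node.node.get_text()` instead).

```python
from typing import Any, Iterable


def build_prompt(question: str, nodes: Iterable[Any]) -> str:
    """Join the text of retrieved nodes into a context block for a custom prompt.

    Assumes every item in `nodes` has a get_content() method returning a string.
    """
    context = "\n\n".join(node.get_content() for node in nodes)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


class StubNode:
    """Hypothetical stand-in for a retrieved node, used only to run the sketch."""

    def __init__(self, text: str) -> None:
        self._text = text

    def get_content(self) -> str:
        return self._text


# With a real index this would be: nodes = retriever.retrieve("query")
nodes = [StubNode("first retrieved chunk"), StubNode("second retrieved chunk")]
prompt = build_prompt("What does the document say?", nodes)
print(prompt)
```

The resulting string can then go into whatever GPT call you already make, so the query engine's synthesis step is skipped entirely.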
Awesome, thank you, I will experiment with this. This also means I don't need to make another API call, right?
Right, it will be one API call to embed the query text (which is fast) and another call to retrieve from the index (which is also fast)