Updated 3 months ago

Several Questions (sorry):

Maxx · 2023-11-08T15:15:47.434Z

Maxx

I need to bring my index retrieval time down to less than 10 seconds consistently. Is this even possible?
I switched from the in-memory vector store to the Redis Vector Store, which helped a lot with speed, but it sometimes still takes upwards of 20 seconds. Is this just a feature of using Redis, or is there a good chance I am doing something wrong? My 3 indexes have ~2000, 1, and 1 documents in them respectively, but even the single-document index can sometimes time out.
If not Redis, is there another fast, free external vector DB I could try?

5 comments

Logan M

Are you sure you are talking about retrieval time, and not full synthesis time?

Usually retrieval takes 1s or less, but generating a response with that retrieved text can take 3s+
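One way to check which stage is slow is to time the two calls separately. This is only a rough sketch: `time_call` is a hypothetical helper (not part of the library), and the commented usage assumes `index` is an index you have already built.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn with the given arguments and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Usage, assuming `index` already exists:
# retriever = index.as_retriever(similarity_top_k=2)
# query_engine = index.as_query_engine(similarity_top_k=2)
# nodes, t_retrieve = time_call(retriever.retrieve, "query")
# response, t_query = time_call(query_engine.query, "query")
# print(f"retrieval only: {t_retrieve:.2f}s, retrieval + synthesis: {t_query:.2f}s")
```

If the first number is small and the second is large, the time is going into response generation rather than into the vector store.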

Maxx

I am using the response from the index as part of my prompt for a GPT response. Is there a way to cut out the synthesis step?

Logan M

If you just want retrieval, you can do:

retriever = index.as_retriever(similarity_top_k=2, ...)
nodes = retriever.retrieve("query")

This returns the same nodes you would get from response.source_nodes if you had used index.as_query_engine(similarity_top_k=2, ...) instead.
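If the goal is to paste the retrieved text into your own prompt, something like the following might work. It is a sketch under assumptions: `build_context` and `build_prompt` are hypothetical helper names, the prompt layout is just an example, and it assumes each retrieved item exposes its text via `.node.get_content()`.

```python
def build_context(nodes, separator="\n\n"):
    """Join the text of the retrieved nodes into a single context string."""
    return separator.join(n.node.get_content() for n in nodes)


def build_prompt(question, nodes):
    """Put the retrieved context in front of the question (example layout only)."""
    return f"Context:\n{build_context(nodes)}\n\nQuestion: {question}"


# Usage, assuming `retriever` was created with index.as_retriever(...):
# nodes = retriever.retrieve("query")
# prompt = build_prompt("query", nodes)
```

The resulting string can then be sent to the model in your own request, so no synthesis step runs on the index side.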

Maxx

Awesome, thank you, I will experiment with this. This also means I don't need to make another API call, right?

Logan M

Right, it will be one API call to embed the query text (which is fast) and another call to retrieve from the index (which is also fast).