Hey Has anyone experience with the

At a glance

The community member is experiencing slow retrieval, about 1 minute per query, when using the AutoMergingRetriever with a Pinecone vector database. Another community member suggests trying the base vector store instead of Pinecone to see whether that improves the speed; with that change, retrieval time drops to about 6 seconds. They also mention that embedding can take a while if there are a lot of documents, and provide a link to a Discord message with tips to speed it up. The discussion points to latency with the Pinecone index as the likely cause, and the original poster says they will try to work out a solution.

DrSebastianK

Hey! Does anyone have experience with the retrieval speed of the AutoMergingRetriever? It takes about 1 min/query for me, with a Pinecone vector DB. I am wondering whether it is only this slow for me.

14 comments

Logan M

Is it the retrieval speed or the overall query/synthesis speed?

DrSebastianK

It is the retrieval speed. The retriever takes about 1 min to retrieve nodes, and the query engine takes almost exactly the same time.

Logan M

Does the speed improve if you just use the base vector store rather than pinecone?

Logan M

(Just narrowing down the cause)
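One way to narrow this down is to time the retrieval step on its own and compare it with the full query. The sketch below is only an illustration: `timed` and `compare_stages` are hypothetical helper names, not part of llama_index, and the synthesis estimate assumes the query engine performs one retrieval of similar cost internally.

```python
import time
from typing import Any, Callable, Dict, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_stages(
    retrieve: Callable[[str], Any],
    query: Callable[[str], Any],
    question: str,
) -> Dict[str, float]:
    """Time retrieval alone and the full query for the same question."""
    _, retrieve_s = timed(retrieve, question)
    _, query_s = timed(query, question)
    return {
        "retrieve_s": retrieve_s,
        "query_s": query_s,
        # Rough estimate only: assumes the full query repeats one
        # retrieval of similar cost before synthesizing an answer.
        "synthesis_s": max(query_s - retrieve_s, 0.0),
    }
```

With a retriever and a query engine already built, this could be called as `compare_stages(retriever.retrieve, query_engine.query, "some question")`. If `retrieve_s` is close to `query_s`, the time is going into retrieval rather than synthesis.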

DrSebastianK

Give me a second to test it

DrSebastianK

The embedding will finish in 10 min. What is your experience? Is it below 10 s?

Logan M

Embedding can take a while if you have a lot of documents, but you can speed it up.

see here
https://discord.com/channels/1059199217496772688/1147202918987071549/1147203675010375680

DrSebastianK

I changed the batch size to 2000 and ran it on CUDA.

DrSebastianK

Btw, without the Pinecone DB the retrieval time is 6 sec

DrSebastianK

which is nice

DrSebastianK

What do you think the reason could be? I am using the llama_index Pinecone wrapper. With simple retrieval it was working super fast before.

Logan M

Hmm, tbh I'm not 100% sure how this retriever works, I haven't looked at the code yet lol

Let's see if the source code reveals anything

Logan M

Hmm, nothing really that special. My only guess is that there is some latency with the Pinecone index 🤔 Pinecone is only used once, during the initial retrieve:

initial_nodes = self._vector_retriever.retrieve(query_bundle)
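To check whether that single call accounts for the delay, the inner retriever's `retrieve` method could be wrapped so that each call's duration is recorded. This is a minimal sketch, not part of llama_index: `record_call_times` is a hypothetical helper, and it assumes the object allows attribute assignment and that the private `_vector_retriever` attribute shown above is reachable from outside.

```python
import time
from functools import wraps
from typing import Any, Callable, List


def record_call_times(obj: Any, method_name: str, log: List[float]) -> Callable[..., Any]:
    """Wrap obj.<method_name> so each call appends its duration (seconds) to log.

    Returns the original bound method so it can be restored later.
    """
    original = getattr(obj, method_name)

    @wraps(original)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Record the duration even if the wrapped call raises.
            log.append(time.perf_counter() - start)

    # Assumption: the object permits setting instance attributes.
    setattr(obj, method_name, wrapper)
    return original
```

For example, `record_call_times(retriever._vector_retriever, "retrieve", log)` followed by a normal `retriever.retrieve(...)` call would leave the duration of the Pinecone-backed step in `log`. Comparing that with the total retrieval time shows whether the latency comes from the vector store or from the rest of the retriever.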

DrSebastianK

Okay, I will try to work out something. Thanks!
