Hi all, I'm currently setting up a query engine on top of a large vector database (Qdrant, hosted in the cloud). Because of the size, retrieval can take longer than 30 seconds, and when it does, both sync and async queries raise an error. I have set timeout / request_timeout > 30 in every subcomponent I could find (embed_model, llm, query_engine, index, client). How can I increase the retrieval timeout?

Trace: query
|_CBEventType.QUERY -> 30.18417 seconds
  |_CBEventType.RETRIEVE -> 30.18417 seconds
    |_CBEventType.EMBEDDING -> 0.155964 seconds
UnexpectedResponse: Unexpected Response: 504 (Gateway Time-out) Raw response content: b"<html><body><h1>504 Gateway Time-out</h1>\nThe server didn't respond in time.
Thanks for the answers. It turned out there were two separate problems.

First, I had to increase the timeout setting of the server where Qdrant is deployed.

Second, when uploading a lot of data to Qdrant, not all uploaded vectors are indexed right away, and indexing can take some time. While it is still running, a search can take very long. Normally I can set the option indexed_only=True in the Qdrant client to search only within the already indexed vectors: search_params=SearchParams(indexed_only=True). When running a search query through the Qdrant client from LlamaIndex, however, the parameter is somehow not passed down to the underlying Qdrant client via the kwargs. I tried it in several components, but it did not work. My workaround was to hard-code it in the qdrant library until all vectors were indexed. I will try to reproduce the bug next week.
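For anyone landing here with the same problem, here is a minimal sketch of the workaround that calls qdrant-client directly instead of going through LlamaIndex. The function names search_indexed_only and wait_until_indexed are my own, not part of any library. It assumes a qdrant-client version that still provides client.search (newer versions also offer query_points, which takes search_params as well), and it treats the collection status "green" as the sign that the optimizers have finished, which is my reading of the collection info rather than something confirmed in this thread.

```python
import time


def search_indexed_only(client, collection_name, query_vector, limit=10):
    """Search only the segments whose vectors are already indexed."""
    # Imported lazily so the polling helper below can be used on its own.
    from qdrant_client.models import SearchParams

    return client.search(
        collection_name=collection_name,
        query_vector=query_vector,
        # Skip segments that are still waiting to be indexed.
        search_params=SearchParams(indexed_only=True),
        limit=limit,
    )


def wait_until_indexed(client, collection_name, poll_seconds=5.0, max_wait=600.0):
    """Poll the collection until its status is "green".

    Returns True if the status turned green before max_wait seconds
    elapsed, otherwise False.
    """
    deadline = time.monotonic() + max_wait
    while True:
        info = client.get_collection(collection_name)
        # status is an enum in qdrant-client; accept a plain string as well.
        status = getattr(info.status, "value", info.status)
        if status == "green":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

Usage would be roughly: create the client with something like QdrantClient(url=..., api_key=..., timeout=60), call wait_until_indexed(client, "my_collection") after a bulk upload, and use search_indexed_only for queries in the meantime. The client-side timeout does not help with the 504 itself, since that comes from the gateway in front of Qdrant.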
I guess retrieval took so long because indexing took quite a while. I store about 8 million records in the vector DB, each a JSON dictionary containing up to 30 features...