@Logan M While using LlamaIndex for a chat engine, is there a way to use one query to retrieve a node and then a different query/prompt to generate the response, in the same call?
For a chat engine, it's a little harder. An agent would do the above though (i.e. write a query, and then interpret the response).
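Outside of a chat engine, the two-step flow can be done manually: retrieve nodes with one query, then synthesize an answer against a different prompt. A rough sketch only; the retrieval query and generation prompt below are placeholders, `index` is any already-loaded index, and it assumes the same legacy llama_index import style as the code later in this thread:

# Rough sketch: retrieve with one query, answer a different prompt.
from llama_index.response_synthesizers import get_response_synthesizer

retrieval_query = "enterprise pricing details"            # hypothetical query
generation_prompt = "Summarise the pricing in one line"   # hypothetical prompt

nodes = index.as_retriever().retrieve(retrieval_query)    # `index` assumed already loaded
synthesizer = get_response_synthesizer()
response = synthesizer.synthesize(generation_prompt, nodes)
print(response)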
@Logan M Also I have one more question I need your help with: currently my retriever is taking around 3.5-4.5 seconds to retrieve the chunk. How can I reduce this time?
Should I change to a different type of index, or would moving to Pinecone for the retrieval process work for me?
Currently I am using it like this:

from llama_index import StorageContext, load_index_from_storage

def build_vector_and_creating_insights_index_store_new(company):
    # Load the persisted index for this company from disk and cache it
    key = f"{company}/html_index"
    storage_context_map[key] = StorageContext.from_defaults(
        persist_dir=f"client_data/{company}/html_index"
    )
    index_map[key] = load_index_from_storage(
        storage_context_map[key],
        service_context=get_service_context(company),
    )
    return index_map[key]


index = build_vector_and_creating_insights_index_store_new("vwo")
retriever = index.as_retriever(
    retriever_mode="llm",
    choice_batch_size=1,
)
nodes = retriever.retrieve(req.query)
Please suggest some optimisations.
Definitely, using something like Pinecone, Qdrant, or Weaviate will be faster.
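For example, a minimal sketch of pointing the same kind of index at Qdrant (Pinecone and Weaviate look very similar). The host, port, collection name, and the `documents` variable are assumptions for illustration:

import qdrant_client
from llama_index import VectorStoreIndex, StorageContext
from llama_index.vector_stores import QdrantVectorStore

# Assumes a local Qdrant instance and an already-loaded `documents` list
client = qdrant_client.QdrantClient(host="localhost", port=6333)
vector_store = QdrantVectorStore(client=client, collection_name="vwo_html_index")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Build once; afterwards retrieval is served by Qdrant's ANN search
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
nodes = index.as_retriever(similarity_top_k=3).retrieve(req.query)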
Sure, thank you @Logan M