Speed

I am already using use_async=True, but here is the output... If I were to use embeddings, would things be faster?
Plain Text
langchainapi-langchain-1  | 17-Jun-23 19:28:00 - > Building index from nodes: 1 chunks
langchainapi-langchain-1  | 17-Jun-23 19:28:05 - message='OpenAI API response' path=https://api.openai.com/v1/completions processing_ms=4881 request_id=64146b50553b7ef6b43bc8e7e21a30ac response_code=200
langchainapi-langchain-1  | 17-Jun-23 19:28:06 - message='OpenAI API response' path=https://api.openai.com/v1/completions processing_ms=5480 request_id=935cbe4f2158adcb864e902a03a424d1 response_code=200
langchainapi-langchain-1  | 17-Jun-23 19:28:11 - > [get_response] Total LLM token usage: 508 tokens
langchainapi-langchain-1  | 17-Jun-23 19:28:11 - > [get_response] Total embedding token usage: 0 tokens
langchainapi-langchain-1  | 17-Jun-23 19:28:11 - > [get_response] Total LLM token usage: 6311 tokens
langchainapi-langchain-1  | 17-Jun-23 19:28:11 - > [get_response] Total embedding token usage: 0 tokens
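For reference, the "Total embedding token usage: 0 tokens" lines above mean no embeddings are computed here, only LLM completion calls. A minimal sketch of the kind of setup that produces output like this, assuming a llama_index list index queried with tree_summarize and use_async (the directory path and prompt are placeholders, not taken from the thread):
Python
from llama_index import GPTListIndex, SimpleDirectoryReader

# Placeholder path: load whatever documents are being summarized.
documents = SimpleDirectoryReader("data").load_data()

# A list index sends every chunk to the LLM at query time and computes no
# embeddings, which matches the 0-token embedding usage in the log above.
index = GPTListIndex.from_documents(documents)

# use_async=True runs the per-chunk LLM calls concurrently; it does not
# reduce how many calls are made or how many tokens the LLM has to read.
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
print(query_engine.query("Summarize these documents."))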
6 comments
You mean, would it be faster if you used a vector index? It depends. The main bottleneck here is that it looks like you made 3 LLM calls.
That's probably just the summarizing of the documents, right?
The reason I ask about embeddings is that, if I understand correctly, we might ask to summarize these same documents multiple times with just different prompts. If the documents were embedded and saved in Pinecone or wherever, wouldn't that be faster?
Yup, that's what it is
Mmm, not really. At the end of the day, the LLM needs to read the entire document to create an accurate summary (even if only the prompt changed).
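For completeness, saving the embeddings in Pinecone (as suggested above) does avoid re-embedding documents between runs, but every summary still passes the full text through the LLM. A rough sketch using the llama_index API of that period, with placeholder credentials and index name:
Python
import pinecone
from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores import PineconeVectorStore

# Placeholder credentials and index name; the Pinecone index must already
# exist with the embedding dimension (1536 for OpenAI ada-002).
pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")
pinecone_index = pinecone.Index("quickstart")

# Persist embeddings in Pinecone so they are not recomputed on every run.
vector_store = PineconeVectorStore(pinecone_index=pinecone_index)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Retrieval-style questions only pull the top-k most similar chunks, so they
# get faster; a full summary still sends every chunk to the LLM.
query_engine = index.as_query_engine(similarity_top_k=2)
print(query_engine.query("What does the document say about X?"))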