How does LlamaIndex's chat engine work?

Hi, anyone know how LlamaIndex's chat engine works? Specifically, does it query the index for each user interaction and then use the configured LLM to produce a response, or does it figure out whether the answer to a new user query is already contained in the chat history (including any context retrieved from the index previously)?
Got it, thanks, this was really helpful. I was using "openai" mode thinking that the index would be queried for context and then the OpenAI LLM used to synthesize the response. But it looks like that's not 100% accurate, and maybe I should be using condense_plus_context instead.
Thanks for the quick reply
Yea! openai is basically an agent -- it will decide to respond directly, or to use the query engine to help it respond 🙂

condense_plus_context will always use the index for information. This will be much faster (fewer LLM calls), but maybe a little less customizable. Trade-offs with each mode, I suppose.
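For anyone landing here later, a minimal sketch of wiring up both modes (assuming current llama_index.core import paths, an OPENAI_API_KEY in the environment, and a placeholder ./data folder):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# "openai" mode wraps the query engine as a tool behind an OpenAI agent:
# the agent may answer from chat history alone, or call the query engine
# when it decides retrieval is needed.
agent_chat = index.as_chat_engine(chat_mode="openai")

# "condense_plus_context" condenses the new message plus chat history into
# a standalone question, retrieves from the index on every turn, and
# synthesizes a response from that retrieved context.
cpc_chat = index.as_chat_engine(chat_mode="condense_plus_context")

print(cpc_chat.chat("What does the report say about Q3 revenue?"))
```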
Got it. Actually, I had another question, not pertaining to the index but regarding UnstructuredElementNodeParser -- it specifically turns off the embedding model. Is there a reason for that, do you know?
Since it's a summary index, it doesn't use embeddings at all.

Without setting that to None, it would default to initializing OpenAI embeddings, which raises an error if the user doesn't have an API key set.
So, nothing to worry about
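To illustrate, a rough usage sketch (assumes the `unstructured` package is installed and `documents` loaded as in the earlier snippet):

```python
from llama_index.core.node_parser import UnstructuredElementNodeParser

# The parser builds a SummaryIndex over extracted table elements and asks
# an LLM to summarize each one. A SummaryIndex simply iterates over its
# nodes, so no embedding model is ever queried -- which is why the parser
# disables it rather than letting a default OpenAI embedding model
# initialize (and fail for users without an API key).
node_parser = UnstructuredElementNodeParser()
nodes = node_parser.get_nodes_from_documents(documents)
```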
Okay, got it. Thanks for the clarifications.