Hi back @Logan M , I've just tried to implement the CondensePlusContextChatEngine, but when using a summary index retriever with a large number of documents, it gives me a token limit error
Oh yea... it's not ideal to use with a summary index, since it will put all retrieved nodes into a single LLM call (and a summary index retrieves ALL nodes)
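For anyone reading later, here's a toy sketch in plain Python (not LlamaIndex code) of why this blows up: a summary-index-style retriever returns every node, so the prompt grows with the corpus, while a top-k retriever caps it regardless of corpus size.

```python
# Toy illustration (NOT the LlamaIndex API): compare prompt size when
# retrieving all nodes vs. only the top-k most relevant ones.

def approx_tokens(text: str) -> int:
    # Rough heuristic: ~1 token per word.
    return len(text.split())

def retrieve_all(nodes):
    # Summary-index behaviour: every node goes into the LLM call.
    return list(nodes)

def retrieve_top_k(nodes, query, k=2):
    # Vector-retriever behaviour (scored here by naive word overlap).
    scored = sorted(nodes, key=lambda n: -len(set(n.split()) & set(query.split())))
    return scored[:k]

nodes = [f"document {i} talks about topic {i}" for i in range(1000)]
query = "what does document 3 say about topic 3"

all_ctx = sum(approx_tokens(n) for n in retrieve_all(nodes))
topk_ctx = sum(approx_tokens(n) for n in retrieve_top_k(nodes, query))

print(all_ctx)   # grows linearly with the corpus -- easily past a context window
print(topk_ctx)  # bounded by k, no matter how many documents exist
```

So swapping the summary index retriever for a top-k retriever (or capping `similarity_top_k`) is usually the fix here.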
Is there a way to rephrase the question using only the documents as context, without taking the chat history into account, or at least without the chat influencing the query too much?
also, what is the best practice for building a chat engine that chooses between multiple query engines and keeps a chat memory, but does not rewrite the query sentences?