I'm having an issue with LlamaIndex response time in context chat mode. I'm using a DatabaseReader to read a row from a SQL SELECT, and that row contains a very long, poorly formatted text. Honestly, it should have been a regular PDF/text document. Does that make performance a lot worse? It seems like it does. Sometimes I just wait indefinitely for a response. Is there a way to set a time limit on retrieval so the chat gives me some answer anyway?
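For concreteness, something like this is what I'm hoping for. A minimal sketch, assuming the chat engine exposes an async `achat()` (the `chat_with_timeout` wrapper and the 30 s value are mine, not from the library):

```python
import asyncio


async def chat_with_timeout(chat_engine, message: str, timeout_s: float = 30.0) -> str:
    """Return the engine's answer, or a fallback string if it takes too long."""
    try:
        # achat() is the async counterpart of chat() on LlamaIndex chat engines
        response = await asyncio.wait_for(chat_engine.achat(message), timeout=timeout_s)
        return str(response)
    except asyncio.TimeoutError:
        # give up on retrieval and return *some* answer instead of hanging forever
        return "Sorry, that took too long -- try rephrasing the question."
```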
I don't know if the system prompt or the chat memory is somehow causing the bug. I've manually set a chat memory buffer with a 3000-token limit. I've also noticed that retrieval works better without a system prompt (or with a different one). Here's roughly my setup, see the sketch below.
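A sketch of what I'm doing; the connection URI, query, and prompt text are placeholders, not my real values:

```python
from llama_index.core import VectorStoreIndex
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.readers.database import DatabaseReader

# placeholder connection string and query -- the real row holds one very long text
reader = DatabaseReader(uri="sqlite:///example.db")
documents = reader.load_data(query="SELECT long_text_column FROM my_table")
index = VectorStoreIndex.from_documents(documents)

# cap the chat history at 3000 tokens
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)

chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt="You are a helpful assistant.",  # retrieval seems better without this
)
```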
@Logan M The biggest issue is the insane waiting time. I don't know if it's doing something or just stuck, since I haven't been able to wait until the end. I tried updating to the latest stable version, but it still happens from time to time.