
Updated 2 months ago

Optimizing Chat Engine Response Times

At a glance

The post asks about the arguments the chat method of a chat engine takes and how to make the response faster. The comments suggest that the main requirement for querying is the query itself, and optionally the chat history. Community members also recommend using streaming to get the response faster, without waiting for the entire response to generate. They provide example code for streaming the response. Additionally, community members mention that the response time depends on the hardware if using an open-source language model. They also suggest referring to the documentation for more information on streaming support and accessing custom prompts.

What arguments does the chat method of a chat engine take, and how can I make it respond faster?
@WhiteFang_Jr @Logan M
10 comments
For querying, the main requirement is the query itself. You can also pass the chat history along with it if you want.
If you are using an open-source LLM, the response time depends on your hardware.
You can also try streaming the response. That way you don't have to wait for the entire response to generate.
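To make the benefit of streaming concrete, here is a minimal sketch that measures the time to the first chunk against the time to the full reply. It does not call any real chat engine: `simulated_response_gen` is a hypothetical stand-in that emits one chunk per fixed delay, and the chunk text and delay are made up for illustration.

```python
import time
from typing import Iterator, Sequence


def simulated_response_gen(chunks: Sequence[str], delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM: yields one chunk every delay_s seconds."""
    for chunk in chunks:
        time.sleep(delay_s)
        yield chunk


chunks = ["Streaming ", "shows ", "text ", "early."]

start = time.perf_counter()
first_chunk_latency = None
received = []

for chunk in simulated_response_gen(chunks, delay_s=0.02):
    # Record how long the user waited before seeing anything at all
    if first_chunk_latency is None:
        first_chunk_latency = time.perf_counter() - start
    received.append(chunk)

# Time until the whole reply is available (what a non-streaming call would wait for)
total_latency = time.perf_counter() - start
full_text = "".join(received)

print(f"first chunk after {first_chunk_latency:.2f}s, full reply after {total_latency:.2f}s")
```

Streaming does not make generation itself faster. It only lets the user start reading after the first chunk instead of after the last one.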
How to stream the response from the chat engine
Plain Text
resp = chat_engine.stream_chat(...)
# Print each chunk as soon as it arrives instead of waiting for the full reply
for r in resp.response_gen:
    print(r, end="", flush=True)


Or async
Plain Text
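import asyncio

# In a script, `await` must run inside a coroutine (a notebook allows it at top level)
async def main():
    resp = await chat_engine.astream_chat(...)
    async for r in resp.async_response_gen():
        print(r, end="", flush=True)

asyncio.run(main())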
How do I display this in a chatbot? Print statements can't be displayed there, right?
@WhiteFang_Jr
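One common way to handle this, sketched below under assumptions, is to collect the streamed chunks into a growing string and hand each intermediate string to whatever your chatbot UI uses to redraw a message, instead of printing. The names `accumulate_stream` and `fake_response_gen` are hypothetical. In real code the token source would be something like `resp.response_gen` from the snippet above.

```python
from typing import Iterable, Iterator


def accumulate_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Yield the reply text built so far after each streamed chunk."""
    text = ""
    for token in tokens:
        text += token
        yield text


def fake_response_gen() -> Iterator[str]:
    """Hypothetical stand-in for the chat engine's streamed chunks."""
    yield from ["Hel", "lo", ", ", "world"]


# Each snapshot is what the UI would show at that moment
snapshots = list(accumulate_stream(fake_response_gen()))
final_reply = snapshots[-1]

for snapshot in snapshots:
    # Replace this print with your UI's "update message" call
    print(snapshot)
```

How the partial text gets rendered depends on the chatbot framework, so check its documentation for the supported way to update a message in place.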
Thanks @WhiteFang_Jr. But this doesn't seem to accept a custom prompt. Can we pass our own custom prompt to it?