Updated 3 months ago

Hello

Hello!
A very quick question.
I'm using a chat engine, here is my code:
```python
query_engine = index.as_chat_engine(
    verbose=True,
    chat_mode="context",
    system_prompt=system_prompt,
)
response = query_engine.chat(query_text, chat_history=chat_history)
```

The question is: how big can the prompt be here? Is there a limit on its length, or some point beyond which a chatbot stops "thinking" properly, or is there nothing like that?
Thanks! You created a great product!πŸ₯°
6 comments
So assuming default settings, you are retrieving 2 nodes of 1024 tokens each on every message. These get inserted into a template along with the system prompt you passed in, so let's call it 1100 tokens.

With gpt-3.5, this means you have 4096-1100 = 2996 tokens left

Then there's the chat history you are passing in, which could be any length. Let's say it's a max of 1500 tokens.

So now we have 2996 - 1500 = 1496 tokens left for the prompt
hope that makes sense!
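The arithmetic above can be sketched as a small helper (a minimal sketch: the 4096-token window is gpt-3.5's, and the 1100-token figure bundles the two retrieved nodes plus template and system prompt as estimated above; `remaining_budget` is a hypothetical name, not a LlamaIndex function):

```python
# Rough token-budget estimate for a "context" chat engine, mirroring the
# walkthrough above: context window minus retrieved context minus history.

def remaining_budget(context_window: int,
                     retrieved_context: int,
                     chat_history: int) -> int:
    """Tokens left for the user's new message (negative means overflow)."""
    return context_window - retrieved_context - chat_history

# The numbers from the thread:
print(remaining_budget(4096, 1100, 0))     # 2996 tokens before history
print(remaining_budget(4096, 1100, 1500))  # 1496 tokens left for the prompt
```

The same arithmetic applies to any model; only the context-window constant changes.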
Thanks! Interesting... About the number of nodes: in the response objects I always see 5 nodes or fewer, not sure why.
And the second question is: what happens if the prompt is longer than the available number of tokens? Will it be truncated, and is it possible to control that somehow? Thank you
Right now I think it will just crash πŸ˜… need to handle that on your end (limiting the history, or truncating the prompt, etc)
Ahhh, that could be dangerous 😬 How do I know the total number of tokens I'm about to send, or how many tokens are left? Is there any way to prevent it?
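One way to guard against this (a sketch under assumptions, not an official LlamaIndex API): count the tokens yourself before calling `chat`, and drop the oldest history turns until the request fits the budget. The tokenizer is passed in as a plain callable so the sketch stays self-contained; the `len(text.split())` stand-in below undercounts real tokens, so for OpenAI models you would swap in a proper tokenizer such as `tiktoken`.

```python
from typing import Callable, List, Tuple

def trim_history(history: List[Tuple[str, str]],
                 new_message: str,
                 count_tokens: Callable[[str], int],
                 budget: int) -> List[Tuple[str, str]]:
    """Drop the oldest (role, text) turns until the new message plus the
    remaining history fits within `budget` tokens."""
    trimmed = list(history)
    used = count_tokens(new_message) + sum(count_tokens(t) for _, t in trimmed)
    while trimmed and used > budget:
        _, dropped = trimmed.pop(0)  # drop the oldest turn first
        used -= count_tokens(dropped)
    return trimmed

# Crude stand-in tokenizer; replace with tiktoken for real OpenAI counts.
approx = lambda text: len(text.split())

history = [("user", "one two three"), ("assistant", "four five"),
           ("user", "six")]
kept = trim_history(history, "seven eight", approx, budget=6)
print(kept)  # the oldest turn is dropped so the total fits the budget
```

If you'd rather not roll your own, LlamaIndex also ships a `ChatMemoryBuffer` with a `token_limit` parameter that does similar trimming of chat history; check the docs for the version you're on.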