Updated 3 months ago

Hello

Hello!
A very quick question.
I'm using a chat engine, here is my code:
```python
query_engine = index.as_chat_engine(
    verbose=True,
    chat_mode="context",
    system_prompt=system_prompt,
)
response = query_engine.chat(query_text, chat_history=chat_history)
```

The question is: how big can the prompt be here? Is there a limit on its length, or some point beyond which a chatbot stops "thinking" properly, or is there nothing like that?
Thanks! You created a great product!πŸ₯°
6 comments
So assuming default settings, you are retrieving 2 nodes of 1024 tokens each on every message. These get inserted into a template along with the system prompt you passed in, so let's call it 1100 tokens.

With gpt-3.5, this means you have 4096-1100 = 2996 tokens left

Then there's the chat history you are passing in, which could be any length. Let's say it's a max of 1500 tokens.

So now we have 2996 - 1500 = 1496 tokens left for the prompt
hope that makes sense!
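The arithmetic above can be sketched as a small helper (a minimal sketch: the 4096-token window is gpt-3.5's, and the 1100-token figure bundles the two retrieved nodes plus template and system prompt as estimated above; `remaining_budget` is a hypothetical name, not a LlamaIndex function):

```python
# Rough token-budget estimate for a "context" chat engine, mirroring the
# walkthrough above: context window minus retrieved context minus history.

def remaining_budget(context_window: int,
                     retrieved_context: int,
                     chat_history: int) -> int:
    """Tokens left for the user's new message (negative means overflow)."""
    return context_window - retrieved_context - chat_history

# The numbers from the thread:
print(remaining_budget(4096, 1100, 0))     # 2996 tokens before history
print(remaining_budget(4096, 1100, 1500))  # 1496 tokens left for the prompt
```

The same arithmetic applies to any model; only the context-window constant changes.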
Thanks! Interesting... About the number of nodes: in the response objects I always see 5 nodes or fewer, not sure why.
And the second question is: what happens if the prompt is longer than the available number of tokens? Will it be truncated, and is it possible to control that somehow? Thank you
Right now I think it will just crash πŸ˜… need to handle that on your end (limiting the history, or truncating the prompt, etc)
Ahhh, that could be dangerous 😬 How do I know the total number of tokens I'm about to send, or how many tokens are left? Is there any way to prevent it?
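One way to guard against this (a sketch under assumptions, not an official LlamaIndex API): count the tokens yourself before calling `chat`, and drop the oldest history turns until the request fits the budget. The tokenizer is passed in as a plain callable so the sketch stays self-contained; the `len(text.split())` stand-in below undercounts real tokens, so for OpenAI models you would swap in a proper tokenizer such as `tiktoken`.

```python
from typing import Callable, List, Tuple

def trim_history(history: List[Tuple[str, str]],
                 new_message: str,
                 count_tokens: Callable[[str], int],
                 budget: int) -> List[Tuple[str, str]]:
    """Drop the oldest (role, text) turns until the new message plus the
    remaining history fits within `budget` tokens."""
    trimmed = list(history)
    used = count_tokens(new_message) + sum(count_tokens(t) for _, t in trimmed)
    while trimmed and used > budget:
        _, dropped = trimmed.pop(0)  # drop the oldest turn first
        used -= count_tokens(dropped)
    return trimmed

# Crude stand-in tokenizer; replace with tiktoken for real OpenAI counts.
approx = lambda text: len(text.split())

history = [("user", "one two three"), ("assistant", "four five"),
           ("user", "six")]
kept = trim_history(history, "seven eight", approx, budget=6)
print(kept)  # the oldest turn is dropped so the total fits the budget
```

If you'd rather not roll your own, LlamaIndex also ships a `ChatMemoryBuffer` with a `token_limit` parameter that does similar trimming of chat history; check the docs for the version you're on.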