Hi there! Is there any way to get the corresponding nodes BEFORE the call to OpenAI is made? Currently, I'm doing this:
Plain Text
query_engine = index.as_chat_engine(
    chat_mode='context',
    similarity_top_k=similarity_top_k,
    llm=llm_engine,
    system_prompt=prepared_system_prompt,
)
response = query_engine.chat(query_text, chat_history=chat_history)

Thanks!
You could use a custom node postprocessor. Or run retrieval outside of your chat engine
May I ask how to run retrieval? Any example? Like this one?

Plain Text
nodes = index.as_retriever().retrieve("test query str")
Yeah, exactly (you can also pass in similarity_top_k here)
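For example, with the same index:
Plain Text
nodes = index.as_retriever(similarity_top_k=5).retrieve("test query str")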
Cool, thanks! One more question, if you don't mind. How can I pass these nodes to query_engine.chat later (to avoid double retrieving)?
hmmm, I don't think you can.

If you want to intercept these nodes, I would probably use a custom node postprocessor instead
Otherwise, you'd have to define your own custom chat engine
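If you go the postprocessor route, a rough sketch could look like the snippet below. Import paths assume a recent llama_index release, I'm assuming the context chat engine accepts a node_postprocessors argument, and the class name is just illustrative:
Plain Text
from typing import List, Optional

from llama_index.core.postprocessor.types import BaseNodePostprocessor
from llama_index.core.schema import NodeWithScore, QueryBundle


class NodeCapturePostprocessor(BaseNodePostprocessor):
    """Stashes the retrieved nodes so they can be inspected before the LLM call."""

    captured_nodes: List[NodeWithScore] = []

    def _postprocess_nodes(
        self,
        nodes: List[NodeWithScore],
        query_bundle: Optional[QueryBundle] = None,
    ) -> List[NodeWithScore]:
        # keep a reference to the nodes and pass them through unchanged
        self.captured_nodes = nodes
        return nodes


capture = NodeCapturePostprocessor()
chat_engine = index.as_chat_engine(
    chat_mode='context',
    similarity_top_k=similarity_top_k,
    llm=llm_engine,
    system_prompt=prepared_system_prompt,
    node_postprocessors=[capture],
)
response = chat_engine.chat(query_text, chat_history=chat_history)
# capture.captured_nodes now holds the nodes that were used as context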
I found this way, do you think it would work?
Plain Text
from llama_index.core.llms import ChatMessage  # llama_index.llms.ChatMessage on older versions

nodes = index.as_retriever(similarity_top_k=similarity_top_k).retrieve(query_text)
context_str = "\n\n".join([n.node.get_content() for n in nodes])
full_prompt = system_prompt + 'Below is the provided context: \n\n' + context_str
chat_history.append(ChatMessage(role="system", content=full_prompt))
chat_history.append(ChatMessage(role="user", content=query_text))
response = llm_engine.chat(chat_history)

The main question is: is it the same as the original code?
Plain Text
query_engine = index.as_chat_engine(
    chat_mode='context',
    similarity_top_k=similarity_top_k,
    llm=llm_engine,
    system_prompt=prepared_system_prompt,
)
response = query_engine.chat(query_text, chat_history=chat_history)
It's roughly the same as what the chat engine is doing. The main thing is that in your custom version, you need to manage the chat history yourself (either using a memory module, or however else you want to manage it)
Thanks! I will look at the memory module.
Just a small clarification... were you talking about ChatMemoryBuffer or something else?
Yes, that's what I meant
Thanks, gotcha!
But I can't figure out how to use it in this case. When using query_engine, I just pass it to as_chat_engine, but where should I pass it in my case, and how do I connect it to chat_history?
So, the memory is there as a way to manage the chat history, mostly so that it doesn't get too big.

So for example, I might do

Plain Text
from llama_index.core.memory import ChatMemoryBuffer  # llama_index.memory on older versions

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

...

system_message = ChatMessage(role="system", content=full_prompt)
user_message = ChatMessage(role="user", content=query_text)

# previous turns, trimmed to fit within the token limit
prev_messages = memory.get()

response = llm.chat([system_message, *prev_messages, user_message])

# record the latest turn so future calls can see it
memory.put(user_message)
memory.put(response.message)


This way, the chat history is included in each llm.chat() call (up to the token limit), rather than just the most recent message + context
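
Putting the pieces together with your earlier retrieval snippet, a rough sketch of the full flow could look like this (it assumes index, llm_engine, system_prompt, and similarity_top_k are defined as in your code above; import paths assume a recent llama_index version):
Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

def chat(query_text: str):
    # retrieve fresh context for each user message
    nodes = index.as_retriever(similarity_top_k=similarity_top_k).retrieve(query_text)
    context_str = "\n\n".join(n.node.get_content() for n in nodes)

    system_message = ChatMessage(
        role="system",
        content=system_prompt + "\n\nBelow is the provided context:\n\n" + context_str,
    )
    user_message = ChatMessage(role="user", content=query_text)

    # previous turns, trimmed to fit within the token limit
    prev_messages = memory.get()

    response = llm_engine.chat([system_message, *prev_messages, user_message])

    # only the user/assistant turns go into memory; the system message is rebuilt each call
    memory.put(user_message)
    memory.put(response.message)
    return response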
Ahhh interesting thanks! I will try to use this approach
Hi! One more question on this piece of code. What exactly does token_limit do here? Thanks!
Plain Text
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
It's limiting how many tokens the messages can use when running memory.get() -- it will fetch as many of the latest messages as fit into that limit
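A rough illustration of that trimming (the numbers are arbitrary):
Plain Text
from llama_index.core.llms import ChatMessage
from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(token_limit=50)
for i in range(20):
    memory.put(ChatMessage(role="user", content=f"this is message number {i}"))

# get() only returns the most recent messages that fit in ~50 tokens,
# so this prints far fewer than 20
print(len(memory.get()))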
Well, I'm asking because I noticed that even though I pass a pretty big number there (like 10,000), the context is very small for some reason. I have a feeling it cuts the system message a lot.
🤔 It has nothing to do with the system message, since you are creating that outside of the memory
system_message = ChatMessage(role="system", content=full_prompt)
You can change your retriever to retrieve more or fewer nodes when you create it
Okay, I see, it may be my own bug 🤦‍♀️