Hey, would you mind sharing the code?
from llama_index.core import StorageContext, load_index_from_storage

# Reload the persisted index from disk
storage_context = StorageContext.from_defaults(persist_dir=f"{product_code}_llama")
index = load_index_from_storage(storage_context)

# "context" mode retrieves nodes for each user message and injects them into the system prompt
engine = index.as_chat_engine(
    chat_mode="context",
    verbose=True,
    system_prompt=prompt,
)
And these are my settings:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model=GPT_MODEL, temperature=0.0, max_tokens=3000)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")
Settings.chunk_size = 256
Settings.chunk_overlap = 64
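For what it's worth, the chunk settings only take effect when documents are parsed at index-build time; at chat time the pieces that matter are the llm and embed_model, which could also be passed explicitly instead of through the globals. A rough sketch, assuming the standard keyword arguments:

# Same configuration without the globals (sketch; keyword args assumed)
index = load_index_from_storage(
    storage_context,
    embed_model=OpenAIEmbedding(model="text-embedding-3-large"),
)
engine = index.as_chat_engine(
    chat_mode="context",
    llm=OpenAI(model=GPT_MODEL, temperature=0.0, max_tokens=3000),
    system_prompt=prompt,
)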
This is how I create the chat_history:
from llama_index.core.llms import ChatMessage, MessageRole

prior_conv.append(
    ChatMessage(
        role=MessageRole.USER, content=START_CONTENT + question + END_CONTENT
    )
)
prior_conv.append(
    ChatMessage(
        role=MessageRole.ASSISTANT, content=START_CONTENT + answer + END_CONTENT
    )
)
What error are you facing? Can you share that as well?
No error, it just won't answer follow-up questions anymore, even though my prompt didn't change.
You can check whether the follow-up questions are retrieving the correct nodes or not.
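Something roughly like this, reusing your variable names, would show what a follow-up actually retrieves (followup_question is just a stand-in):

# Ask the follow-up and dump the retrieved source nodes with their similarity scores
response = engine.chat(followup_question, chat_history=prior_conv)
for node in response.source_nodes:
    print(node.score, node.get_content()[:200])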
I'm not sure what you mean by "it won't answer follow-up questions"?
How are you calling the engine? Are you passing in that chat history list?
Ok, let's say I ask the chatbot how to upload data, and it gives me a 5-step plan: step 1, click here; step 2, do this; et cetera. Now say step 3 is a little unclear to me, so as a follow-up I might ask: "Can you explain step 3 a little more clearly?" That's what breaks with the new embedding model.
This is how I call the chat engine: response = engine.chat(question, chat_history=prior_conv)
Should the final nodes used for the answer include the provided chat_history?
Anything obvious I'm missing or do you think the problem is more nuanced?
@WhiteFang_Jr @Logan M Could it be that I have to implement ChatMemory in my code to get the chat functionality working with the newer llama-index versions? I currently manage that manually.
Seems like however you manage it manually might be buggy? Both approaches should work fine
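For reference, the built-in route would look roughly like this (ChatMemoryBuffer is the default memory used by the chat engines):

from llama_index.core.memory import ChatMemoryBuffer

# Let the engine manage history itself instead of passing chat_history on every call
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=prompt,
)
response = engine.chat(question)  # history accumulates inside `memory` across calls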
The way I manage it manually is by just keeping the 3 most recent messages; anything beyond that gets cut off. That worked fine, but when I switch to the new embedding model and use Settings, it suddenly stops working, which is really frustrating.
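The truncation itself is nothing fancy, roughly this:

# Keep only the 3 most recent messages before each call (simplified sketch)
prior_conv = prior_conv[-3:]
response = engine.chat(question, chat_history=prior_conv)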
@Torsten possible to make a Colab notebook to reproduce? I'm sure it's an easy fix
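One thing worth double-checking in that notebook: whether the index was rebuilt after switching embedding models, since vectors persisted under the old model won't line up with queries embedded by text-embedding-3-large. A minimal rebuild sketch (the docs directory name is a placeholder):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Re-embed the source documents with the current Settings.embed_model and persist fresh
documents = SimpleDirectoryReader(f"{product_code}_docs").load_data()
index = VectorStoreIndex.from_documents(documents)
index.storage_context.persist(persist_dir=f"{product_code}_llama")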