
After the last update, in a new project using Azure OpenAI GPT-4o-mini with Pinecone, when using:

Plain Text
chat_engine = vector_index.as_query_engine(
    chat_mode="condense_plus_context",
    llm=azure_gpt4o_mini,
    memory=redis_memory,
    context_prompt="""Your name is John.... {context_str}...""",
    query_str=user_input,
    filters=combined_filters,
    similarity_top_k=3,
    verbose=True,
)

The context_prompt is "being ignored", although {context_str} is being filled in.
When asked "What is your name?" the model does not know: it retrieves context from the index (I used "Alice in Wonderland" as an example) and answers from that, not from the "Your name is John" instruction.
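For clarity on what should be happening: context_prompt is a plain template in which the engine fills {context_str} with retrieved node text before sending it as the system message. A minimal pure-Python sketch of that substitution (variable names here are illustrative, not LlamaIndex API):

```python
# Illustrative template, mirroring the one passed in the call above.
context_prompt = (
    "Your name is John. Answer using only the context below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------"
)

# Stand-in for text retrieved from the Pinecone-backed index.
retrieved_text = "Alice was beginning to get very tired of sitting by her sister."

# The engine performs the equivalent of this substitution.
system_message = context_prompt.format(context_str=retrieved_text)
print(system_message)
```

If the engine were honoring the template, "Your name is John" would be in the system message alongside the retrieved text, so the model should answer the name question from the instruction, not from the index.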

hello, what is your name?
I’m afraid I can’t explain myself, as I’m not quite sure who I am at the moment.

I have tried with GPT-4o (not mini) and get the same behavior. Has anyone hit a similar issue? I have the same code in another version and I believe I never had these issues.
10 comments
Rather than abusing kwargs, why not create the chat engine directly so that you know things are being passed properly?

Plain Text
from llama_index.core.chat_engine import CondensePlusContextChatEngine

chat_engine = CondensePlusContextChatEngine.from_defaults(index.as_retriever(similarity_top_k=3, filters=filters), ....)
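A slightly fuller sketch of that direct construction. Hedged: the `memory` and `context_prompt` parameter names assume the current `llama_index.core` API, `token_limit=3000` and the prompt text are illustrative values, and the import is guarded so the sketch stands even where the package is absent:

```python
try:
    from llama_index.core.chat_engine import CondensePlusContextChatEngine
    from llama_index.core.memory import ChatMemoryBuffer
    HAVE_LLAMA_INDEX = True
except ImportError:  # package not installed; the function below just won't run
    HAVE_LLAMA_INDEX = False

# System-prompt template; the engine fills {context_str} with retrieved text.
CONTEXT_PROMPT = "Your name is John. Use the following context to answer.\n{context_str}"

def build_chat_engine(vector_index, llm, filters=None):
    """Build the chat engine explicitly so every kwarg is validated
    by the constructor instead of being silently dropped."""
    return CondensePlusContextChatEngine.from_defaults(
        vector_index.as_retriever(similarity_top_k=3, filters=filters),
        llm=llm,
        memory=ChatMemoryBuffer.from_defaults(token_limit=3000),
        context_prompt=CONTEXT_PROMPT,
        verbose=True,
    )
```

The point of constructing the engine directly is that a typo'd or unsupported keyword raises immediately, rather than vanishing into **kwargs.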
Attachment: image.png
chat_engine.update_prompts({"response_synthesizer:text_qa_template": qa_prompt_tmpl})
was needed to update the prompt; otherwise it was using the default one...
condense_plus_context does not use that prompt template πŸ‘€
that would only be for query engines πŸ€” or things that use query engines
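A quick way to check which prompt keys an engine actually exposes (and hence which `update_prompts` keys take effect) is `get_prompts()`; a sketch, assuming the standard LlamaIndex prompt-mixin API:

```python
def list_prompt_keys(engine):
    """Return the prompt keys the engine exposes; only these keys
    can be overridden through engine.update_prompts({...})."""
    return sorted(engine.get_prompts().keys())
```

A query engine should list keys like response_synthesizer:text_qa_template, while a condense_plus_context chat engine exposes its own condense/context prompts instead, which is why updating text_qa_template has no effect there.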
chat_engine = vector_index.as_query_engine(...
chat_engine.update_prompts({"response_synthesizer:text_qa_template": qa_prompt_tmpl})
oh that's confusing lol
chat_engine = vector_index.as_query_engine(...)
the variable name is chat engine, but you called index.as_query_engine() πŸ˜΅β€πŸ’«
I do it like:

Plain Text
# pinecone metadata filters
combined_filters = ...
# set pinecone index
vector_store = PineconeVectorStore(pinecone_index=pcindex)
vector_index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
# set chat memory
chat_memory = ChatMemoryBuffer.from_defaults(...)
# set chat engine
chat_engine = vector_index.as_query_engine(...)
# query
reply = chat_engine.query(user_input)

If I use a reranker, it also has its own step and is added to the chat_engine.

Want to give me a different approach suggestion?