Can't make it work either, I get a 500 error returned
query_text = data.get("Prompt")
query_engine = index.as_chat_engine(similarity_top_k=3, text_qa_template=qa_template)
response = query_engine.chat(query_text)
I might need to give it a chat history and pass it as a parameter?
File "c:\Projets\IA Chat Local\Sources\AzureOpenAI\app.py", line 91, in get_json
response = query_engine.chat(query_text)
No, it prepares the chat history by itself if one isn't already present
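For example, something like this — just a rough sketch, assuming a recent llama_index where the chat engine exposes chat_history and reset():

chat_engine = index.as_chat_engine(similarity_top_k=3, text_qa_template=qa_template)

# first turn: nothing to remember yet
first = chat_engine.chat("What topics do the documents cover?")
# second turn: the engine reuses the history it stored from the first call
second = chat_engine.chat("Can you expand on the first one?")

print(chat_engine.chat_history)  # the messages exchanged so far
chat_engine.reset()              # wipes the stored history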
Can you share the entire error?
This seems more like an OpenAI error
Try passing the service context in the chat engine once
from langchain.chat_models import ChatOpenAI
from llama_index import LLMPredictor, ServiceContext
llm_predictor = LLMPredictor(llm=ChatOpenAI(openai_api_key="YOUR_API_KEY", temperature=0, max_tokens=1024, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, chunk_size_limit=512)
Try with this once
I just passed the service context: query_engine = index.as_chat_engine(similarity_top_k=3, text_qa_template=qa_template, service_context=service_context)
However, it seems a little shaky with the chat history
Just a note, you can also set a global service context so that you don't have to worry about passing it in everywhere
from llama_index import set_global_service_context
set_global_service_context(service_context)
<continue with program>
Nope, it only works by setting it on the chat_engine
But when I ask it questions like "what was the question I just asked you before?", it tells me that it doesn't know
Hmm, I think that's a symptom of how the condense engine works
We probably need a simpler implementation that doesn't re-phrase the query every time
On every input it uses the chat history to re-write the user query, and uses that to search
The react agent is probably more along the lines of what you want
but that's basically just langchain
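For reference, switching to it is roughly this — a sketch, assuming your llama_index version accepts chat_mode="react" in as_chat_engine:

# ReAct mode wraps the index as a tool and lets the LLM decide when to call it
chat_engine = index.as_chat_engine(chat_mode="react", verbose=True)
response = chat_engine.chat(query_text)  # verbose=True prints the agent's tool calls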
Yes, it won't be able to answer those kinds of questions
Actually, the way condense mode works is:
There are two LLM calls.
The first one forms a standalone question from the user query and the chat history, and that question is what gets asked against our indexes.
So if you ask "what was the last question?", it will pick the last question out of the chat history, but that question is then used in the second LLM call, so you get a response based on that last question and not on your actual question
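Roughly like this — a sketch of the condense_question mode, assuming verbose=True makes the re-written question visible in the logs:

chat_engine = index.as_chat_engine(chat_mode="condense_question", similarity_top_k=3, text_qa_template=qa_template, verbose=True)

chat_engine.chat("What does the document say about X?")
# LLM call 1: condenses the follow-up plus the chat history into a standalone question
# LLM call 2: that standalone question is run against the index and answered
chat_engine.chat("And what about Y?")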
What bothered me with the react engine is (I guess) that it can choose between the index and its own knowledge. Since I want it restricted to only my vector index, I don't think that solution can be used...
In the end I just want to track the chat history and put it in my JSON file. It's just a chatbot... If it can answer questions about the previous questions that's good, but the main thing I'm working on right now is keeping track of that chat history, then giving my engine the template and the chat history (which I don't know how to implement yet)
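Something like this might work for passing it in — a rough sketch, assuming a llama_index version where chat() accepts a chat_history of ChatMessage objects (the example messages are made up):

from llama_index.llms import ChatMessage, MessageRole

# history you tracked yourself, e.g. rebuilt from your JSON file
chat_history = [
    ChatMessage(role=MessageRole.USER, content="Example earlier question"),
    ChatMessage(role=MessageRole.ASSISTANT, content="Example earlier answer"),
]

chat_engine = index.as_chat_engine(similarity_top_k=3, text_qa_template=qa_template, service_context=service_context)
response = chat_engine.chat(query_text, chat_history=chat_history)  # explicit history overrides the stored one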
Found out how to format my output in JSON
Need to implement the history now!
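If it helps, dumping and reloading the history could look like this — just a sketch, assuming the ChatMessage-based history from recent llama_index versions; the file name is arbitrary:

import json
from llama_index.llms import ChatMessage

def save_history(chat_engine, path="chat_history.json"):
    # serialize the engine's stored messages as plain role/content pairs
    messages = [{"role": m.role.value, "content": m.content} for m in chat_engine.chat_history]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(messages, f, ensure_ascii=False, indent=2)

def load_history(path="chat_history.json"):
    # rebuild ChatMessage objects to pass back into chat(..., chat_history=...)
    with open(path, encoding="utf-8") as f:
        return [ChatMessage(role=m["role"], content=m["content"]) for m in json.load(f)]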