
Updated 2 months ago

ContextChatEngine

Need your help again!! 🥺

So I have built a LlamaIndex-based app and used FastAPI to create the relevant APIs.

My app has a feature to chat with the video transcript. For this I have exposed 2 APIs: one returns the ContextChatEngine object, and the other takes that object and returns a response whenever the user types a query, by calling query_engine.query().

But I am not able to return this ContextChatEngine object because the ContextChatEngine class is not serializable/deserializable. Calling the API throws "TypeError: cannot pickle 'builtins.CoreBPE' object", and creating a custom response class throws an error too. Any idea how to fix this?
9 comments
hmm, why do you need to return the ContextChatEngine?
could you provide some snippet?
So the experience is like this: the moment you submit the video link, I create the ContextChatEngine object by calling the "/get_chat_engine" endpoint, and then on subsequent query submissions I call the "/chat" endpoint. This is the code:

@app.get("/get_chat_engine", response_model=None)
def get_chat_engine(yt_video_link: str):
    index = initialize_index(yt_video_link)

    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=2,
    )
    response_synthesizer = get_response_synthesizer(
        response_mode='tree_summarize', use_async=True, streaming=True)

    system_prompt = f""" You are a friendly and helpful mentor whose task is to \
use ONLY the context information and no other sources to answer the question being asked.\
If you don't find an answer within the context, SAY 'Sorry, I could not find the answer within the context.' \
and DO NOT provide a generic response."""

    chat_engine = ContextChatEngine.from_defaults(system_prompt=system_prompt, retriever=retriever, response_synthesizer=response_synthesizer)
    return chat_engine


@app.get("/chat")
def chat(chat_engine: ContextChatEngine, query: str):
    response_stream = chat_engine.stream_chat(query)
    return StreamingResponse(response_stream.response_gen)
I think you will have a hard time properly serializing the chat engine or an index.

You could either serialize the main settings needed to re-construct it on the other end, or provide the chat interface itself over the API.

Tbh the second option seems like a better design
Didn't understand the second option. What do you mean by "providing the chat interface" over the API?
Also is this a FastAPI restriction, or will I have this problem with any such API framework?
I think you will have this problem with any API framework
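
To see why the framework doesn't matter, here is a minimal sketch using only the standard library. FakeEngine is a made-up stand-in, not the real ContextChatEngine, and the threading.Lock plays the role of the native tokenizer object named in the traceback (that CoreBPE is a compiled tokenizer object is an assumption based on its name). Any object that holds a handle like this fails in pickle itself, before any web framework gets involved.

```python
import pickle
import threading


class FakeEngine:
    """Hypothetical stand-in for a chat engine; NOT the real ContextChatEngine."""

    def __init__(self):
        # A lock wraps a native handle that pickle refuses to copy,
        # similar in spirit to the object named in the error above.
        self.native_handle = threading.Lock()


def try_pickle(obj):
    """Return None if pickling works, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


error = try_pickle(FakeEngine())
print(error)
```

The failure comes from one attribute deep inside the object, so wrapping the engine in a custom response class would hit the same wall.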

By providing the chat interface over the API, I mean the backend should probably be managing all the conversations/engines. Then you could have API endpoints like
@app.post("/chat/{user_id}") -- where messages get posted to a specific chat engine, managed by some ID? 🤔
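
A rough sketch of that idea, framework-free so it runs on its own: the engine stays on the server and the client only ever holds an opaque ID. EchoEngine and EngineRegistry are hypothetical names invented for illustration; in the real app the factory would build the ContextChatEngine, and the two registry methods would sit behind the create and chat endpoints.

```python
import uuid
from typing import Callable, Dict


class EchoEngine:
    """Hypothetical stand-in for a chat engine that keeps its own history."""

    def __init__(self, source: str):
        self.source = source
        self.history = []

    def chat(self, message: str) -> str:
        self.history.append(message)
        return f"[{self.source}] reply #{len(self.history)} to: {message}"


class EngineRegistry:
    """Keeps engines in server memory and hands out opaque string IDs."""

    def __init__(self, build_engine: Callable[[str], EchoEngine]):
        self._build_engine = build_engine
        self._engines: Dict[str, EchoEngine] = {}

    def create(self, source: str) -> str:
        # Only this ID crosses the API boundary, never the engine itself.
        chat_id = uuid.uuid4().hex
        self._engines[chat_id] = self._build_engine(source)
        return chat_id

    def chat(self, chat_id: str, message: str) -> str:
        engine = self._engines.get(chat_id)
        if engine is None:
            raise KeyError(f"unknown chat id: {chat_id}")
        return engine.chat(message)


registry = EngineRegistry(EchoEngine)
chat_id = registry.create("video-123")  # made-up source label
first = registry.chat(chat_id, "What is the video about?")
second = registry.chat(chat_id, "Summarize it.")
print(first)
print(second)
```

Since the ID is a plain string, it serializes trivially, which sidesteps the pickling problem entirely.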
Ohh so you mean bringing in a storage layer/DB to store the index/engine objects by user_id or something else, right?
yea possibly! or it could even be managed in-memory depending on the scale

Basically, serializing this stuff is pretty hard. It's probably better to cache active conversations in memory and, in the long term, drop and re-load the index and chat engine after a certain period of inactivity.
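
One way the in-memory caching with inactivity-based reloading could look, as a sketch only. IdleEvictingCache and the rebuild callback are hypothetical; rebuild stands in for whatever re-creates the index and chat engine for a key. The clock is injectable so the idle behaviour can be exercised without actually waiting.

```python
import time
from typing import Any, Callable, Dict, Tuple


class IdleEvictingCache:
    """Caches one engine per key and drops entries idle for too long."""

    def __init__(
        self,
        rebuild: Callable[[str], Any],
        max_idle_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._rebuild = rebuild
        self._max_idle = max_idle_seconds
        self._clock = clock
        # key -> (engine, time of last use)
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        self._evict_idle(now)
        entry = self._entries.get(key)
        # Rebuild when the key was never seen or has been evicted.
        engine = self._rebuild(key) if entry is None else entry[0]
        self._entries[key] = (engine, now)  # refresh the last-used time
        return engine

    def _evict_idle(self, now: float) -> None:
        stale = [
            key
            for key, (_, last_used) in self._entries.items()
            if now - last_used > self._max_idle
        ]
        for key in stale:
            del self._entries[key]


# Demo with a fake clock and a rebuild function that records its calls.
fake_now = [0.0]
builds = []


def rebuild(key: str) -> object:
    builds.append(key)
    return object()


cache = IdleEvictingCache(rebuild, max_idle_seconds=60, clock=lambda: fake_now[0])
engine_a = cache.get("user-1")
engine_b = cache.get("user-1")  # still cached, same object
fake_now[0] = 120.0             # pretend two minutes passed
engine_c = cache.get("user-1")  # evicted in the meantime, so rebuilt
print(builds)
```

Note that a rebuilt engine starts without the earlier chat history unless that is stored separately, and a cache like this lives in a single process, so it would not be shared across multiple workers.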

Otherwise, you'll need to send over information needed to re-construct the chat engine on the client side
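
For that last route, a minimal sketch of sending the settings rather than the engine. ChatEngineSettings is a hypothetical container; its fields mirror the arguments used in the snippet earlier in the thread, and the URL is a placeholder. Whoever receives the JSON would still have to rebuild the index and engine from these values.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChatEngineSettings:
    """Plain, JSON-friendly values needed to rebuild a chat engine."""

    yt_video_link: str
    similarity_top_k: int = 2
    response_mode: str = "tree_summarize"
    system_prompt: str = ""

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "ChatEngineSettings":
        return cls(**json.loads(payload))


settings = ChatEngineSettings(
    yt_video_link="https://example.com/watch?v=abc",  # placeholder URL
    system_prompt="Use ONLY the context information.",
)
payload = settings.to_json()                       # safe to return from an API
restored = ChatEngineSettings.from_json(payload)   # rebuilt on the other end
print(payload)
```

Only strings and numbers cross the wire here, so nothing unpicklable is involved; the cost is rebuilding the index every time the settings are turned back into an engine.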