The community member built a llama-index-based app using FastAPI and exposed two APIs: one to create a ContextChatEngine object and another to return a response when the user queries. They initially tried to return the ContextChatEngine object as the response to the POST call, but this threw a "TypeError: cannot pickle 'builtins.CoreBPE' object" error because the object is not serializable/deserializable.
The community member then tried to store the ContextChatEngine object in-memory, but storing it in JSON threw the same error, and storing it in a global variable did not work across API calls either. They eventually discovered that the uvicorn workers weren't sharing state among themselves, so the state of the global variable wasn't shared between requests, which caused the error.
The answer provided by the community members is that it's better to put globals on the FastAPI app object, rather than using global variables directly.
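A minimal sketch of that suggestion, keeping the two endpoints from the thread below but hanging the per-session dict off app.state instead of a module-level global (build_engine here is a hypothetical stand-in for the index/retriever setup shown later in the thread):

```python
from fastapi import FastAPI, Request

app = FastAPI()
app.state.chat_engines = {}  # per-session engines live on the app object

@app.post("/create_chat_engine", response_model=None)
def create_chat_engine(yt_video_link: str, session_id: str, request: Request):
    # build_engine is a hypothetical helper wrapping the index/retriever
    # setup from the code further down in this thread
    request.app.state.chat_engines[session_id] = build_engine(yt_video_link)
    return {"status": "created"}

@app.get("/chat")
def chat(query: str, session_id: str, request: Request):
    chat_engine = request.app.state.chat_engines.get(session_id)
    if chat_engine is None:
        return {"error": "no chat engine for this session"}
    return {"response": str(chat_engine.chat(query))}
```

Note that app.state is still per-process: with multiple uvicorn workers each worker holds its own copy of it, which is exactly the pitfall uncovered at the end of the thread.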
So I have built this llama-index-based app and used FastAPI to create the relevant APIs. My app has a feature to chat with the video transcript. For this I have exposed 2 APIs: one POST API to create the ContextChatEngine object, and another that uses this object (by calling query_engine.query()) to return a response whenever the user types a query.
Earlier I was trying to return the ContextChatEngine object as a response to the POST call, but because the ContextChatEngine object is not serializable/deserializable, calling the API was throwing "TypeError: cannot pickle 'builtins.CoreBPE' object", and you suggested I store the object in-memory. However, trying to store the ContextChatEngine object in JSON throws the same error (it can't be serialized/deserialized).
For reference, my code is:

```python
def create_chat_engine(yt_video_link: str, session_id: str):
    index = initialize_index(yt_video_link)

    system_prompt = f"""You are a friendly and helpful mentor whose task is to \
use ONLY the context information and no other sources to answer the question being asked. \
If you don't find an answer within the context, SAY 'Sorry, I could not find the answer within the context.' \
and DO NOT provide a generic response."""
```
Any ideas on how I can store the ContextChatEngine object and use it every time the user sends a query? I feel like I need to look at this some other way and need some more llama-index information around ChatEngines.
Global variables aren't working in API calls. That is what I had originally done: creating a global dict and storing the object per session_id. The first POST API call creates the chat_engine and the next one retrieves it and uses it to chat.
The code for your reference:

```python
app = FastAPI()
chat_engines_dict = {}

@app.post("/create_chat_engine", response_model=None)
def create_chat_engine(yt_video_link: str, session_id: str):
    index = initialize_index(yt_video_link)
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=2,
    )
    response_synthesizer = get_response_synthesizer(
        response_mode='tree_summarize',
        use_async=True,
        streaming=True,
    )
    system_prompt = f"""You are a friendly and helpful mentor whose task is to \
use ONLY the context information and no other sources to answer the question being asked. \
If you don't find an answer within the context, SAY 'Sorry, I could not find the answer within the context.' \
and DO NOT provide a generic response."""
    chat_engine = ContextChatEngine.from_defaults(
        system_prompt=system_prompt,
        retriever=retriever,
        response_synthesizer=response_synthesizer,
    )
    global chat_engines_dict
    chat_engines_dict[session_id] = chat_engine

@app.get("/chat")
def chat(query: str, session_id: str):
    global chat_engines_dict
    chat_engine = chat_engines_dict[session_id]  # <---- this is returning None
```
So it turns out it was happening because the uvicorn workers weren't sharing state among themselves: each worker had its own copy of the global variable, hence the error. @Logan M Thanks for lending your time and support, Logan! You are amazing. ❤️
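For anyone who hits the same wall: since the engine object can't be pickled into an external store, the simplest workaround is to run a single worker process so the in-memory dict (or app.state) is shared by every request. A minimal sketch, where the "main:app" import string is an assumption about the module name:

```python
import uvicorn

if __name__ == "__main__":
    # one worker process, so all requests see the same in-memory state
    uvicorn.run("main:app", host="0.0.0.0", port=8000, workers=1)
```

With more than one worker, a session created in one process is invisible to the others; the alternatives there are sticky sessions at the load balancer or rebuilding the engine on each request.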