Updated 7 months ago

Does anyone know how i can get the last

Does anyone know how I can get the last token of stream_chat()?
E.g., I am sending the results of stream_chat() as JSON over FastAPI to a React frontend, and would like to have a flag for is_done. However, when I check the is_done flag on the StreamingChatResponse, it is already set to True when it is NOT the last token, while I am still iterating and sending the response. I'm guessing this is because of the lag between when Ollama finishes its response and sets the flag and when I actually check the flag.

Is there any way I can check for the last token/response generated?

code extracts as follows:
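import asyncio
import json

async def astreamer(response, model_used):
    try:
        # Iterate over the tokens produced by stream_chat()
        for i in response.response_gen:
            # Flag as observed at this point; it can already be True
            # while there are still tokens left to iterate over
            if response._is_done:
                print("IS DONE!")
            else:
                print("IS NOT DONE!")
            yield json.dumps(i)
            create_json_response()
            await asyncio.sleep(.1)
    except asyncio.CancelledError:
        print('cancelled')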

@app.post("/chat")
async def chat(request: Request):
    ...
    response = chat_engine_dict["engine"].stream_chat(query)
    return StreamingResponse(
        astreamer(response, model_used=model_used),
        media_type="text/event-stream",
    )
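
One way to flag the final chunk without reading the is_done flag at all is to look one item ahead in the generator: an item is the last one exactly when the iterator has nothing after it. The following is a minimal sketch of that idea in plain Python. The names with_last_flag and to_json_chunks and the "token"/"is_done" JSON keys are hypothetical, chosen for illustration, and a plain list stands in for response.response_gen.

```python
import json
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def with_last_flag(items: Iterable[T]) -> Iterator[Tuple[T, bool]]:
    """Yield (item, is_last) pairs by holding back one item at a time."""
    iterator = iter(items)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # empty stream: nothing to yield
    for current in iterator:
        # Something follows `previous`, so it cannot be the last item
        yield previous, False
        previous = current
    # The iterator is exhausted, so `previous` is the last item
    yield previous, True


def to_json_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Serialize each token together with an is_done flag (hypothetical schema)."""
    for token, is_last in with_last_flag(tokens):
        yield json.dumps({"token": token, "is_done": is_last})


# A plain list stands in for response.response_gen here
chunks = list(to_json_chunks(["Hel", "lo", "!"]))
```

The trade-off is that every token is sent one step late, because the wrapper has to wait for the next token (or the end of the stream) before it knows whether the current one is the last. A simpler alternative is to send one extra message with is_done set to true after the for loop finishes.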
7 comments
The is_done flag isn't entirely controlling iteration
Is there any way for me to check for the last token generated?
Oh hmm, maybe this condition should be an and
[Attachment: image.png]
Nvm, that if statement is right; the nots are just confusing
if not_done or not_empty
Not really sure why the last token is missing for you, I've never had that issue with Ollama
This loop iterates until both done and empty -- if done is set, there's no way for anything further to be added to the queue
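
To illustrate the point about the loop, here is a minimal sketch of a queue-backed generator that uses the same "not done or not empty" condition. It is a simplified stand-in, not the library's actual code: the TokenStream class and its method names are hypothetical. It shows that the done flag can already be True for every token the consumer sees, yet no token is lost, because the loop only stops once the queue is also empty.

```python
import queue
import threading


class TokenStream:
    """Hypothetical stand-in for a streaming response backed by a queue."""

    def __init__(self):
        self.queue = queue.Queue()
        self.is_done = False

    def produce(self, tokens):
        # Producer side: enqueue every token, then set the flag
        for token in tokens:
            self.queue.put(token)
        self.is_done = True

    def response_gen(self):
        # Consumer side: keep going until BOTH done and empty
        while not self.is_done or not self.queue.empty():
            try:
                yield self.queue.get(timeout=0.1)
            except queue.Empty:
                continue  # nothing yet; re-check the loop condition


stream = TokenStream()
producer = threading.Thread(target=stream.produce, args=(["a", "b", "c"],))
producer.start()
producer.join()  # producer has finished: flag is True, queue is still full

# Record the flag as seen by the consumer alongside each token
observed = [(token, stream.is_done) for token in stream.response_gen()]
```

Here the producer finishes before the consumer starts, so the flag reads True on every token, including the first. That matches the behaviour described in the question: the flag says the producer is finished, not that the consumer has reached the last token.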