Ok next issue 😊 trying to get streaming

Ok next issue 😊 trying to get streaming to work. I'm using a langchain LLM (HuggingFaceTextGenInference) and the streaming works from my inference endpoint. However, when using it with llama_index I get the error "LLM must support streaming".
wow that's annoying lol

Our streaming looks for an attribute called "streaming", but that particular LLM uses an attribute called "stream" 🤦‍♂️
One quick workaround is this

Plain Text
from langchain.llms import HuggingFaceTextGenInference

class LlamaHFTextGen(HuggingFaceTextGenInference):
    streaming: bool = True  # the attribute llama_index's streaming check looks for
Basically override the langchain class to add that parameter
75% sure that will work lol
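For anyone reading later, a minimal usage sketch of that subclass; the endpoint URL and generation settings below are placeholders, not from this thread:

Plain Text
# Point the patched class at your own text-generation-inference server;
# the URL here is a placeholder.
llm = LlamaHFTextGen(
    inference_server_url="http://localhost:8080/",
    max_new_tokens=256,
)
assert llm.streaming  # llama_index's streaming check now passes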
Thanks! I actually just got it to work: I set streaming = True in the HuggingFaceTextGenInference class and NOWHERE else!
well, that works too! lol
Yeah, and it streams for both RetrieverQueryEngine and index.as_chat_engine so far
About to try others
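A hedged end-to-end sketch of the streaming query side, reusing the llm instance from the sketch above and assuming the ServiceContext-era llama_index API; the data directory and question are illustrative:

Plain Text
from llama_index import VectorStoreIndex, ServiceContext, SimpleDirectoryReader
from llama_index.llms import LangChainLLM

# Wrap the langchain LLM so llama_index can drive it.
service_context = ServiceContext.from_defaults(llm=LangChainLLM(llm=llm))

documents = SimpleDirectoryReader("data").load_data()  # placeholder path
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# streaming=True makes query() return a streaming response
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What is this document about?")
response.print_response_stream()  # prints tokens as they arrive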