Ok next issue 😊 trying to get streaming

Ok next issue 😊 trying to get streaming to work. I'm using a langchain LLM (HuggingFaceTextGenInference) and the streaming works from my inference endpoint. However, when using it with llama_index I get the error "LLM must support streaming".
wow that's annoying lol

Our streaming looks for an attribute called "streaming", but that particular LLM uses an attribute called "stream" 🤦‍♂️
One quick workaround is this

Plain Text
from langchain.llms import HuggingFaceTextGenInference

class LlamaHFTextGen(HuggingFaceTextGenInference):
    streaming: bool = True  # the attribute llama_index's streaming check looks for
Basically override the langchain class to add that parameter
75% sure that will work lol
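For anyone reading later, a minimal usage sketch of that subclass; the endpoint URL and generation settings below are placeholders, not from this thread:

Plain Text
# Point the patched class at your own text-generation-inference server;
# the URL here is a placeholder.
llm = LlamaHFTextGen(
    inference_server_url="http://localhost:8080/",
    max_new_tokens=256,
)
assert llm.streaming  # llama_index's streaming check now passes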
Thanks! I actually just got it to work: I set streaming = True in the HuggingFaceTextGenInference class and NOWHERE else!
well, that works too! lol
Yeah, and it streams for both RetrieverQueryEngine and index.as_chat_engine so far
About to try others
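A hedged end-to-end sketch of the streaming query side, reusing the llm instance from the sketch above and assuming the ServiceContext-era llama_index API; the data directory and question are illustrative:

Plain Text
from llama_index import VectorStoreIndex, ServiceContext, SimpleDirectoryReader
from llama_index.llms import LangChainLLM

# Wrap the langchain LLM so llama_index can drive it.
service_context = ServiceContext.from_defaults(llm=LangChainLLM(llm=llm))

documents = SimpleDirectoryReader("data").load_data()  # placeholder path
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# streaming=True makes query() return a streaming response
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What is this document about?")
response.print_response_stream()  # prints tokens as they arrive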