
Updated 2 years ago

HuggingFace LLM

At a glance

A community member is trying to use the HuggingFaceHub LLM from LangChain together with LlamaIndex, but the solution another community member shared on GitHub is not working for them: instead of raising an error, the query returns "Empty Response". The other community member suggests checking the prompt helper settings, specifically the context window and chunk size, since the Falcon model only supports a context window of 2048 tokens. They recommend setting the context window to 2048 and the chunk size to 512, and also passing chunk_size=512 directly to the service context so the node parser uses the same chunk size.

Useful resources
@Logan M I'm trying to use the HuggingFaceHub from LangChain along with LlamaIndex. I tried the solution you shared on GitHub, but somehow it's not working for me. Can you please help me out?

Here's your GitHub response:
https://github.com/jerryjliu/llama_index/issues/3290#issuecomment-1546914037
Ah right right. What's the error you are getting with this setup?
Umm it's not throwing any error but just outputting "Empty Response"
Hmm. I'm not 100% sure how huggingface hub works. What are your prompt helper settings?
Normally with local HuggingFace models you'd set max_tokens or max_new_tokens; not sure what it defaults to doing πŸ˜…
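(If the default max_new_tokens is very small, the model may return little or no text, which would surface as "Empty Response". A minimal sketch of setting it explicitly through LangChain's HuggingFaceHub wrapper; the repo_id and generation values below are illustrative assumptions, not taken from the thread.)

Plain Text
from langchain.llms import HuggingFaceHub

# Hypothetical example: repo_id and generation settings are assumptions.
# HuggingFaceHub reads the API token from the HUGGINGFACEHUB_API_TOKEN env var.
llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={
        "temperature": 0.1,
        "max_new_tokens": 256,  # give the model room to actually generate output
    },
)
print(llm("Q: What is LlamaIndex?\nA:"))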
Well I can send you once I'm back in front of my pc πŸ˜…
So here's my prompt helper:

Plain Text
prompt_helper = PromptHelper(
    context_window=context_window,
    num_output=num_output,
    chunk_size_limit=chunk_size_limit,
)
But what are the values being used there? πŸ‘€
Plain Text
context_window=4096
num_output=256
chunk_size_limit=2048
Ooo there's one issue. I think Falcon only supports a context window of 2048.

Try setting the context window to 2048, and the chunk size to 512 maybe?
Also pass chunk_size=512 directly to the service context, so the node parser uses that chunk size as well
Okay, will do that then πŸ˜€
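(Putting the suggestions together, here's a rough sketch of the corrected setup. Only context_window=2048 and chunk_size=512 come from the thread; the Falcon repo_id, num_output, data path, and query are placeholder assumptions, and the exact imports depend on the llama_index version in use.)

Plain Text
from langchain.llms import HuggingFaceHub
from llama_index import (
    LLMPredictor,
    PromptHelper,
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,  # may be named GPTVectorStoreIndex in older releases
)

# Falcon only supports a 2048-token context window, so keep the
# prompt helper within that limit.
prompt_helper = PromptHelper(
    context_window=2048,
    num_output=256,
    chunk_size_limit=512,
)

# Hypothetical repo_id; max_new_tokens is set explicitly so the model
# returns text instead of an empty response.
llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    model_kwargs={"temperature": 0.1, "max_new_tokens": 256},
)

service_context = ServiceContext.from_defaults(
    llm_predictor=LLMPredictor(llm=llm),
    prompt_helper=prompt_helper,
    chunk_size=512,  # also makes the node parser split documents into 512-token chunks
)

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
print(index.as_query_engine().query("What is this document about?"))

(The chunk_size=512 on the service context is what makes the node parser produce chunks that fit comfortably inside Falcon's 2048-token window.)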