
At a glance

The community member is trying to use the HuggingFaceInferenceAPI through llamaindex, but encounters a 403 error when accessing Llama models. The error message indicates that the model is too large to be loaded automatically and suggests using Spaces or Inference Endpoints. The community member has tried different models (Mistral, Zephyr, llama2, llama3) and generated both read and write tokens, but the issue persists.

The community member also mentions a second issue with LlamaParse: parsing 44 PDF documents to markdown succeeds, but feeding the resulting documents to MarkdownElementNodeParser raises "ParserError: Error tokenizing data. C error: EOF inside string starting at row 0". They believe this is a bug in LlamaParse and would like to connect with a developer to help fix it.

The community members in the comments try to provide suggestions and troubleshoot the issues, but there is no explicitly marked answer.

Useful resources
Spaces - Hugging Face: https://huggingface.co/spaces
Hey, I am trying to use the HuggingFaceInferenceAPI through llamaindex. It works fine for Mistral, but for Llama I'm getting this error. Note that I have access to the model and my API key is there with the request.
403 Forbidden: None.
Cannot access content at: https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B.
If you are trying to create or update content, make sure you have a token with the write role.
The model meta-llama/Meta-Llama-3-8B is too large to be loaded automatically (16GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).
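(For context, the llama-index call in question looks roughly like the sketch below. This is an assumption about the poster's setup, not their actual code; the import path matches llama-index >= 0.10 and the token value is a placeholder.)

Python
from llama_index.llms.huggingface import HuggingFaceInferenceAPI

# Placeholder token; the account behind it must have been granted access
# to the gated meta-llama repo for this call to succeed.
llm = HuggingFaceInferenceAPI(
    model_name="meta-llama/Meta-Llama-3-8B",
    token="hf_...",
)
print(llm.complete("Hello"))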
27 comments
If you provided token=api_key, not really sure what the issue is there
I did provide it. Mistral/Zephyr work fine. I even tried to generate a write token instead of read and I got the same error
Right, Mistral and Zephyr aren't protected the same way as llama3
When I try to call llama2 instead of llama3, I get that I need a Pro subscription, but I have access granted for llama2 as well, so I really don't know what's happening
But I can access llama3 without a subscription, right?
I have access to both llama2 and llama3, yet neither works
Attachment: image.png
It's just a matter of how you pass the API token. Not 100% sure, I'd have to go read some Hugging Face API docs
How is it related to the Hugging Face API docs when I am accessing it through llamaindex?
With llama-index you are creating a connection to interact with an LLM. Llama3 may have other requirements that are not being fulfilled in your request. I would suggest checking the HF page for llama3 to see what is required.
I did check and they say inference API access is there. Besides, if there were additional requirements, I wouldn't know llamaindex's syntax for them (e.g., attribute names) as you don't provide that in your documentation
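(One way to settle whether llama-index or the endpoint is at fault is to call the serverless Inference API directly with huggingface_hub, bypassing llama-index entirely. A minimal sketch, assuming the same placeholder token as above; if this also returns 403, the problem is on the HF side, not in llama-index.)

Python
from huggingface_hub import InferenceClient

# Same model and (placeholder) token as in the failing llama-index call.
client = InferenceClient(model="meta-llama/Meta-Llama-3-8B", token="hf_...")
print(client.text_generation("Hello", max_new_tokens=10))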
Anyway, forget about it. I have another issue and I would love your help with it; I believe it's a bug on your end. I am using LlamaParse to parse my PDF documents to markdown (44 documents). The parsing completed successfully, but once I use the MarkdownElementNodeParser get_nodes_from_documents method to parse them into nodes, I get an error saying "ParserError: Error tokenizing data. C error: EOF inside string starting at row 0." The documents passed come from LlamaParse directly, so this error makes no sense, implying it's a bug in LlamaParse. Note that it worked previously with a test file. I would love it if you could connect me with a dev who worked on LlamaParse to fix this supposed bug.
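(The pipeline being described is roughly the following sketch. The file path is hypothetical, the LLAMA_CLOUD_API_KEY environment variable is assumed to be set for LlamaParse, and `llm` is the LLM object constructed earlier in the thread.)

Python
from llama_parse import LlamaParse
from llama_index.core.node_parser import MarkdownElementNodeParser

# Step 1: parse PDFs to markdown with LlamaParse (hypothetical path).
parser = LlamaParse(result_type="markdown")
documents = parser.load_data("./docs/report.pdf")

# Step 2: turn the markdown documents into nodes.
node_parser = MarkdownElementNodeParser(llm=llm)
nodes = node_parser.get_nodes_from_documents(documents, show_progress=True)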
Seems like a tokenization issue with the LLM you are using? If you had a full traceback, that would be helpful
Hm, I tried both Mistral & Zephyr through the HuggingFaceInferenceAPI, but both gave the same error at around 9% completion
Let me send the traceback, one second
Python
node_parser = MarkdownElementNodeParser(llm=llm)
nodes = node_parser.get_nodes_from_documents(documents, show_progress=True)
Seems like an issue with whatever pandas is trying to parse
Hard to fix without being able to reproduce
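(As noted above, pandas is doing the table parsing inside MarkdownElementNodeParser, which reportedly feeds extracted markdown tables to pd.read_csv. A stray double quote inside a parsed table cell is enough to trip the C tokenizer; a minimal reproduction with pandas alone:)

Python
import pandas as pd
from io import StringIO

# An unterminated quoted field raises the same error seen above:
# pandas.errors.ParserError: Error tokenizing data. C error: EOF inside
# string starting at row 0
pd.read_csv(StringIO('"col_a|col_b\nvalue_1|value_2\n'), sep="|")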
Should I try GPT as my LLM? Although I don't think that's the problem
I think it's probably more related to the content being parsed, yeah
I will parse each document individually to find out which one is troublesome.
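(A minimal sketch of that bisection, assuming the `documents` list and `node_parser` from the snippet above:)

Python
# Run the node parser over one document at a time and record failures.
bad_docs = []
for i, doc in enumerate(documents):
    try:
        node_parser.get_nodes_from_documents([doc])
    except Exception as exc:
        print(f"document {i} failed: {exc}")
        bad_docs.append((i, doc))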