Rate limit

But I guess the main reason is that, even though I am using an LLM from HuggingFaceHub via the ServiceContext, it is still looking for OpenAI

File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\tenacity__init.py", line 382, in call__
result = fn(args, kwargs) ^^^^^^^^^^^^^^^^^^^ File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\llama_index\embeddings\openai.py", line 150, in get_embeddings data = openai.Embedding.create(input=list_of_text, model=engine, kwargs).data ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\openai\api_resources\embedding.py", line 33, in create response = super().create(args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\openai\api_resources\abstract\engine_apiresource.py", line 153, in create response, , api_key = requestor.request(
^^^^^^^^^^^^^^^^^^
File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\openai\api_requestor.py", line 230, in request
resp, got_stream = self._interpret_response(result, stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\openai\api_requestor.py", line 624, in _interpret_response
self._interpret_response_line(
File "C:\Users\yoges\anaconda3\envs\langchain\Lib\site-packages\openai\api_requestor.py", line 687, in _interpret_response_line
raise self.handle_error_response(
openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details.
Now the error is being generated in langchain land, but the suggested version does not solve it
8 comments
Looks like a rate limit error now 😅 Do you have payment info on your OpenAI account?
My OpenAI quota is exhausted, that's why I'm trying a HuggingFace LLM
But LangChain does not know about that, I think
Should the LlamaIndex ServiceContext pass that info to LangChain by some means?
Right. You'll still need to set an embed model though (LlamaIndex uses two separate models: an LLM and an embedding model)

You can run a local embed model from huggingface to avoid openai

https://gpt-index.readthedocs.io/en/latest/how_to/customization/embeddings.html#custom-embeddings
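
For reference, a minimal sketch of such a local embed model, assuming the 0.6.x-era LangchainEmbedding wrapper and langchain's HuggingFaceEmbeddings (the model name here is only an illustrative default):

from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding

# Embeddings are computed locally by a sentence-transformers model,
# so no OpenAI key or quota is needed for the embedding step.
embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"))

The embed_model is then passed to ServiceContext.from_defaults(embed_model=...), as in the snippet further down.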
The error mentions langchain because we use some basic langchain classes under the hood 🙂
The following seems to work, so two models, one for the LLMPredictor and one for embedding (by default it's a sentence-transformers model inside)... looks ok? @Logan M
repo_id = "tiiuae/falcon-7b"
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())

llm_predictor = LLMPredictor(llm=HuggingFaceHub(repo_id=repo_id,
model_kwargs={"temperature": 0.1, 'truncation': 'only_first',
"max_length": 512}))
service_context = ServiceContext.from_defaults(chunk_size=64, llm_predictor=llm_predictor, embed_model=embed_model)
Looks good to me! (64 chunk size is pretty small though lol but up to you)
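
For completeness, a hedged sketch of how that service context would then be used end to end, assuming the 0.6.x-era GPTVectorStoreIndex / SimpleDirectoryReader API and a hypothetical ./data folder of documents:

from llama_index import GPTVectorStoreIndex, SimpleDirectoryReader

# Load documents, build a vector index, and query it;
# embeddings come from the local HuggingFace model, completions from falcon-7b.
documents = SimpleDirectoryReader("./data").load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))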