You can create custom embeddings by adding a custom embedding model to the service context, then passing that service context when you load your docs. If you don't pass it, the default embedding model is used, i.e. OpenAI.
from llama_index import LangchainEmbedding, ServiceContext
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2"))
service_context = ServiceContext.from_defaults(chunk_size_limit=512, embed_model=embed_model)
You can set this service context as global, then you won't have to worry about passing it anywhere.
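For example (a minimal sketch, assuming a llama_index version that exposes set_global_service_context):
from llama_index import set_global_service_context
# every index and query engine will now pick up the custom embed model automatically
set_global_service_context(service_context)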
thanks for the help. But suppose I have the embeddings ready and I want to ingest these embeddings without using openai. Is it possible to do that ?
thanks for the response. I have a particular use case in mind. Suppose I Index my custom embeddings and ask a query (using custom embeddings) to get top_n results . I want to use these top_n results as input for openai to get the answer for the query. Is it possible to do this ?
Since you are only passing the embed model, the LLM part used for querying stays at the default, which is OpenAI.
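Roughly like this (just a sketch, reusing the service_context from above; documents is whatever you loaded):
from llama_index import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents, service_context=service_context)

# retrieval (top_n) uses your custom embeddings; the retrieved chunks are then
# sent to the default OpenAI LLM to synthesize the final answer
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("your question here")
print(response)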
that's great thanks for confirmation and being patient🙂
Hi, there! I was able to use a custom embedding model for indexing and retrieving, but for the final prediction I'm getting a rate limit error from OpenAI. Can you help me with this? The context size is just one paragraph. I also set the OpenAI API key.
For reference:
DEBUG:openai:api_version=None data='{"prompt": "Context information is below.\n---------------------\n\u201c Borrowing Request \u201d means a request by the Borrower for a Borrowing in accordance with Section 2.03 substantially in the form of Exhibit A.\n\n\u201c Borrower \u201d has the meaning specified in the preamble hereto.\n\n\u201c Borrowing \u201d means Loans of the same Class and Type, made, converted or continued on the same date and, in the case of Eurodollar Loans, as to which a single Interest Period is in effect.\n---------------------\nGiven the context information and not prior knowledge, answer the question: What is the borrower name ?\n", "stream": false, "model": "text-davinci-003", "temperature": 0.0, "max_tokens": 3953}' message='Post details'
I think rate limiting is handled in LlamaIndex 🧐. Did you start getting this on the first query, or did it start somewhere in between?
yes it is the first query.
Is this the right way to give the OpenAI API key? Because I'm getting the rate limit error even though I have a paid plan.
os.environ["OPENAI_API_KEY"] = "my_API_KEY"
embModelClass =INSTRUCTOR("hkunlp/instructor-xl")
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
service_context = ServiceContext.from_defaults(chunk_size_limit=512, embed_model=embed_model,llm_predictor=llm_predictor)
index=VectorStoreIndex.from_documents(
data,
service_context=service_context
# response_synthesizer=response_synthesizer,
)
What LlamaIndex version are you trying this with? Because in recent versions of LlamaIndex, llm_predictor has been renamed to llm.
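On newer versions it would look roughly like this (just a sketch, assuming a release that exposes llama_index.llms):
from llama_index.llms import OpenAI
from llama_index import ServiceContext

# llm replaces llm_predictor, and chunk_size replaces chunk_size_limit, in newer releases
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0),
    embed_model=embed_model,
    chunk_size=512,
)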
I would suggest you try it like this once
service_context = ServiceContext.from_defaults(chunk_size_limit=512, embed_model=embed_model)
Also, your embedding model is assigned to the embModelClass variable, but you're passing embed_model into the service context.
I tried this but I'm still getting the rate limit error.
Yes, that is a custom embedding model which mocks the expected behaviour, and it is working fine. The code throws the error when it tries to send a request to OpenAI.
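For reference, one way to plug an INSTRUCTOR model into LlamaIndex is through langchain's wrapper (just a sketch, assuming HuggingFaceInstructEmbeddings is available in your langchain version):
from langchain.embeddings import HuggingFaceInstructEmbeddings
from llama_index import LangchainEmbedding

embed_model = LangchainEmbedding(
    HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl")
)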
Can you try with GPT-3.5? Rate limit errors have been increasing way too much lately.
from langchain.chat_models import ChatOpenAI
llm_predictor = LLMPredictor(llm=ChatOpenAI(openai_api_key=OPENAI_API_KEY, temperature=0, max_tokens=1024, model_name="gpt-3.5-turbo"))
pass this into your service context.
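i.e. something like this, reusing your existing embed_model:
service_context = ServiceContext.from_defaults(chunk_size_limit=512, embed_model=embed_model, llm_predictor=llm_predictor)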
unfortunately, this is also throwing the same error.
Can you try making a sample request to OpenAI in a plain Python script? If that also doesn't work, then maybe it's something else.
import os
import openai
openai.api_key = os.getenv("OPENAI_API_KEY")
# quick sanity check that the API key and the OpenAI endpoint work
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="",
    temperature=1,
    max_tokens=256,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
)
this program is running without any error
lol. I guess we need to call @Logan M here 😅
Are you running in colab? I've seen that openai will severely rate limit calls from colab servers
No, I'm running a python file.
I have no idea then, I'm lost 😅
Yeah, just to be sure, I asked @VallalaDev to run a standalone OpenAI query to see if there was something wrong on OpenAI's side altogether, but it worked.
😅
Maybe you could downgrade the openai version 🤔
@WhiteFang_Jr
I changed this line
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo", openai_api_key=<key>))
and I'm getting predictions.
thanks for the help guys.
So it worked by adding openai_api_key explicitly. Maybe it is not able to pick up the env API key in the new version?
Also @VallalaDev, if you want to use GPT-3.5 you should use ChatOpenAI.
We tried ChatOpenAI with gpt-3.5-turbo, but it didn't work.
Try passing the key variable in there too
That's the only change you did right?
Oh, I see it already has the key variable. Anyway, it worked for you 😅
I'll check with your version in evening.
I also changed the model_name from text-davinci-003 to gpt-3.5-turbo.