Embeddings

Hey @Logan M, can you please help me here? I've been searching around for this a lot but haven't found anything. This would really help.
My apologies! Thought I caught that earlier

Any embedding model from Hugging Face can be used, following this guide:

https://gpt-index.readthedocs.io/en/latest/how_to/customization/embeddings.html#custom-embeddings
You can pass in any model name from Hugging Face; the default model it loads is mpnet-v2, and all of those models run locally.
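For example, to pick a specific model (the id below is just the default spelled out explicitly; swap in any sentence-transformers model from the hub):
Python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding

# any Hugging Face / sentence-transformers model id can go here;
# this one is just the default (mpnet-v2) written out explicitly
embed_model = LangchainEmbedding(
    HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
)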
Thank you so much, really appreciated
Hey @Logan M, I've just used this much of the code and it's asking for the OpenAI API key. Is that right, or am I missing something?
Python
from llama_index import GPTListIndex, SimpleDirectoryReader, LangchainEmbedding, ServiceContext
from langchain.embeddings.huggingface import HuggingFaceEmbeddings

# load in HF embedding model from langchain
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())

# use the local embedding model by default for all indexes
service_context = ServiceContext.from_defaults(embed_model=embed_model)
Yes! It will still need OpenAI for the LLM predictor, which is a separate model. LLMs need much more powerful hardware to run locally than the embedding models do, but it is possible
There are some good examples of using custom LLMs with LlamaIndex here (although none of those models actually work that great with LlamaIndex tbh lol, maybe you can find a better one)

https://github.com/autratec
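For the general shape of the pattern, here's a rough sketch based on the custom-LLM example in the LlamaIndex docs from around that time. The model name and generation settings are placeholders, not recommendations, and depending on your langchain version you may need to tweak how the class attributes are declared:
Python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.llms.base import LLM
from transformers import pipeline
from llama_index import LangchainEmbedding, LLMPredictor, ServiceContext


class LocalLLM(LLM):
    # placeholder model, chosen only for illustration; swap in any HF text-generation model
    model_name = "facebook/opt-iml-1.3b"
    pipeline = pipeline("text-generation", model=model_name)

    def _call(self, prompt, stop=None):
        # generate locally and strip the echoed prompt from the output
        text = self.pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        return text[len(prompt):]

    @property
    def _identifying_params(self):
        return {"model_name": self.model_name}

    @property
    def _llm_type(self):
        return "custom"


# local LLM for the predictor plus local embeddings, so nothing calls OpenAI
llm_predictor = LLMPredictor(llm=LocalLLM())
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, embed_model=embed_model
)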
I just want to get rid of the cost of embedding the documents and saving them.

For querying the embedded data, we can go with OpenAI.
@Logan M Do we have any such codebase or reference? It would save a lot of cost.
The example you had above will save you the costs of embeddings πŸ‘Œ
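For reference, a rough end-to-end sketch of that flow, assuming a 0.6.x-era llama_index (the data directory, persist path, and query string are placeholders): local embeddings handle indexing and the query embedding, the index is persisted so documents aren't re-embedded, and OpenAI is only called to synthesize the answer at query time.
Python
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import (
    GPTVectorStoreIndex,
    LangchainEmbedding,
    ServiceContext,
    SimpleDirectoryReader,
)

# local embeddings -> no embedding cost; the default OpenAI LLM is only used at query time
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
service_context = ServiceContext.from_defaults(embed_model=embed_model)

documents = SimpleDirectoryReader("./data").load_data()  # placeholder data directory
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

# persist the index so the documents are not re-embedded on the next run
index.storage_context.persist(persist_dir="./storage")  # placeholder path

# the query itself is embedded locally; OpenAI only writes the final answer
response = index.as_query_engine().query("What do these documents cover?")
print(response)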