To set up a RAG pipeline, do we need to run the lines below every time?

```python
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
Settings.llm = Ollama(model="llama3", request_timeout=360.0)
```

can’t we just load llama3 and the embedding model offline from a local directory?
Yes, they're cached locally in ~/.cache/huggingface/hub by default. If you want to use a different local directory, set the HUGGINGFACE_HUB_CACHE environment variable.
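For example (a minimal sketch; the paths are made up, swap in your own):

```python
import os

# Assumption: set the cache path before importing any Hugging Face / LlamaIndex
# modules, since huggingface_hub reads the variable when it is first imported
os.environ["HUGGINGFACE_HUB_CACHE"] = "/home/user/hf-cache"

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# First run downloads into /home/user/hf-cache; later runs load from there
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")

# Fully offline alternative: point model_name at a local copy of the model files
# (hypothetical path containing config.json, tokenizer files, weights, etc.)
embed_model = HuggingFaceEmbedding(model_name="/home/user/models/bge-base-en-v1.5")
```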
But the two lines are still needed. You have to load an embedding model and an LLM from somewhere, since queries get embedded and the LLM generates the responses 👍
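Here's roughly how those two lines fit into a full pipeline (a sketch; ./data is a placeholder for your own documents directory):

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Run once per process: every query needs both models
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
Settings.llm = Ollama(model="llama3", request_timeout=360.0)

documents = SimpleDirectoryReader("./data").load_data()  # placeholder directory
index = VectorStoreIndex.from_documents(documents)

# The embedding model embeds the query, the LLM writes the answer
response = index.as_query_engine().query("What do these documents say?")
print(response)
```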
Thanks @Rohan. What about the llama3 model? Where does that one get saved?
In ~/.ollama/models by default; it can be changed via the OLLAMA_MODELS environment variable.
Thanks @Rohan. Would you mind giving an example? All I know is that I should run ollama pull llama3, which I believe saves the model in the directory you just mentioned. How do I combine the OLLAMA_MODELS variable with the ollama pull command?
Sure. OLLAMA_MODELS is read by the ollama server process, so set it when the server starts:

```
OLLAMA_MODELS="/home/user/models" ollama serve
```

After that, ollama pull llama3 stores the model in /home/user/models instead of the default location.
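On the Python side nothing changes either way; the LlamaIndex Ollama class just talks to the running server over HTTP (the base_url below is the default):

```python
from llama_index.llms.ollama import Ollama

# The client only knows the server's address; the server's OLLAMA_MODELS
# directory never shows up in Python code
llm = Ollama(model="llama3", base_url="http://localhost:11434", request_timeout=360.0)
```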