To set up a RAG pipeline, do we need to run the lines below every time?

```python
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
Settings.llm = Ollama(model="llama3", request_timeout=360.0)
```

can’t we just load llama3 and the embedding model offline from a local directory?
Yes, they're cached locally in ~/.cache/huggingface/hub by default. If you want to use a different local directory, set the HUGGINGFACE_HUB_CACHE environment variable.
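For example (a minimal sketch; the paths are made up, swap in your own):

```python
import os

# Assumption: set the cache path before importing any Hugging Face / LlamaIndex
# modules, since huggingface_hub reads the variable when it is first imported
os.environ["HUGGINGFACE_HUB_CACHE"] = "/home/user/hf-cache"

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# First run downloads into /home/user/hf-cache; later runs load from there
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")

# Fully offline alternative: point model_name at a local copy of the model files
# (hypothetical path containing config.json, tokenizer files, weights, etc.)
embed_model = HuggingFaceEmbedding(model_name="/home/user/models/bge-base-en-v1.5")
```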
But the two lines are still needed. You have to load an embedding model and an LLM from somewhere, since queries get embedded and the LLM generates the responses 👍
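Here's roughly how those two lines fit into a full pipeline (a sketch; ./data is a placeholder for your own documents directory):

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Run once per process: every query needs both models
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
Settings.llm = Ollama(model="llama3", request_timeout=360.0)

documents = SimpleDirectoryReader("./data").load_data()  # placeholder directory
index = VectorStoreIndex.from_documents(documents)

# The embedding model embeds the query, the LLM writes the answer
response = index.as_query_engine().query("What do these documents say?")
print(response)
```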
Thanks @Rohan. What about the llama3 model? Where does that one get saved?
In ~/.ollama/models by default; it can be changed via the OLLAMA_MODELS environment variable.
Thanks @Rohan. Would you mind giving an example? All I know is that I should run ollama pull llama3, which I believe saves the model in the directory you just mentioned. How do I combine the OLLAMA_MODELS variable with the ollama pull command?
Sure. OLLAMA_MODELS is read by the ollama server process, so set it when the server starts:

```
OLLAMA_MODELS="/home/user/models" ollama serve
```

After that, ollama pull llama3 stores the model in /home/user/models instead of the default location.
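On the Python side nothing changes either way; the LlamaIndex Ollama class just talks to the running server over HTTP (the base_url below is the default):

```python
from llama_index.llms.ollama import Ollama

# The client only knows the server's address; the server's OLLAMA_MODELS
# directory never shows up in Python code
llm = Ollama(model="llama3", base_url="http://localhost:11434", request_timeout=360.0)
```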