How do I find an old thread I was on?

You can use the search above and search for your name πŸ€”
Ok so funny thing is the thread I was looking for actually had the solution, or part of it, suggested by you. It was about how to use a locally running llama.cpp server in LlamaIndex... would it be alright if I asked how to do it or where to start?
I just don't understand the syntax for including a local llm
Right so, I think you just need to use the OpenAILike LLM to connect to the llama.cpp server

Plain Text
from llama_index.llms import OpenAILike
from llama_index import ServiceContext, set_global_service_context

# api_key can be any placeholder string; the local llama.cpp server doesn't validate it
llm = OpenAILike(model="..", api_base="http://127.0.0.1:8000/api/v1", api_key="fake",
                 context_window=4096, is_chat_model=False)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=...)
set_global_service_context(service_context)
Depending on the LLM you are using, you might need to set up some prompt formatting, but it depends
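For example, Llama 2 chat models want their [INST] prompt format. A minimal sketch, assuming the messages_to_prompt/completion_to_prompt helpers from llama_index.llms.llama_utils and that your llama_index version accepts those hooks on its LLMs (the 0.9.x line does):

Plain Text
from llama_index.llms import OpenAILike
from llama_index.llms.llama_utils import messages_to_prompt, completion_to_prompt

# wrap requests in the [INST] ... [/INST] format that Llama 2 chat expects
llm = OpenAILike(
    model="..",
    api_base="http://127.0.0.1:8000/api/v1",
    api_key="fake",
    context_window=4096,
    is_chat_model=False,
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
)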
So I did see that part but I'm confused about how to implement that into the five lines of code example
Like sort of where it goes
Do I just add these lines into the five lines of code example?
pretty much πŸ™‚ I skipped setting up an embedding model, you can set embed_model="local" or configure whatever embedding you need
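Putting it together with the five-line starter, a minimal sketch (the ./data folder and the query string are just placeholders from the docs example):

Plain Text
from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
    set_global_service_context,
)
from llama_index.llms import OpenAILike

llm = OpenAILike(model="..", api_base="http://127.0.0.1:8000/api/v1", api_key="fake",
                 context_window=4096, is_chat_model=False)

# register the local LLM (and a local embedding model) before building the index
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")
set_global_service_context(service_context)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What is this document about?"))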
And if so, do my documents still get uploaded to the embedding model? I'm trying to make everything local
so for the embed model, where can I direct it to the path of the embedding model?
I can just download the BAAI model and keep it on my laptop right?
You can put embed_model="local:BAAI/bge-base-en-v1.5" and it will automatically cache it

If you want to provide it a specific path, you gotta pull out the embedding class πŸ˜…

Plain Text
from llama_index.embeddings import HuggingFaceEmbedding
embed_model = HuggingFaceEmbedding(model_name="<path_to_my_model>")
I will try that and get back to you... thank you so much!!
I'm trying a project which requires absolutely no data be transferred online, hence the questions... I'm trying to avoid using any OpenAI or other cloud based LLM sources
Is there a way for me to pin this thread for myself?
hmmm I don't think so πŸ€” Probably best to save this info somewhere lol
I'm always around to answer more as well haha
Ok gotcha, I've got the screenshots
Of course thank you!
Sorry to be back so soon, how do I download the BAAI model? It doesn't seem to be a single gguf file. Should I download it, quantize it, and convert it like I do with the raw Llama 2 model?
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5") will download the model and cache it for you

Alternatively, you can clone and download any huggingface model using something like
Plain Text
git lfs install
git clone https://huggingface.co/BAAI/bge-base-en-v1.5


Embedding models usually aren't gguf; they're usually a collection of files that huggingface and pytorch need to load the model
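If you do clone it, you can point the embedding class straight at that folder. A minimal sketch, assuming the clone landed in ./bge-base-en-v1.5 next to your script:

Plain Text
from llama_index.embeddings import HuggingFaceEmbedding

# load from the cloned folder; nothing is fetched from the Hub
embed_model = HuggingFaceEmbedding(model_name="./bge-base-en-v1.5")

# quick smoke test: bge-base produces 768-dimensional vectors
print(len(embed_model.get_text_embedding("hello world")))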
Oh ok so I can just refer to the folder in the code then?
And yeah I did that, and I suppose it's not causing any error, because the error I'm getting refers to the lack of an OpenAI API key. I'm trying to use the LlamaCPP integration for the LLM but I suppose I've done something wrong.
Can you share the code?
ah, move the service_context and set_global_service_context to before creating the index
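So roughly this ordering; a sketch assuming the LlamaCPP integration with local model files (both paths are illustrative):

Plain Text
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex, set_global_service_context
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.llms import LlamaCPP

llm = LlamaCPP(model_path="./llama-2-7b-chat.Q4_K_M.gguf", context_window=4096)
embed_model = HuggingFaceEmbedding(model_name="./bge-base-en-v1.5")

# 1) register the local models first...
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
set_global_service_context(service_context)

# 2) ...then build the index, so it never falls back to the OpenAI defaults
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)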
Oh sorry deleted it by accident
was trying to reformat it as code
I think it's working... I'll play around with it a bit
So the service_context is needed because it defines the LLM or embedding model that is used in creating the index, right?
exactly πŸ™‚
(in the future, this will be a lot easier to do! I realize it's a tad clunky)
No it's alright. I had looked into the documentation a month or two ago and things are already better... the community also seems to be pretty active
and tbh as long as it works