LlamaIndex

Updated 6 months ago

At a glance

The community member is trying to use the LlamaIndex library to query a ChromaDB vector store with a local LlamaCPP model. The initial code sets up the ChromaDB and LlamaIndex components, but querying the index raises a connection error, which suggests the library is falling back to OpenAI instead of using the local model.

The comments suggest that the community member needs to define the embedding model source in the service context, in addition to the language model. This should resolve the connection error. However, after making this change, the community member still receives an "Empty Response", which indicates the embedding model is unable to retrieve information from the vector store.

The community member tries various approaches, including rebuilding the vector store, adjusting the context window and max_new_tokens parameters, and comparing the collection UUIDs. However, they are unable to get the system to consistently respond with the expected information from the vector store.

There is no explicitly marked answer in the comments, but the community members continue to troubleshoot the issue and share their findings.

Plain Text

import chromadb
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.llms import LlamaCPP
from llama_index.storage.storage_context import StorageContext
from llama_index.vector_stores import ChromaVectorStore

# initialize client
db = chromadb.PersistentClient(path="./chroma_db")
llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path="./models/em_german_13b_v01.Q8_0.gguf",
    temperature=0.1,
    max_new_tokens=4048,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=8128,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    # model_kwargs={"n_gpu_layers": 1},
    # transform inputs into Llama2 format
    # messages_to_prompt=messages_to_prompt,
    # completion_to_prompt=completion_to_prompt,
    verbose=True,
    
)
# get collection
chroma_collection = db.get_or_create_collection("quickstart")

# assign chroma as the vector_store to the context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(llm=llm)

# load your index from stored vectors
index = VectorStoreIndex.from_vector_store(
    vector_store, storage_context=storage_context, service_context=service_context
)

# create a query engine
query_engine = index.as_query_engine()
response = query_engine.query("Hallo, wie geht es dir?")
print(response)

This gives me a connection error. It seems like it's trying to work with OpenAI. Is there a way to make it work with my own model?

54 comments

You have only defined the llm part in the service_context. It needs an embedding model as well. Since none is defined, it falls back to OpenAI for embeddings.

Just add the following and it should work!

Plain Text

service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

This way it will use the default open-source model for embeddings.

For more: https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings.html#local-embedding-models

Okay. If I define it, it gives me an "Empty Response". Is that what happens when it doesn't know what to say?

If the response comes back as "Empty Response", can you check whether the source nodes contain anything?

You can check via print(response.source_nodes)

If that is empty too, it means nothing could be retrieved from the vector store for your query.
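That check can be wrapped in a small helper. The sketch below assumes only what is stated above, that the response has a `source_nodes` list. The `score` attribute on each entry and the helper name `summarize_sources` are assumptions for illustration, and the stand-in objects at the bottom take the place of a real query response.

```python
from types import SimpleNamespace


def summarize_sources(response):
    """Return a one-line summary of how many nodes were retrieved."""
    nodes = getattr(response, "source_nodes", None) or []
    if not nodes:
        return "no source nodes: retrieval returned nothing"
    # `score` is assumed to exist on each entry; fall back to None otherwise.
    scores = [getattr(n, "score", None) for n in nodes]
    return f"{len(nodes)} source node(s), scores={scores}"


# Stand-ins for a real response (hypothetical data, for illustration only).
empty = SimpleNamespace(source_nodes=[])
hit = SimpleNamespace(source_nodes=[SimpleNamespace(score=0.5)])

print(summarize_sources(empty))
print(summarize_sources(hit))
```

With a real query you would pass the object returned by query_engine.query(...) instead of the stand-ins.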

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Empty Response
[]

It gives me that.

So yeah, it's empty.

So does that mean the DB is borked? Or does it just not have info about what I asked?

How did you create the vector store in the first place? Have you fed any data into it yet?

Yes, it does have data in it. About 10 PDFs on agricultural topics. And it's fully built. It's a ChromaDB.

But no matter what I ask, I get no response.

Lol, I was going to say try with some other queries 😅

Gonna try something that is definitely in the PDF

Still an empty response

Let's try it in the building process

Can you try creating it from the start once again: https://docs.llamaindex.ai/en/stable/examples/vector_stores/ChromaIndexDemo.html#creating-a-chroma-index ?

Could be some issue in the vectors maybe

doing that rn actually

Printing response
DerProzess„Düngen“beinhaltetmehrereSubprozesse,dieengverzahntineinandergreifen:
•„Anbauplanen“istdieVerknüpfungeinerKulturartmiteinerBewirtschaftungs-
einheitineinembestimmtenZeitraum.DieKulturart,sowiedieMengeundArt
desErnteproduktesbestimmtimWesentlichendieAusbringungvonNährstoff-
mengen.
•„Bodenuntersuchungeingeben“beinhaltetdenTransfervon(teilflächenspezifischen)BodenuntersuchungsergebnissenodervonLandesbehördenzurVerfügung
gestellteNmin-GehalteindieAnwendung.
•„Düngebedarfermitteln“umfasstdieBerechnungderNährstoffmengeeinesErn-
teproduktes,diedenNährstoffbedarfnachAbzugsonstigerverfügbarerNähr-
stoffmengenundunterBerücksichtigungderteilflächenspezifischenNährstoffver-
sorgungdesBodensabdeckt.
•„Düngebedarfdecken“isteinteilflächenspezifischerOptimierungsansatzzur
DeckungdesDüngebedarfs(sieheAbb.1).
•„Düngungplanen“liefertalsErgebniseineernteprodukt-undteilflächenspezifischePlanungvonDüngemaßnahmeninkl.Applikationskarte.
InderAnwendungmüssendieProzessevomUserdurchlaufenundggf.umfehlende Informationenangereichertwerden.InAbb.1istderProzess„Düngebedarfdecken“
exemplarischdargestellt.DerUser(„Besitzer“)wirddabeivonderAnwendung(„Sys-
tem“)durchdenProzessbegleitet.WennalleInformationenausdenvorherigenProzes-
senkorrekterfasstunddieGeschäftsregelnkorrektformuliertsind,somussderAnwen-
derlediglicheinErgebnisbestätigen.
Abb.1:AusschnittausBPMNDiagrammfür„Düngebedarfdecken“
taym@tays-MacBook-Pro AiAssistant %

Oh hell

I mean it is right

Just

it has no spaces between the words

Awesome!! Now the response is finally coming through.
Yeah, you could try giving the model instructions on how it should format its answer and ask again.
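The missing spaces may also come from the PDF text extraction rather than from the model, in which case the stored chunks themselves would already lack spaces. A rough way to test that is to measure how much whitespace a chunk contains. This is only a heuristic sketch: the function name `looks_space_stripped` and the 5% threshold are assumptions, and the spaced sample is a hand-spaced version of the first line of the output above.

```python
def looks_space_stripped(text, min_ratio=0.05):
    """Heuristic: flag text whose share of whitespace characters is very low."""
    if not text:
        return False
    spaces = sum(1 for ch in text if ch.isspace())
    return spaces / len(text) < min_ratio


# First line of the output above, as returned (no spaces).
sample_bad = "DerProzess„Düngen“beinhaltetmehrereSubprozesse,dieengverzahntineinandergreifen:"
# The same line with spaces restored by hand, for comparison.
sample_ok = "Der Prozess „Düngen“ beinhaltet mehrere Subprozesse, die eng verzahnt ineinandergreifen:"

print(looks_space_stripped(sample_bad))
print(looks_space_stripped(sample_ok))
```

If the text of the loaded documents is flagged before it is indexed, the problem sits in the PDF loading step and prompting the model differently is unlikely to fix it.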

Okay so, when I save the ChromaDB and load it again, it doesn't work. It doesn't respond.

So something is definitely borked.

Can you check whether the new file got stored in your ChromaDB?

Attachment

But i just tried sth

Gives me a File Exists error

Attachment

Can you try providing just ./chroma_db as the path? That is all the docs use:
https://docs.llamaindex.ai/en/stable/examples/vector_stores/ChromaIndexDemo.html#basic-example-including-saving-to-disk

I did that, and it gave me an empty response.

I think you've set the context_window and max_new_tokens waaayy too big 😅

What model is this based off of?

i would try max_new_tokens=512 and context_window=3900 as a safe starting point

It's running on EM German 13b

By TheBloke, I think

When I set the context window to 3900, it complained that it can't go into the negatives
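A likely explanation for the "negatives" complaint, assuming max_new_tokens was still 4048 at that point: the output reservation alone was larger than the whole context window, leaving less than zero tokens for the prompt and retrieved text. The sketch below is a simplified model of that arithmetic, not the library's actual accounting (which also has to fit the prompt template), and `prompt_budget` is a hypothetical helper name.

```python
def prompt_budget(context_window, max_new_tokens):
    """Tokens left for the prompt and retrieved context after reserving the output."""
    return context_window - max_new_tokens


print(prompt_budget(8128, 4048))  # settings from the original post
print(prompt_budget(3900, 4048))  # context window lowered, output size unchanged
print(prompt_budget(3900, 512))   # both suggested values together
```

The second case comes out negative, which is why both values need to be changed together.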

lol wut

hmm

yee

Okay, I tried so many things. Nothing really works. Could it be an issue with my hardware? No idea.

sorry for the ping

May have found a solution

Just sadly takes long

My solution did in fact not work.
But in the file where the index is built, where the data is already in memory, it does answer.
I got this response lol:

Plain Text

DerProzess„Düngen“beinhaltetmehrereSubprozesse,dieengverzahntineinandergreifen:
•„Anbauplanen“istdieVerknüpfungeinerKulturartmiteinerBewirtschaftungs-
einheitineinembestimmtenZeitraum.DieKulturart,sowiedieMengeundArt
desErnteproduktesbestimmtimWesentlichendieAusbringungvonNährstoff-
mengen.
•„Bodenuntersuchungeingeben“beinhaltetdenTransfervon(teilflächenspezifischen)BodenuntersuchungsergebnissenodervonLandesbehördenzurVerfügung
gestellteNmin-GehalteindieAnwendung.
•„Düngebedarfermitteln“umfasstdieBerechnungderNährstoffmengeeinesErn-
teproduktes,diedenNährstoffbedarfnachAbzugsonstigerverfügbarerNähr-
stoffmengenundunterBerücksichtigungderteilflächenspezifischenNährstoffver-
sorgungdesBodensabdeckt.
•„Düngebedarfdecken“isteinteilflächenspezifischerOptimierungsansatzzur
DeckungdesDüngebedarfs(sieheAbb.1).
•„Düngungplanen“liefertalsErgebniseineernteprodukt-undteilflächenspezifischePlanungvonDüngemaßnahmeninkl.Applikationskarte.
InderAnwendungmüssendieProzessevomUserdurchlaufenundggf.umfehlende Informationenangereichertwerden.InAbb.1istderProzess„Düngebedarfdecken“
exemplarischdargestellt.DerUser(„Besitzer“)wirddabeivonderAnwendung(„Sys-
tem“)durchdenProzessbegleitet.WennalleInformationenausdenvorherigenProzes-
senkorrekterfasstunddieGeschäftsregelnkorrektformuliertsind,somussderAnwen-
derlediglicheinErgebnisbestätigen.
Abb.1:AusschnittausBPMNDiagrammfür„Düngebedarfdecken“

So no spaces. I still haven't got it to save and load properly, so I have to rebuild the DB every time I want to ask it something.

name='quickstart' id=UUID('0a6d6743-bbef-4f98-9897-6ff994a84a1b') metadata=None

Hmm thats the collection

The collection UUID is diff

Building DB 
name='quickstart' id=UUID('560f6a12-e64b-467b-9a5f-3b4393fa160f') metadata=None

Getting DB
name='quickstart' id=UUID('0a6d6743-bbef-4f98-9897-6ff994a84a1b') metadata=None

ngl I'm way out of the loop now -- what are you doing exactly?

I copied the code exactly as the guide tells me to. It still didn't work. But I tried comparing the collection UUIDs: one from the file where I saved the ChromaDB, and one from the file where I then tried to load the previously saved ChromaDB.
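Two different UUIDs for the same collection name suggest the two scripts may be opening different on-disk databases, in which case get_or_create_collection would quietly create a new, empty "quickstart" collection in the second one. One possible cause, and this is an assumption since the thread does not confirm it, is that the relative path "./chroma_db" resolves against a different working directory in each run. A minimal sketch of pinning the path to the script's own folder, with the hypothetical helper `chroma_dir_for` and made-up script names:

```python
from pathlib import Path


def chroma_dir_for(script_file, folder="chroma_db"):
    """Return an absolute database path located next to the given script."""
    return Path(script_file).resolve().parent / folder


# Hypothetical script names; both live in the same project folder.
build_path = chroma_dir_for("project/build_db.py")
query_path = chroma_dir_for("project/query_db.py")

print(build_path == query_path)
```

In a real script, str(chroma_dir_for(__file__)) could be passed as the path to chromadb.PersistentClient in both files, and chroma_collection.count() would show whether the opened collection actually holds any entries.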

Which code?

In the Llama Index Docs

The sample code

For chroma?

Loading a previous chromadb is ez pz

Plain Text

import chromadb
from llama_index import ServiceContext, VectorStoreIndex
from llama_index.storage.storage_context import StorageContext
from llama_index.vector_stores import ChromaVectorStore

# `documents` (the loaded documents) and `embed_model` must already be defined;
# use the same embedding model for building and for loading

# save to disk
db = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, service_context=service_context
)

# load from disk (same path and same collection name as above)
db2 = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = db2.get_or_create_collection("quickstart")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
index = VectorStoreIndex.from_vector_store(
    vector_store,
    service_context=service_context,
)

this works for me

I'll try it again tomorrow. I'm done with work for today.
