kcramp

im following the "Retriever Query Engine

im following the "Retriever Query Engine with Custom Retrievers - Simple Hybrid Search" tutorial with my data
it isnt working because I dont have any nodes, all of my data is already in the vector store.

i assume you need to have nodes because in the example they use nodes for SimpleKeywordTableIndex. when I use the index instead of nodes, I get an error.

here is the doc code:

from llama_index.core import SimpleKeywordTableIndex, VectorStoreIndex

vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
keyword_index = SimpleKeywordTableIndex(nodes, storage_context=storage_context)

so.. how do I get my nodes back out of the index after it is already done?
i am using weaviate.
if I do index.docstore.docs.values()., I get an empty dictionary

i dont know what to give simpleKeywordTableIndex because I am only getting the index via
"vector_store = WeaviateVectorStore(
weaviate_client=client, index_name="LatestMetadataChunk512Docstore"
)

loaded_index = VectorStoreIndex.from_vector_store(vector_store)"

how am I supposed to get the nodes? code is here:
-----
from llama_index.core import StorageContext
from llama_index.core import SimpleKeywordTableIndex, VectorStoreIndex

initialize storage context (by default it's in-memory)

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(loaded_nodes)

vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
keyword_index = SimpleKeywordTableIndex(nodes, storage_context=storage_context)
-----

i re-ran my ingestion pipeline and included the insert into docstore code, thinking i could get the nodes out if I could put them in a docstore, but ater I reran the script I dont know how to access that docstore now, i assumed that index.docstore.docs.values() would not be an empty dictionary anymore after that, but it still is.

1 comment

kkcramp

but now I get TypeError:

but now I get TypeError: AsyncCompletions.create() got an unexpected keyword argument 'tools'.. so we can't use tools with openAIlike?

2 comments

kkcramp

I was using ollama to do the property

I was using ollama to do the property graph construction, but it was too slow. I got an exl2 quant, installed exllama, and installed tabbyAPI. when I try to use llm = OpenAILike(model="text2cypher-codestral-exl2-4.0bpw", api_base="http://127.0.0.1:5000/v1/", api_key="fake", temperature=0.1, top_p=0.1, top_k=40, repetition_penalty=1.18), it only passes the temperature, nothing else.
also, it seems that the model runs forever when using this (if I say what is your name, it will start asking itself multiple questions after that; I assume this is a settings issue?? I did not have this problem with ollama).

when I try to run

kg_extractor = SchemaLLMPathExtractor(
llm=llm,
possible_entities=entities,
possible_relations=relations,
kg_validation_schema=validation_schema,
strict=True,
max_triplets_per_chunk=5,
num_workers=1,
)

index = PropertyGraphIndex.from_documents(
documents,
embed_model=embed_model,
kg_extractors=[kg_extractor],
property_graph_store=graph_store,
show_progress=True,
)
it just runs forever and no api calls are made.

anyone have any experience with this openailike endpoint?

1 comment

kkcramp

So.. it seems there is no way to filter

So.. it seems there is no way to filter nodes down by strings, because all of the metadata search examples do not include string search (e.g. your metadata is an email subject and you want to find one string in the subject). Is that correct? I have tried every metadata approach and have been unsuccessful.

Is there some sort of approach where we can first use keyword search (e.g. a PO number, a string, etc) to narrow down the available nodes and THEN do the vector search? Am I just stupid?

8 comments

Find answers from the community

im following the "Retriever Query Engine

initialize storage context (by default it's in-memory)

but now I get TypeError:

I was using ollama to do the property

So.. it seems there is no way to filter