tatanfort
Joined September 25, 2024
I am trying to implement a sentence window retriever.
In the example, the document index and the retriever are created in the same instance, so the indexer's output, sentence_index, is passed directly to the retriever.
However, in my use case, I would like to create sentence_index in one instance and the retriever in another.
How can I import the sentence_index from the vector_store?
4 comments
I built a script that inserts chunks into my vector store with LlamaIndex.
I would like to use an external API for the embeddings. How can I do that with ServiceContext?
8 comments
Please explain this error:
nodes_retrieved = retriever.retrieve(
AttributeError: type object 'retriever' has no attribute 'retrieve'
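For context, the wording "type object 'retriever'" indicates that the name retriever was bound to a class rather than to an instance when retrieve was looked up. The sketch below reproduces the message with a hypothetical stand-in class; it is not a LlamaIndex class, and the constructor argument engine is an assumption made for illustration.

```python
class retriever:  # hypothetical class; the lowercase name mirrors the error message
    def __init__(self, engine):
        # `retrieve` is attached to each instance here,
        # so the class object itself has no such attribute.
        self.retrieve = lambda query: engine(query)


def call_on_class():
    """Look up `retrieve` on the class itself and return the error text."""
    try:
        retriever.retrieve("hello")
    except AttributeError as exc:
        return str(exc)
    return None


# Calling on the class fails with the same wording as in the post.
message = call_on_class()

# Calling on an instance works, because __init__ has run.
instance = retriever(engine=lambda q: [q.upper()])
result = instance.retrieve("hello")
```

If this is the cause, the fix would be to build the object first and call retrieve on that object instead of on the class.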
4 comments


I am trying to insert documents into a Qdrant collection. The collection doesn't exist yet in my Qdrant cluster.
I use this code:

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embedding_model)
vector_store = QdrantVectorStore(client=client, collection_name="RAG_llama_index_small_to_big")

But when I query my Qdrant client, it appears that the collection "RAG_llama_index_small_to_big" doesn't exist.
Should I create it before trying to insert, or does QdrantVectorStore create it if it doesn't exist yet?
3 comments
Before building the VectorStoreIndex with the Qdrant client, should I create the collection, or does QdrantVectorStore create the collection if it doesn't already exist in the Qdrant cluster?
3 comments
@kapa.ai I created a retriever with some filters:

filter_retriever.filters
MetadataFilters(filters=[MetadataFilter(key='file_root', value='Code militaire', operator=<FilterOperator.EQ: '=='>)])

I then try to retrieve with my query:
filter_retriever.retrieve(query)

However, the results contain entries with the metadata 'file_root': 'Code de justice administrative', which doesn't respect the filter.

Why is my retriever not filtering properly on the metadata?
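As a debugging aid, the mismatch described above can be checked mechanically before looking at the retriever itself. The sketch below uses plain dictionaries as hypothetical stand-ins for retrieved results; find_violations is an illustrative helper, not part of LlamaIndex or Qdrant, and the sample texts are invented.

```python
from typing import Any, Dict, List


def find_violations(
    results: List[Dict[str, Any]], key: str, value: Any
) -> List[Dict[str, Any]]:
    """Return the results whose metadata does not satisfy `key == value`."""
    return [r for r in results if r.get("metadata", {}).get(key) != value]


# Hypothetical retrieved results, shaped as dicts for illustration only.
retrieved = [
    {"text": "first result", "metadata": {"file_root": "Code militaire"}},
    {"text": "second result", "metadata": {"file_root": "Code de justice administrative"}},
]

# Every entry listed here slipped past the equality filter on file_root.
violations = find_violations(retrieved, key="file_root", value="Code militaire")
```

A non-empty violations list confirms that the filter was not applied at query time, rather than the stored metadata being different from what was expected.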
10 comments
@kapa.ai I use a sentence-window retriever in this method:
def retrieve(self, query):
    return self.query_engine.retrieve(query)
It uses a Qdrant vector store. I would like to apply some filters based on the metadata. How can I do that?
11 comments
@kapa.ai How can I implement auto-retrieval with Qdrant? The current retriever uses a sentence_window_query_engine.
7 comments
There is something I don't understand:
The metadata is not embedded, so it shouldn't affect the splitting process. However, I'm trying to implement a small-to-big retriever, and with a small chunk size I get this error message:

"Metadata length (130) is longer than chunk size (128). Consider increasing the chunk size or decreasing the size of your metadata to avoid this."

Can you explain why this happens, and how to keep the metadata from affecting the splitting process?

Here is the piece of code I use:

sub_chunk_sizes = [128, 256, 512]
sub_node_parsers = [
    SentenceSplitter.from_defaults(chunk_size=c, chunk_overlap=20) for c in sub_chunk_sizes
]

all_nodes = []
for base_node in tqdm(base_nodes):
    for n in sub_node_parsers:
        sub_nodes = n.get_nodes_from_documents([base_node])
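The wording of the error suggests that the splitter reserves room for the metadata inside each chunk, so the space left for text is the chunk size minus the metadata length. The sketch below is a simplified model of that arithmetic under this assumption, not the library's code; effective_chunk_size is a hypothetical helper, and 130 is the metadata length reported in the message.

```python
def effective_chunk_size(chunk_size: int, metadata_tokens: int) -> int:
    """Tokens left for text once the metadata has been accounted for."""
    remaining = chunk_size - metadata_tokens
    if remaining <= 0:
        raise ValueError(
            f"Metadata length ({metadata_tokens}) is longer than "
            f"chunk size ({chunk_size})."
        )
    return remaining


# Apply the model to the three chunk sizes used in the post.
budgets = {}
for size in [128, 256, 512]:
    try:
        budgets[size] = effective_chunk_size(size, metadata_tokens=130)
    except ValueError as exc:
        budgets[size] = str(exc)
```

Under this model, only the 128-token parser fails, while the 256 and 512 parsers keep 126 and 382 tokens for text, which matches the error appearing only at the smallest chunk size.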
24 comments