I am trying to implement a sentence-window retriever. In the example, the indexing of the document and the retriever are created in the same instance, so the output of the indexer is sentence_index and it is passed directly to the retriever. In my use case, however, I would like to create the sentence_index in one instance and the retriever in another one. How can I import the sentence_index for the vector_store?
I built a script to insert chunks into my vector store with LlamaIndex. I would like to use an external API for the embeddings. How can I do that with ServiceContext?
But when I query my Qdrant client, it appears that the collection "RAG_llama_index_small_to_big" doesn't exist. Should I create it before trying to insert, or does QdrantVectorStore create it if it doesn't exist yet?
Before building the VectorStoreIndex with the Qdrant client, should I create the collection, or does QdrantVectorStore create the collection if it doesn't already exist in the Qdrant cluster?
@kapa.ai I use a sentence-window retriever in this method:

```python
def retrieve(self, query):
    return self.query_engine.retrieve(query)
```

It uses a Qdrant vector store. I would like to apply some filters based on the metadata. How can I do that?
There is something I don't understand: the metadata are not embedded, so they shouldn't impact the split process. However, when I try to implement a small-to-big retriever with a small chunk size, I get this error message:
"Metadata length (130) is longer than chunk size (128). Consider increasing the chunk size or decreasing the size of your metadata to avoid this."
Can you explain why this happens, and how to make the metadata not affect the splitting process?
Here is the piece of code I use:

```python
sub_chunk_sizes = [128, 256, 512]
sub_node_parsers = [
    SentenceSplitter.from_defaults(chunk_size=c, chunk_overlap=20)
    for c in sub_chunk_sizes
]

all_nodes = []
for base_node in tqdm(base_nodes):
    for n in sub_node_parsers:
        sub_nodes = n.get_nodes_from_documents([base_node])
```