Updated 2 months ago

Qdrant

I'm not sure if this is the right place to ask, but when I run this code with NOTES holding 2500 documents and then look at vectors_count in Qdrant, it is only around 1300. I would assume that if I add 2500 docs, then after chunking I would have at least 2500 vectors?

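import qdrant_client
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Connect to a local Qdrant instance and point the store at the NOTES collection
client = qdrant_client.QdrantClient(url="http://localhost:6333")
vector_store = QdrantVectorStore(client=client, collection_name="NOTES")

# Split each document into chunks, embed them, and write the vectors to Qdrant
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=512, chunk_overlap=0),
        HuggingFaceEmbedding(model_name='XXXXX'),
    ],
    vector_store=vector_store,
)

# NOTES is the list of documents (about 2500 of them)
pipeline.run(documents=NOTES)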
3 comments
True, it should at least show a count of 2500 👀
Found out some of the docs had empty text, which explains part of the difference.
Yes! It skips empty docs, since otherwise the embedding model kind of explodes lol
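For anyone hitting the same mismatch, a quick way to see how many docs would be skipped is to count the empty ones before running the pipeline. This is only a rough sketch: `Doc` is a hypothetical stand-in for whatever document class you use (it only assumes a `text` attribute), `split_empty` is a made-up helper name, and treating whitespace-only text as empty is an assumption, not confirmed pipeline behavior.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a document object exposing a `text` attribute."""
    text: str


def split_empty(docs):
    """Return (non_empty, empty) lists of docs.

    Assumption: whitespace-only text is counted as empty here.
    """
    non_empty, empty = [], []
    for d in docs:
        if d.text and d.text.strip():
            non_empty.append(d)
        else:
            empty.append(d)
    return non_empty, empty


# Small made-up sample: two real notes, one empty, one whitespace-only
sample = [Doc("first note"), Doc(""), Doc("   \n"), Doc("second note")]
kept, skipped = split_empty(sample)
print(f"{len(kept)} non-empty, {len(skipped)} empty")
```

Running the same check on NOTES would show how much of the gap between 2500 and roughly 1300 comes from empty docs. It may not account for all of it, since the thread only says empty docs explain part of the reason.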