
Hi everyone, I'm new to LlamaIndex. I'm trying to run the IngestionPipeline code here: https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/. The only thing I changed is swapping the OpenAIEmbedding model for the HuggingFaceEmbedding model. However, the index instantiation failed (empty index). I checked the nodes in the vector store with vector_store.get_nodes(), and the nodes' embeddings are all None. What could be the issue? I tried searching for this on GitHub and Google but no luck so far; any help is appreciated!
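(For reference, this is roughly the check being described, as a sketch; vector_store is the QdrantVectorStore set up in the code posted below.)

Plain Text
nodes = vector_store.get_nodes()
print(len(nodes))                      # the nodes themselves come back from the store
print([n.embedding for n in nodes])    # but every embedding prints as None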
What's the actual code that you ended up running?
This is the code

Plain Text
from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.extractors import TitleExtractor
from llama_index.core.ingestion import IngestionPipeline
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.ollama import Ollama

import qdrant_client

model_name = 'BAAI/bge-base-en-v1.5'
embed_model = HuggingFaceEmbedding(
    model_name=model_name, trust_remote_code=True)

documents = SimpleDirectoryReader("data").load_data()

Settings.embed_model = embed_model
Settings.llm = Ollama(model="llama3.1", request_timeout=360.0)

client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="test_store")

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=25, chunk_overlap=0),
        # TitleExtractor(),
        embed_model,
    ],
    vector_store=vector_store,
)

# Ingest directly into a vector db
pipeline.run(documents=documents)

nodes = vector_store.get_nodes()

# Create your index
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(vector_store)
Seems to work fine for me
https://colab.research.google.com/drive/1eftieUGM2jiL7If13HtGoaBCOze5D14w?usp=sharing

The index is definitely not empty, and retrieval retrieves nodes just fine
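A quick way to run that same check, as a sketch (it reuses the vector_store and Settings from the code above; the query string is just a placeholder):

Plain Text
index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=3)
results = retriever.retrieve("What is this document about?")
print(len(results))                   # > 0 if ingestion worked
print(results[0].node.get_content())  # inspect one retrieved chunk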
Thanks @Logan M. Is there a way I can retrieve all the document embeddings from the index?
I think you'd have to use the underlying qdrant client
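For example (a rough sketch, assuming the in-memory client and the "test_store" collection from the code above; depending on how the collection was created, each point's vector may be a plain list or a dict of named vectors):

Plain Text
points, _next_offset = client.scroll(
    collection_name="test_store",
    with_vectors=True,   # ask Qdrant to include the stored embedding for each point
    with_payload=True,
    limit=100,
)
for point in points:
    print(point.id, point.vector)  # the stored embedding for this node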
Thank you! I'll look into it!