
Updated 4 months ago

Node Ingestion Issue

Hey everyone!
I am using this doc to create and send vectors to my Redis DB: https://docs.llamaindex.ai/en/stable/examples/ingestion/redis_ingestion_pipeline.html#redis-ingestion-pipeline

However, even though 4 documents are correctly detected, it seems that none of them are ingested. Am I doing something wrong? Thanks in advance.

PS: Redis should be correctly configured, as I can freely set and get keys as well as ping my Redis connection.

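# Imports follow the linked example (legacy llama_index package layout;
# newer releases moved these modules)
from llama_index import SimpleDirectoryReader
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.ingestion import (
    DocstoreStrategy,
    IngestionCache,
    IngestionPipeline,
)
from llama_index.ingestion.cache import RedisCache
from llama_index.storage.docstore import RedisDocumentStore
from llama_index.text_splitter import SentenceSplitter
from llama_index.vector_stores import RedisVectorStore

# redis_pwd_host_nms and redis_port are defined elsewhere in my code

# Embedding model
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5", device="cpu")

# Configure ingestion pipeline
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        embed_model,
    ],
    docstore=RedisDocumentStore.from_host_and_port(
        redis_pwd_host_nms, redis_port
    ),
    vector_store=RedisVectorStore(
        index_name="redis_vector_store",
        index_prefix="vector_store",
        redis_url=f"redis://{redis_pwd_host_nms}:{redis_port}",
    ),
    cache=IngestionCache(
        cache=RedisCache.from_host_and_port(redis_pwd_host_nms, redis_port),
        collection="redis_cache",
    ),
    docstore_strategy=DocstoreStrategy.UPSERTS,
)

# Reports 0 nodes ingested even though 4 documents were loaded
def get_nodes():
    # Load data as documents
    documents = SimpleDirectoryReader("./light_processed_data", filename_as_id=True).load_data()
    nodes = pipeline.run(documents=documents)
    return f"Ingested {len(nodes)} Nodes but had {len(documents)} documents"
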
 
3 comments
@Logan M Do you have any pointers on what I could do to solve my problem, please? 🙏
How do you know they aren't ingested? They will only be ingested once; if they haven't changed since then, they are skipped on later runs.
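To illustrate the skip-if-unchanged behaviour described in this reply, here is a minimal pure-Python sketch. It is not LlamaIndex's actual implementation: the `upsert_filter` helper and the in-memory dict standing in for the docstore are hypothetical, and the sketch only models the idea of comparing a stored content hash per document ID.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a stable hash of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def upsert_filter(docstore: dict, documents: dict) -> list:
    """Return IDs of documents that are new or changed, and record their hashes.

    docstore:  maps doc_id -> hash stored on a previous run (stand-in for Redis)
    documents: maps doc_id -> current text
    """
    to_process = []
    for doc_id, text in documents.items():
        new_hash = content_hash(text)
        if docstore.get(doc_id) != new_hash:
            docstore[doc_id] = new_hash
            to_process.append(doc_id)
    return to_process


docstore = {}
docs = {f"file_{i}.txt": f"contents {i}" for i in range(4)}

first_run = upsert_filter(docstore, docs)    # all four documents are new
second_run = upsert_filter(docstore, docs)   # nothing changed, so all are skipped

docs["file_0.txt"] = "edited contents"
third_run = upsert_filter(docstore, docs)    # only the edited document is processed

print(len(first_run), len(second_run), len(third_run))
```

In this simplified model, a second run over the same four files finds nothing to process, which would look just like "0 nodes ingested but 4 documents loaded".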
Ah, I think I found the source of my problem: I had already stored the text contents of my files in my Redis DB, using their file names as the keys.
Since Redis seems to be strict about duplicates, when I tried to store the embedded versions of the same files with 'filename_as_id=True', they were automatically rejected because keys with the same names were already present.
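A minimal sketch of that kind of key clash and one way to avoid it, assuming (as a simplification) that the document IDs are exactly the file names. Plain Python sets stand in for the keys in Redis; the `raw_text` prefix and the `namespaced` helper are hypothetical names chosen for illustration, not part of LlamaIndex or Redis.

```python
def namespaced(prefix: str, name: str) -> str:
    """Build a key such as 'raw_text:a.txt' from a prefix and a file name."""
    return f"{prefix}:{name}"


file_names = ["a.txt", "b.txt", "c.txt", "d.txt"]

# Keys already present: raw file contents stored under the bare file names.
existing_keys = set(file_names)

# With file-name-based IDs, the document IDs overlap with those keys.
doc_ids = set(file_names)
clashes = existing_keys & doc_ids

# Storing the raw text under a prefixed key removes the overlap.
prefixed_keys = {namespaced("raw_text", name) for name in file_names}
clashes_after = prefixed_keys & doc_ids

print(len(clashes), len(clashes_after))
```

Keeping unrelated data under its own key prefix (or in a separate Redis database) is one way to rule out this kind of overlap before re-running the pipeline.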