
Updated 4 months ago

Anyone tried the new alpha release yet?

Definitely open to comments on any of the changes made.

My favourite new feature is the new IngestionPipeline + cache

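# Imports for the snippet below (module paths as in the llama_index 0.9 docs;
# they may differ in other versions)
import qdrant_client
from llama_index import Document
from llama_index.embeddings import OpenAIEmbedding
from llama_index.extractors import TitleExtractor
from llama_index.ingestion import IngestionCache, IngestionPipeline
from llama_index.ingestion.cache import RedisCache
from llama_index.text_splitter import SentenceSplitter
from llama_index.vector_stores import QdrantVectorStore
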
client = qdrant_client.QdrantClient(location=":memory:")
vector_store = QdrantVectorStore(client=client, collection_name="test_store")

pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=25, chunk_overlap=0),
        TitleExtractor(),
        OpenAIEmbedding(),
    ],
    cache=IngestionCache(cache=RedisCache(), collection="test_cache"),
    vector_store=vector_store,
)

# Ingest directly into a vector db
pipeline.run(documents=[Document.example()])

# Create your index
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_vector_store(vector_store)
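
For anyone wondering what the cache buys you: as I understand it, the idea is that each transformation's output is stored under a key derived from its input plus the transformation itself, so re-running the same data skips the work. Below is a minimal stdlib-only sketch of that idea. It is not the library's actual implementation; `ToyPipeline`, `cache_key`, and the two toy transforms are made-up names for illustration.

```python
import hashlib
from typing import Callable, Dict, List

Transform = Callable[[List[str]], List[str]]


def cache_key(chunks: List[str], transform: Transform) -> str:
    """Hash the input chunks together with the transformation's name."""
    payload = transform.__name__ + "\x00" + "\x00".join(chunks)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyPipeline:
    """Hypothetical stand-in for an ingestion pipeline with a per-step cache."""

    def __init__(self, transformations: List[Transform]) -> None:
        self.transformations = transformations
        self.cache: Dict[str, List[str]] = {}
        self.cache_hits = 0

    def run(self, documents: List[str]) -> List[str]:
        chunks = list(documents)
        for transform in self.transformations:
            key = cache_key(chunks, transform)
            if key in self.cache:
                # Same input + same step seen before: reuse the stored output.
                self.cache_hits += 1
                chunks = list(self.cache[key])
            else:
                chunks = transform(chunks)
                self.cache[key] = list(chunks)
        return chunks


def split_words(chunks: List[str]) -> List[str]:
    """Toy splitter: one chunk per whitespace-separated word."""
    return [word for chunk in chunks for word in chunk.split()]


def uppercase(chunks: List[str]) -> List[str]:
    """Toy enrichment step: upper-case every chunk."""
    return [chunk.upper() for chunk in chunks]


pipeline = ToyPipeline([split_words, uppercase])
first = pipeline.run(["hello alpha release"])
second = pipeline.run(["hello alpha release"])  # both steps served from the cache
```

In the real pipeline the expensive steps are things like embedding calls, which is where skipping repeated work should matter most.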
19 comments
Does this mean that you can dynamically build an index as your application operates?
e.g. start with some set of documents, a user uploads more, and you run pipeline.run to add those documents to the vector store?
That was a big motivation for this -- separating ingestion from normal querying.

So now, you could technically deploy a microservice API just for ingestion
also the transformations thing is cool too -- makes it very clear what's happening to your data
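
To make the "ingestion separate from querying" point concrete, here is a rough stdlib-only sketch. The only thing the ingest side and the query side share is the store. `ToyStore`, `ingest`, and the word-overlap "search" are hypothetical stand-ins, not llama_index or any vector db API; a real setup would use embeddings and a proper vector store.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))


class ToyStore:
    """Hypothetical stand-in for a vector store, scored by word overlap."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, Set[str]]] = []

    def add(self, text: str) -> None:
        self.rows.append((text, tokenize(text)))

    def query(self, question: str) -> str:
        words = tokenize(question)
        # Return the stored text sharing the most words with the question.
        best = max(self.rows, key=lambda row: len(row[1] & words))
        return best[0]


def ingest(store: ToyStore, documents: List[str]) -> None:
    """Ingestion side: only writes to the store, knows nothing about queries."""
    for doc in documents:
        store.add(doc)


store = ToyStore()
ingest(store, ["Qdrant is a vector database."])
before = store.query("what does pgvector add to postgres")

# Later, a user uploads another document; the query side is untouched.
ingest(store, ["pgvector adds vector search to Postgres."])
after = store.query("what does pgvector add to postgres")
```

The query code never gets rebuilt or re-initialized here; it just sees the new document the next time it asks the store.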
this is HUGE!
This completely changes how my indexing functionalities work
Before, when people added documents I would create a separate index and then hook the agent up to that index by re-initializing the agent with that new index as a separate tool
now it can just query the same index!
yea as long as you are using some vector db backend (like qdrant, pinecone, etc.) this should work for that!
Oh it doesn't work for in-memory vector stores?
It should work for chroma and qdrant too. I suspect so for the base vector store as well 🤔
This is sick! Just wondering though, were you previously not able to run just the ingestion inside of something like an AWS Lambda function?
You could, but it wasn't exactly clear how to do this

This has clearer intentions, and is a lot more customizable
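
A rough sketch of what an ingestion-only Lambda-style handler could look like. This assumes an API Gateway proxy style event (JSON string under `"body"`); the `"texts"` payload field and `FakePipeline` are made up so the sketch runs without any external services. In a real function you would build the actual pipeline in place of `FakePipeline` and wrap each text in a `Document` first.

```python
import json
from typing import Any, Dict, List


class FakePipeline:
    """Hypothetical stand-in exposing the same run(documents=...) call."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def run(self, documents: List[str]) -> List[str]:
        self.seen.extend(documents)
        return list(documents)


# Built once at import time so warm invocations reuse it.
PIPELINE = FakePipeline()


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Ingest the texts in the request body and report how many were processed."""
    body = json.loads(event.get("body") or "{}")
    texts = body.get("texts", [])
    if not isinstance(texts, list) or not all(isinstance(t, str) for t in texts):
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "texts must be a list of strings"}),
        }
    nodes = PIPELINE.run(documents=texts)
    return {"statusCode": 200, "body": json.dumps({"ingested": len(nodes)})}


response = handler({"body": json.dumps({"texts": ["doc one", "doc two"]})})
bad_response = handler({"body": json.dumps({"texts": "not a list"})})
```

The handler only ever touches ingestion; querying can live in a completely different service that points at the same vector db.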
Heck yes, love that! Def gonna play with this today
let me know if you run into any issues or have questions! A full release + proper docs is coming later this week 🙂
Hey quick question, off the top of your head - would this work with pgvector/supabase?
It would! Or at least it should lol
Lol sick, I'll try this out later today! I'm not married to pgvector but figured I'd ask before going that route