Hi guys, one question.
I have these pipelines, but every run inserts the same data again, so after 3 runs with retrieval top_k = 3 all three retrieved results are the same text.
Why does this happen?
pipelines = {
    "QA": IngestionPipeline(
        transformations=[
            SentenceSplitter(paragraph_separator="\n\n\n", chunk_size=300, chunk_overlap=20),
            TitleExtractor(),
            OpenAIEmbedding(model="text-embedding-3-large"),
        ],
        vector_store=self.vector_store,
        cache=IngestionCache(),
    ),
    "Klubista": IngestionPipeline(
        transformations=[
            SentenceSplitter(chunk_size=400, chunk_overlap=50),
            TitleExtractor(),
            OpenAIEmbedding(model="text-embedding-3-large"),
        ],
        vector_store=self.vector_store,
        cache=IngestionCache(),
    ),
    "PrevadzkovyPoriadok": IngestionPipeline(
        transformations=[
            SentenceSplitter(chunk_size=400, chunk_overlap=50),
            TitleExtractor(),
            OpenAIEmbedding(model="text-embedding-3-large"),
        ],
        vector_store=self.vector_store,
        cache=IngestionCache(),
    ),
    "OtherDocs": IngestionPipeline(
        transformations=[
            SentenceSplitter(chunk_size=400, chunk_overlap=50),
            TitleExtractor(),
            OpenAIEmbedding(model="text-embedding-3-large"),
        ],
        vector_store=self.vector_store,
        cache=IngestionCache(),
    ),
}
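
To make the symptom concrete, here is a minimal toy sketch in plain Python. It is not LlamaIndex code: `ToyVectorStore`, `ingest`, and the sample texts are hypothetical names made up for illustration. It assumes the vector store simply appends whatever it receives, so re-running ingestion on the same documents stores them again, and a top_k = 3 query then returns three copies of the best match. The second half shows one possible way around it, assuming you track content hashes of what was already ingested.

```python
import hashlib


class ToyVectorStore:
    """Hypothetical stand-in for a vector store: it appends whatever it is given."""

    def __init__(self):
        self.rows = []

    def add(self, text):
        self.rows.append(text)

    def top_k(self, query, k):
        # Toy "similarity": count of query words found in the chunk.
        words = query.lower().split()
        ranked = sorted(self.rows, key=lambda c: -sum(w in c.lower() for w in words))
        return ranked[:k]


def ingest(texts, store, seen_hashes=None):
    """Insert texts; if a set of seen hashes is passed, skip content already stored."""
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if seen_hashes is not None:
            if digest in seen_hashes:
                continue  # already ingested in an earlier run
            seen_hashes.add(digest)
        store.add(text)


docs = [
    "Opening hours are 9 to 17.",
    "Members get a discount.",
    "Pets are not allowed.",
]

# Three runs without any bookkeeping: every run appends the same chunks again.
naive = ToyVectorStore()
for _ in range(3):
    ingest(docs, naive)

# Three runs with a persistent set of content hashes: repeats are skipped.
deduped = ToyVectorStore()
seen = set()
for _ in range(3):
    ingest(docs, deduped, seen_hashes=seen)

print(len(naive.rows), naive.top_k("opening hours", 3))
print(len(deduped.rows), deduped.top_k("opening hours", 3))
```

As far as I know, the LlamaIndex counterpart of the `seen` set is the `docstore` argument of `IngestionPipeline`, which enables document management keyed on document ids and hashes, while `IngestionCache` only caches transformation results and does not stop nodes from being inserted into the vector store again. Treat that as an assumption to check against the docs for your installed version.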