Hey, hope everyone is doing well!
I have a question, or rather an issue. I have implemented a pipeline for uploading documents with embeddings to Azure Cognitive Search using the CognitiveSearchVectorStore class and IngestionPipeline. Everything goes well, but the search results from the default query engine (as_query_engine) are absolutely terrible. It can't answer simple questions like who the CEO of the company I'm working for is (even though a lot of documents mention who it is). Yet when I search with the Azure portal Search explorer, the results are perfect.
I've used the setup from the docs example for the IngestionPipeline.
....
azure_vector_store = CognitiveSearchVectorStore(
    search_or_index_client=azure_index_client,
    index_name=azure_index_name,
    filterable_metadata_field_keys=metadata_fields,
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="content",
    embedding_field_key="embedding",
    metadata_string_field_key="li_jsonMetadata",
    doc_id_field_key="li_doc_id",
    embedding_dimensionality=embedding_dimensionality,
)
....
text_splitter = SentenceSplitter(
    separator=" ",
    chunk_size=1000,
    chunk_overlap=50,
    tokenizer=tiktoken.encoding_for_model(llm_model).encode,
    include_metadata=True,
)
pipeline = IngestionPipeline(
    transformations=[
        text_splitter,
        CleanHTMLTransform(),
        embed_model,
    ],
    cache=cache,
)
pipelined_nodes = pipeline.run(
    documents=documents,
    in_place=True,
    show_progress=True,
)
azure_vector_store.add(pipelined_nodes)
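One detail of the setup above that may or may not matter: the splitter runs before the HTML cleaning step, so chunk boundaries are chosen on the raw markup. Below is a minimal, standard-library-only sketch of what that ordering can do to chunks. Everything in it is a stand-in and an assumption: strip_html is a hypothetical substitute for CleanHTMLTransform (whose implementation is not shown here), split_words is a crude word-count splitter rather than SentenceSplitter, and the sample document is invented.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text):
    # Hypothetical stand-in for CleanHTMLTransform: drop tags, keep text,
    # and collapse runs of whitespace.
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()
    return " ".join("".join(parser.parts).split())


def split_words(text, chunk_size):
    # Hypothetical stand-in for SentenceSplitter: fixed-size chunks of
    # whitespace-separated words, with no overlap.
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def split_then_clean(doc, chunk_size):
    # Same order as the pipeline above: split the raw markup, then clean each chunk.
    return [strip_html(chunk) for chunk in split_words(doc, chunk_size)]


def clean_then_split(doc, chunk_size):
    # Alternative order: clean the whole document first, then split.
    return split_words(strip_html(doc), chunk_size)


# Invented sample document, for illustration only.
doc = '<div class="bio"> <p> Jane Doe is the CEO </p> <p> of Example Corp </p> </div>'

print(split_then_clean(doc, 5))
print(clean_then_split(doc, 5))
```

In this toy case, splitting first lets the markup use up part of each chunk's size budget, so the sentence naming the CEO ends up spread across more and smaller chunks than when cleaning runs first. Whether the real pipeline behaves this way depends on what the documents and CleanHTMLTransform actually look like.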