MilkMilker
Joined September 25, 2024
Hey, hope everyone is doing well!
I have a question, or rather an issue. I have implemented a pipeline for uploading documents with embeddings to Azure Cognitive Search using the CognitiveSearchVectorStore class and IngestionPipeline. Everything goes well, but the search results using the default query engine (as_query_engine) are absolutely terrible. It can't answer simple questions like who the CEO of the company I'm working for is (even though there are a lot of documents mentioning who it is). Also, when searching using the Azure portal Search explorer, the results are perfect.

I've used the setup from the docs' IngestionPipeline example.

....
azure_vector_store = CognitiveSearchVectorStore(
    search_or_index_client=azure_index_client,
    index_name=azure_index_name,
    filterable_metadata_field_keys=metadata_fields,
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="content",
    embedding_field_key="embedding",
    metadata_string_field_key="li_jsonMetadata",
    doc_id_field_key="li_doc_id",
    embedding_dimensionality=embedding_dimensionality,
)
....
text_splitter = SentenceSplitter(
    separator=" ",
    chunk_size=1000,
    chunk_overlap=50,
    tokenizer=tiktoken.encoding_for_model(llm_model).encode,
    include_metadata=True
)
pipeline = IngestionPipeline(
    transformations=[
        text_splitter,
        CleanHTMLTransform(),
        embed_model
    ],
    cache=cache,
)

pipelined_nodes = pipeline.run(
    documents=documents,
    in_place=True,
    show_progress=True,
)
azure_vector_store.add(pipelined_nodes)
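
One way to narrow down where answers like this go wrong is to look at what the retriever returns before the LLM ever sees it. The helper below is only a sketch and not part of the setup above: `show_retrieved` is a hypothetical name, and it assumes a retriever object such as the one returned by `index.as_retriever(similarity_top_k=5)` on a `VectorStoreIndex` built from the vector store, whose results expose `.score` and `.node.get_content()`.

```python
def show_retrieved(retriever, question, max_chars=200):
    """Return (score, text preview) pairs for the chunks a retriever finds.

    Hypothetical debugging helper: `retriever` is assumed to expose
    .retrieve(str) and to return items with .score and .node.get_content(),
    as llama-index retrievers do.
    """
    rows = []
    for result in retriever.retrieve(question):
        # Truncate the chunk text so the output stays readable.
        preview = result.node.get_content()[:max_chars]
        rows.append((result.score, preview))
    return rows


# Example usage (assumes `index` already exists):
# for score, preview in show_retrieved(index.as_retriever(similarity_top_k=5),
#                                      "Who is the CEO?"):
#     print(score, preview)
```

If the chunks listed here do not mention the CEO at all, the problem is likely on the retrieval side (query mode, embeddings, chunking) rather than in the LLM's answer synthesis.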
17 comments
Hey, is it possible to delete/remove/disable nltk? I updated to llama-index 0.10.26 and my deployments are failing because of "Resource stopwords not found.". It used to work fine before the update and afaik I'm not using any nltk functionality, at least not directly. Does anyone know a fix for this?
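
A possible workaround for this kind of error, sketched here as an assumption rather than a confirmed fix for 0.10.26, is to make sure the stopwords corpus is present before the app starts (for example in a build step or at startup). `ensure_nltk_resource` is a hypothetical helper; it relies only on `nltk.data.find` and `nltk.download`, and the `find`/`download` parameters exist just so the logic can be exercised without network access.

```python
def ensure_nltk_resource(name="stopwords", category="corpora",
                         find=None, download=None):
    """Download an NLTK resource only if it is missing.

    Returns True if the resource was already available, False if a
    download had to be triggered. Hypothetical helper, not a llama-index API.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without nltk
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is absent.
        find(f"{category}/{name}")
        return True
    except LookupError:
        download(name)
        return False


# Example usage at build time or application startup:
# ensure_nltk_resource("stopwords")
```

In deployments with a read-only or ephemeral filesystem, the download location may also need to be writable and persistent; NLTK reads the `NLTK_DATA` environment variable to find its data directory.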
5 comments