Find answers from the community

Hi, I'm having an issue with SummaryExtractor returning node summaries attached to the incorrect nodes. Which node a summary is misattached to seems to be related to the num_workers value, but even with 1 worker I still get misplaced summaries. My code is:
Plain Text
    pipeline = IngestionPipeline(
        transformations=[
            MyCustomNodeTransformer(),
            SummaryExtractor(
                llm=llm,
                metadata_mode=MetadataMode.NONE,
                prompt_template=SUMMARY_EXTRACT_TEMPLATE,
                num_workers=1,
                summaries=["self"],
            ),
            Settings.embed_model,
        ]
    )

    nodes = pipeline.run(
        show_progress=True,
        nodes=nodes,
        num_workers=1,
    )
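One way to confirm the mismatch is to print each node's own text next to the summary that ended up in its metadata; this is a debugging sketch only, assuming SummaryExtractor's default section_summary metadata key for summaries=["self"].
Python
# Debugging sketch: list each node's text beside the summary attached to it,
# so misplaced summaries are easy to spot by eye.
for node in nodes:
    summary = node.metadata.get("section_summary", "<no summary>")
    print(f"node_id={node.node_id}")
    print(f"  text   : {node.get_content()[:80]!r}")
    print(f"  summary: {summary[:80]!r}")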
2 comments
Hi, I notice a big slowdown in LlamaIndex depending on the version of llama-cpp-python I have installed (older versions are much faster, about 100 tokens per second versus about 30). llama-cpp-python==0.2.20 is the last fast version with the latest LlamaIndex (3090 on Ubuntu, running Mistral 7B). I believe it has to do with the KV cache and is solved by the suggestions in this GitHub issue: https://github.com/abetlen/llama-cpp-python/issues/1054 . How do we pass the required offload_kqv=True through LlamaIndex to regain fast inference? Is this a regression in LlamaIndex or something users should handle themselves?
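For what it's worth, LlamaCPP in LlamaIndex forwards model_kwargs to the underlying llama_cpp.Llama, so one way to pass the flag from the linked issue is sketched below; the model path is a placeholder and the flag only takes effect on llama-cpp-python versions that support it.
Python
from llama_index.llms import LlamaCPP  # 0.9.x import path

# Sketch: model_kwargs are handed to llama_cpp.Llama, so offload_kqv (and the
# GPU layer count) can be set here.
llm = LlamaCPP(
    model_path="/path/to/mistral-7b.Q4_K_M.gguf",  # placeholder path
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={
        "n_gpu_layers": -1,    # offload all layers to the 3090
        "offload_kqv": True,   # keep the KV cache on the GPU (see issue #1054)
    },
)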
7 comments
Metadata

This might be a bug in llama-index, or I'm not understanding how to properly use the new IngestionPipeline transformations. My nodes carry a lot of metadata for logging and post-processing tasks; if that metadata gets included in a transformation, it exceeds the 3900-token limit set in the LlamaCpp configs, so I need to exclude it from transformations that rely on the LLM. I'm trying to use SummaryExtractor(), which I have set to use the Mistral 7B model, but nothing I try ever excludes the metadata from what goes to Mistral 7B under SummaryExtractor(). My code (a bit duplicative for extra certainty) looks like this:
Plain Text
pipeline = IngestionPipeline(
    transformations=[
        CustomTransformation(),
        SummaryExtractor(
            llm=llm,
            excluded_embed_metadata_keys=[
                DEFAULT_WINDOW_METADATA_KEY,
                DEFAULT_OG_TEXT_METADATA_KEY,
            ],
            excluded_llm_metadata_keys=[
                DEFAULT_WINDOW_METADATA_KEY,
                DEFAULT_OG_TEXT_METADATA_KEY,
            ],
        ),
        service_context.embed_model,
    ]
)

excluded_embed_metadata_keys = [
    DEFAULT_WINDOW_METADATA_KEY,
    DEFAULT_OG_TEXT_METADATA_KEY,
]

excluded_llm_metadata_keys = [
    DEFAULT_WINDOW_METADATA_KEY,
    DEFAULT_OG_TEXT_METADATA_KEY,
]

nodes = pipeline.run(
    nodes=nodes,
    excluded_embed_metadata_keys=excluded_embed_metadata_keys,
    excluded_llm_metadata_keys=excluded_llm_metadata_keys,
)
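For context on this one: the excluded_*_metadata_keys are attributes of the nodes themselves rather than keyword arguments of SummaryExtractor or pipeline.run(). A sketch of setting them before the pipeline runs is below; the string values are assumed to match the DEFAULT_WINDOW_METADATA_KEY / DEFAULT_OG_TEXT_METADATA_KEY constants from the snippet above.
Python
# Sketch: set the exclusion lists on each node before the pipeline runs, so
# transformations that read content in MetadataMode.LLM / MetadataMode.EMBED
# drop these keys from the prompt.
EXCLUDED_KEYS = ["window", "original_text"]  # assumed DEFAULT_*_METADATA_KEY values

for node in nodes:
    node.excluded_llm_metadata_keys = EXCLUDED_KEYS
    node.excluded_embed_metadata_keys = EXCLUDED_KEYS

nodes = pipeline.run(nodes=nodes)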
2 comments
Hi, I'm trying out the new features in version 0.9. How do I pass the service context correctly to the transformations pipeline so that TitleExtractor() uses a local LLM rather than OpenAI? My code is:
Plain Text
    service_context = ServiceContext.from_defaults(embed_model=embed_model, llm=llm)
    pipeline = IngestionPipeline(
        service_context=service_context,
        transformations=[
            SentenceSplitter(),
            TitleExtractor(),
        ]
    )
The error I get is Could not load OpenAI model. even though my llm is defined as Llama 2.
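One workaround that sidesteps the service-context question is to hand the local LLM to the extractor directly; a minimal sketch for 0.9.x, assuming llm and embed_model are already defined, is below.
Python
from llama_index.extractors import TitleExtractor      # 0.9.x import paths
from llama_index.ingestion import IngestionPipeline
from llama_index.node_parser import SentenceSplitter

# Sketch: the extractor takes the LLM explicitly, so the local Llama 2 model
# is used instead of the OpenAI default.
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        TitleExtractor(llm=llm),
        embed_model,
    ]
)
nodes = pipeline.run(documents=documents)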
3 comments
Hi, I'm going through the new lm-format-enforcer readme: https://docs.llamaindex.ai/en/stable/community/integrations/lmformatenforcer.html#lm-format-enforcer . When defining a program, how do I pass in my list of Node or Document objects so the program runs in the context of my data? I would have expected to be able to pass the nodes in here:
Plain Text
nodes = node_parser.get_nodes_from_documents(documents)  # existing setup

program = LMFormatEnforcerPydanticProgram(
    output_cls=Album,
    prompt_template_str="Generate an example album, with an artist and a list of songs. Using the movie {movie_name} as inspiration. You must answer according to the following schema: \n{json_schema}\n",
    llm=LlamaCPP(),
    verbose=True,
)
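Building on the snippet above, one option is to treat the node text as just another template variable, since the program formats the placeholders in prompt_template_str from the keyword arguments at call time (context_str below is a name chosen for the sketch, not a library API).
Python
# Sketch: inject the parsed node text through an ordinary template variable.
context_str = "\n\n".join(node.get_content() for node in nodes)

program = LMFormatEnforcerPydanticProgram(
    output_cls=Album,
    prompt_template_str=(
        "Context information is below.\n{context_str}\n"
        "Generate an example album, with an artist and a list of songs, "
        "grounded in the context above. "
        "You must answer according to the following schema: \n{json_schema}\n"
    ),
    llm=LlamaCPP(),
    verbose=True,
)

output = program(context_str=context_str)  # {json_schema} is filled by the program itself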
2 comments
Hello, how do I convert a list of TextNode objects into Document objects? I'm asking because I'm trying to run the Zephyr 7B Beta Colab provided, but it doesn't work when I start with a list of TextNode objects instead of a Document object. The error I get is 'TextNode' object has no attribute 'get_doc_id'. My code is:
Plain Text
    document: Document = Document(text=txt_str, metadata=metadata or {})  # same approach as SimpleDirectoryReader
    documents: list[Document] = [document]

    node_parser = SentenceWindowNodeParser.from_defaults(
        window_size=3,
        window_metadata_key="window",
        original_text_metadata_key="original_text",
    )
    nodes: list[TextNode] = node_parser.get_nodes_from_documents(documents)  # this returns TextNodes rather than a Document

    llm = HuggingFaceLLM(
        model_name="HuggingFaceH4/zephyr-7b-beta",
        tokenizer_name="HuggingFaceH4/zephyr-7b-beta",
        query_wrapper_prompt=PromptTemplate("<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>\n"),
        context_window=3900,
        max_new_tokens=256,
        # model_kwargs={"quantization_config": quantization_config},
        # tokenizer_kwargs={},
        generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
        messages_to_prompt=messages_to_prompt,
        device_map="cpu",
    )

    service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
    vector_index = VectorStoreIndex.from_documents(nodes, service_context=service_context)  # Error happens here
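For reference, the error comes from from_documents expecting Document objects (it calls get_doc_id on each one); VectorStoreIndex can be built from the nodes directly, as in this sketch.
Python
# Sketch: build the index from nodes instead of converting them back to Documents.
vector_index = VectorStoreIndex(nodes=nodes, service_context=service_context)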
1 comment
The docs say that LLMPredictor is deprecated. Does that apply to StructuredLLMPredictor too? Should we still use code like llm_predictor = StructuredLLMPredictor() when writing new code, or is it better to set the llm value in the service context?
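For comparison, the service-context route mentioned in the question looks roughly like this minimal sketch, assuming llm, embed_model, and documents are already defined.
Python
from llama_index import ServiceContext, VectorStoreIndex

# Sketch: set the LLM once on the service context; indexes and query engines
# built from it use that LLM without a separate predictor object.
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine()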
2 comments
What are some strategies to limit hallucinated responses when using structured output parsers with Llama models? For example, I ask the model to identify the number of hours that employees work per week from paragraph text. My pydantic class looks like this:
Plain Text
class WorkWeek(BaseModel):
    hours_per_week: Optional[float]
    employee_type: Optional[str]
I find I always get some answer back like hours_per_week = 40 and employee_type = 'full-time' regardless of whether the document contains relevant text. I could create a document whose entire text is This is a test and I would still get results like those shown above, which are clearly not from the document. Interestingly, when I simply ask the model for a basic freeform text response, I get an answer that makes sense when the document doesn't mention hours of work (something like Based on the context, there isn't enough information to answer your question).
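One mitigation that sometimes helps is giving the schema an explicit "not found" path: default the fields to None and say so in the field descriptions, which end up in the JSON schema the model sees. A sketch:
Python
from typing import Optional
from pydantic import BaseModel, Field

# Sketch: null defaults plus descriptions that spell out when to return null,
# so the model has an explicit alternative to guessing a value.
class WorkWeek(BaseModel):
    hours_per_week: Optional[float] = Field(
        default=None,
        description="Weekly hours explicitly stated in the text; null if not stated.",
    )
    employee_type: Optional[str] = Field(
        default=None,
        description="Employee type explicitly stated in the text; null if not stated.",
    )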
5 comments
How do I remove the SYS prompt for question answering when using Mistral with LlamaCPP (Mistral doesn't use a SYS prompt)? I am able to edit the user prompts, but I don't see a way to edit the SYS prompt. Specifically, I want to remove the text between <<SYS>> and <</SYS>>:

Plain Text
<s> [INST] <<SYS>>\n You are a helpful, respectful and honest assistant. Always answer as helpfully as possible and follow ALL given instructions. Do not speculate or make up information. Do not reference any given instructions or context. \n<</SYS>>\n\n Context information is below.\n---------------------\n[Excerpt from document]...
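For reference, LlamaCPP accepts custom messages_to_prompt and completion_to_prompt callables, so a Mistral-style formatter without the <<SYS>> block can be supplied directly; the sketch below is one simple version, with a placeholder model path.
Python
from llama_index.llms import LlamaCPP

# Sketch: Mistral-style formatting with no <<SYS>> section.
def mistral_messages_to_prompt(messages):
    text = "\n".join(str(m.content) for m in messages)
    return f"<s>[INST] {text} [/INST]"

def mistral_completion_to_prompt(completion):
    return f"<s>[INST] {completion} [/INST]"

llm = LlamaCPP(
    model_path="/path/to/mistral-7b.Q4_K_M.gguf",  # placeholder path
    context_window=3900,
    messages_to_prompt=mistral_messages_to_prompt,
    completion_to_prompt=mistral_completion_to_prompt,
)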
3 comments
Hello, I have a question about using the Guidance Pydantic Program, as documented here: https://docs.llamaindex.ai/en/latest/examples/output_parsing/guidance_pydantic_program.html . I have a query_engine defined; my question is how to pass the program variable (as shown on the documentation page) to the query engine:
Plain Text
### program defined based on the documentation
program = GuidancePydanticProgram(
    output_cls=Result,
    prompt_template_str="Generate a response using the query asked as follows: {{query}}",
    guidance_llm=self.llm,
    verbose=True,
)

### standard query_engine defined, but how do I pass in the program variable or get the program to see the vector index?
query_engine = index.as_query_engine(
    similarity_top_k=2,
    node_postprocessors=[
        MetadataReplacementPostProcessor(target_metadata_key="window")
    ],
    output_cls=Result,
)
response = query_engine.query(query)
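One workaround, building on the snippet above: retrieve the context separately and hand the retrieved text to the program through an extra handlebars variable (context_str is a name chosen for the sketch, not a LlamaIndex API).
Python
# Sketch: retrieve first, then pass the retrieved text to the program.
retriever = index.as_retriever(similarity_top_k=2)
retrieved_nodes = retriever.retrieve(query)
context_str = "\n\n".join(n.get_content() for n in retrieved_nodes)

program = GuidancePydanticProgram(
    output_cls=Result,
    prompt_template_str=(
        "Context information is below.\n{{context_str}}\n"
        "Generate a response using the query asked as follows: {{query}}"
    ),
    guidance_llm=llm,
    verbose=True,
)
result = program(context_str=context_str, query=query)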
8 comments