`TextNode`, and at the same time adds the page number and image path of the image as metadata. I then run `MarkdownElementNodeParser`, which separates texts and tables into `IndexNode` and `BaseNode`. Similarly, I would like to add the page number and image path into these nodes' metadata. But the sequence of the nodes is already jumbled up from line 2 onwards, so how can I still add the page number and image path to them? Thanks

```python
[1] node_parser = MarkdownElementNodeParser(llm=llm)
[2] nodes = node_parser.get_nodes_from_documents([document])
[3] base_nodes, objects = node_parser.get_nodes_and_objects(nodes)
```
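Since the parsed nodes come back in a different order, one order-independent approach is to key the page metadata by each node's source document id rather than by position, then merge it into every node that points back at that page. A minimal pure-Python sketch of the idea (the `SimpleNode` class and its field names are stand-ins for illustration, not LlamaIndex APIs; LlamaIndex nodes expose a similar source-document reference):

```python
from dataclasses import dataclass, field

@dataclass
class SimpleNode:
    # stand-in for a parsed node; ref_doc_id points back to the source page
    text: str
    ref_doc_id: str
    metadata: dict = field(default_factory=dict)

def attach_page_metadata(nodes, page_metadata_by_doc_id):
    """Copy page-level metadata onto each node via its source doc id,
    regardless of the order the nodes come back in."""
    for node in nodes:
        node.metadata.update(page_metadata_by_doc_id.get(node.ref_doc_id, {}))
    return nodes

# page metadata captured before parsing, keyed by the page's doc id
pages = {"page-1": {"page_number": 1, "image_path": "img/p1.png"}}
nodes = [SimpleNode("a table", "page-1"), SimpleNode("some text", "page-1")]
attach_page_metadata(nodes, pages)
```

Because the lookup goes through the doc id, it does not matter that `get_nodes_and_objects` shuffles the node order.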
`FunctionCallingAgentWorker`. The first set of tools includes 3 `QueryEngineTool`s and the second set includes 1 `FunctionTool` (a pydantic base model that does simple addition on any number of input floats).

How can I create a `Document` from a list of `TextNode`? I parse my files with `LlamaParse`, where I break them down into a list of nodes using `MarkdownNodeParser`, utilising `node.metadata['Header_1']` as a way of filtering those nodes by the md headers from my document, and do text amendment. With the latest `llama-index-core`, the `node.metadata` dictionary is missing the `Header_1` key. What I do now is manually add them back, but I'm stuck with a list of updated `TextNode`s, not knowing how to convert them into a `Document`.

```python
node_parser = MarkdownElementNodeParser(num_workers=8, show_progress=False)
nodes = node_parser.get_nodes_from_documents([document])
base_nodes, objects = node_parser.get_nodes_and_objects(nodes)
index = VectorStoreIndex(nodes=base_nodes + objects)
recursive_query_engine = index.as_query_engine(
    similarity_top_k=3,
    node_postprocessors=[FlagEmbeddingReranker(top_n=2, model=RERANKER_MODEL)],
    verbose=False,
)
```
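On the "convert the nodes back into a `Document`" question: a common workaround is to concatenate the amended node texts and build a fresh `Document` from the result. A pure-Python sketch of the joining step (the separator is an assumption — use whatever the original markdown had between sections; the `Document(text=...)` construction is left as a comment since it depends on your LlamaIndex version):

```python
def rebuild_document_text(node_texts, separator="\n\n"):
    """Join amended node texts back into one document body.

    Assumes the nodes are already in reading order; drops empty chunks.
    """
    return separator.join(t.strip() for t in node_texts if t.strip())

texts = ["# A - text for A", "# B - New text for B"]
body = rebuild_document_text(texts)
# then, e.g.: document = Document(text=body, metadata=original_metadata)
```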
`query_engine`, or do we have to calculate it ourselves? I am using OpenAI LLMs.

`ReActAgent.from_tools` has an `output_parser` argument; however, I had no luck with it.

```python
recursive_query_engine = recursive_index.as_query_engine(
    similarity_top_k=5,
    node_postprocessors=[reranker],
    verbose=True,
)

class DocumentTypeResponse(BaseModel):
    """Data model for the document type."""
    document_type: str

document_type_identifier = QueryEngineTool(
    query_engine=recursive_query_engine,
    metadata=ToolMetadata(
        name='document_type',
        description=(
            "Only use this tool when required. "
            "Answers questions relating to document type. "
            "Identifies document types as either purchase order or invoice."
        ),
        fn_schema=DocumentTypeResponse,
    ),
)

context_document_agent = """\
You are an expert administrative assistant who specializes in answering questions about documents.
The questions mainly revolve around extracting key information from either an invoice or a purchase order.
Only use the necessary tool to answer the questions. Only use more tools when needed.
"""

document_agent = ReActAgent.from_tools(
    tools=[document_type_identifier],  # from_tools expects a list of tools
    verbose=True,
    context=context_document_agent,
)
```
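If `output_parser` is not cooperating, one fallback is to validate the agent's final answer yourself against the expected schema. A minimal sketch using only the standard library (the field name mirrors the `DocumentTypeResponse` model above; that the LLM returns a JSON-shaped answer is an assumption — you would prompt it to do so):

```python
import json

ALLOWED_TYPES = {"invoice", "purchase order"}

def parse_document_type(raw_answer: str) -> str:
    """Validate a JSON-shaped agent answer like '{"document_type": "invoice"}'."""
    data = json.loads(raw_answer)
    doc_type = data["document_type"].strip().lower()
    if doc_type not in ALLOWED_TYPES:
        raise ValueError(f"unexpected document_type: {doc_type!r}")
    return doc_type
```

With pydantic available, `DocumentTypeResponse.model_validate_json(raw_answer)` would be the more direct equivalent of this check.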
```
TypeError: BFloat16 is not supported on MPS
ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`
```
```python
# set up llm using HuggingFaceLLM
import torch
from llama_index.llms.huggingface import HuggingFaceLLM
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

llm = HuggingFaceLLM(
    model_name=model_name,
    model_kwargs={
        "token": hf_token,
        "torch_dtype": torch.bfloat16,
        "quantization_config": quantization_config,
    },
    generate_kwargs={
        "do_sample": True,
        "temperature": 0,
        "top_p": 0.9,
    },
    tokenizer_name=model_name,
    tokenizer_kwargs={"token": hf_token},
    stopping_ids=stopping_ids,
)
```
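On the `TypeError`, the `torch.bfloat16` in `model_kwargs` is the likely trigger: Apple's MPS backend does not support bfloat16, so a common fix is to fall back to float16 on Mac. (Note also that `bitsandbytes` quantization generally requires a CUDA GPU, which may explain the `ImportError` on Apple hardware.) The selection logic, sketched in plain Python so it runs anywhere — with torch you would map these names to `torch.float16` / `torch.bfloat16` and detect the device via the MPS backend:

```python
def pick_torch_dtype(device: str) -> str:
    """Fall back to float16 on Apple MPS, where bfloat16 is unsupported."""
    return "float16" if device == "mps" else "bfloat16"
```

Then pass the resulting dtype as `torch_dtype` in `model_kwargs` instead of hard-coding `torch.bfloat16`.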
```markdown
# A - text for A
# B - text for b
```

```python
from llama_index.core.node_parser import MarkdownNodeParser

parser = MarkdownNodeParser()
for page in document:
    page_nodes = parser.get_nodes_from_documents([page])
    for node in page_nodes:
        if node.metadata['Header_1'] == 'B':
            node.text = 'New text for B'
            print(node.text)  # text is changed
print(page.text)  # text is not changed!
```
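The behaviour in the snippet above is expected: the parser's nodes carry copies of the page text, so editing `node.text` never propagates back to `page.text`. If the goal is to push the amendment into the page itself, one blunt workaround is a string replacement on the page body. A pure-Python sketch of that step (how you then write the new text back onto a real `Document` object depends on your LlamaIndex version, so only the string logic is shown):

```python
def amend_page_text(page_text: str, old_section: str, new_section: str) -> str:
    """Replace a section's text in the full page body, since node edits
    do not flow back to the source page."""
    if old_section not in page_text:
        raise ValueError("section not found in page text")
    return page_text.replace(old_section, new_section)

page_text = "# A - text for A\n# B - text for b"
amended = amend_page_text(page_text, "text for b", "New text for B")
```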
```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[20], line 1
----> 1 response_header = header_query_engine.query(new_header_query)

File <hidden_path>\.venv\lib\site-packages\llama_index\core\instrumentation\dispatcher.py:260, in Dispatcher.span.<locals>.wrapper(func, instance, args, kwargs)
    252 self.span_enter(
    253     id_=id_,
    254     bound_args=bound_args,
        (...)
    257     tags=tags,
    258 )
    259 try:
--> 260     result = func(*args, **kwargs)
    261 except BaseException as e:
    262     self.event(SpanDropEvent(span_id=id_, err_str=str(e)))

File <hidden_path>\.venv\lib\site-packages\llama_index\core\base\base_query_engine.py:52, in BaseQueryEngine.query(self, str_or_query_bundle)
     50 if isinstance(str_or_query_bundle, str):
     51     str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 52 query_result = self._query(str_or_query_bundle)
     53 dispatcher.event(
     54     QueryEndEvent(query=str_or_query_bundle, response=query_result)
     55 )
     56 return query_result
...
--> 302 content = content_template.format(**relevant_kwargs)
    304 message: ChatMessage = message_template.copy()
    305 message.content = content

KeyError: "' Item No"
```
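The traceback ends in `content_template.format(**relevant_kwargs)`, which suggests the query string itself contains curly braces (e.g. a literal `{' Item No': ...}` fragment) that Python's `str.format` then tries to interpret as template fields, raising `KeyError: "' Item No"`. A common fix is to escape the braces before sending the query; a minimal sketch:

```python
def escape_braces(text: str) -> str:
    """Double curly braces so str.format treats them as literals."""
    return text.replace("{", "{{").replace("}", "}}")

# a query containing literal braces, as the KeyError suggests
query = "Find the {' Item No'} column"
safe_query = escape_braces(query)
# str.format now leaves the braces alone instead of raising KeyError
formatted = ("Query: " + safe_query).format()
```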
`MarkdownNodeParser` to amend some texts in one of the nodes. Now I want to convert it back to a `Document`; how can I do that?

```python
# create a set of tools on top of the query engine
query_engine_tools = [
    QueryEngineTool(
        query_engine=recursive_query_engine,
        metadata=ToolMetadata(
            name='document_type_finder',
            description=(
                '''
                This tool finds the type of the document and outputs the answer.
                It looks for obvious features in the document, such as words like
                "invoice" or "tax invoice" for an invoice, and words like
                "purchase order" or "sales order" for a purchase order.
                Finally, this tool only outputs the answer as "invoice" or "purchase order".
                '''
            )
        )
    ),
    QueryEngineTool(
        query_engine=recursive_query_engine,
        metadata=ToolMetadata(
            name='document_origin_finder',
            description=(
                '''
                This tool finds the country of origin of the document.
                It first identifies the country of origin of the document sender by
                looking into its company name, address, or any other relevant
                information. Finally, this tool formats the answer following
                ISO 3166-1 alpha-3 and outputs it.
                '''
            )
        )
    ),
]

# create a RAG ReAct QueryEngineTool agent
agent = ReActAgent.from_tools(tools=query_engine_tools, llm=llm, verbose=True)
response = agent.chat(
    '''
    What is the document type and origin?
    Use any of the tools provided to you.
    '''
)
```
```
# output
>>> Thought: The user has not provided the document text for analysis. I need to ask for the document text to proceed with the analysis using the appropriate tool.
Answer: Could you please provide the text of the document you would like me to analyze for its type and origin?
```