andreas

LlamaIndex + Pandas

Running into a bit of an annoying issue with LlamaIndex + Pandas.

Sometimes GPT / LlamaIndex comes up with queries (e.g.

df.loc[(slice('07/2023','10/2023'), 'restaurants'), 'breakdown']

) that return data from Pandas like this:

07/2023  restaurants    {'Evroulla': 179.0, 'Foukou tou Yiakoumi': 60....
08/2023  restaurants    {'Evroulla': 181.0, 'Foukou tou Yiakoumi': 58....
09/2023  restaurants    {'Evroulla': 176.0, 'Foukou tou Yiakoumi': 59....
10/2023  restaurants    {'Evroulla': 48.0, 'Foukou tou Yiakoumi': 15.0...

but the data it returns is truncated (each dict is cut off with "..."), as shown above.

This is what the whole df looks like:

                                                             breakdown  total
month   category                                                             
01/2023 Total        {'transport': 307, 'groceries': 499, 'restaura...   7354
        groceries    {'Alphamega Hypermarket': 368.0, 'Athienitis':...    499
        home                   {'Rent': 3987.0, 'Electricity': 1824.0}   5811
        restaurants  {'Evroulla': 178.0, 'Foukou tou Yiakoumi': 59....    447
        shopping         {'ZARA': 191.0, 'Amazon': 52.0, 'Ebay': 47.0}    290
        transport    {'ESSO': 186.0, 'Petrolina': 90.0, 'Panikos Ca...    307
02/2023 Total        {'transport': 307, 'groceries': 522, 'restaura...   7290
        groceries    {'Alphamega Hypermarket': 405.0, 'Athienitis':...    522
        home                   {'Rent': 3891.0, 'Electricity': 1813.0}   5704
        restaurants  {'Evroulla': 181.0, 'Foukou tou Yiakoumi': 61....    444
        shopping         {'ZARA': 207.0, 'Amazon': 52.0, 'Ebay': 54.0}    313
        transport    {'ESSO': 185.0, 'Petrolina': 90.0, 'Panikos Ca...    307

what to do, lol?
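
One possible reading of what's happening: the `...` looks like pandas' display truncation (the `display.max_colwidth` option, 50 characters by default) rather than lost data, assuming the query result is turned into a string before it goes back to the LLM. A minimal sketch with made-up data (the `Cafe ...` names and the totals are hypothetical, not the real breakdown) showing the truncation and two ways around it:

```python
import pandas as pd

# Hypothetical stand-in for the real frame: a (month, category) MultiIndex
# with a dict stored in the "breakdown" column.
index = pd.MultiIndex.from_product(
    [["07/2023", "08/2023"], ["groceries", "restaurants"]],
    names=["month", "category"],
)
long_dict = {"Cafe A": 179.0, "Cafe B with a long name": 60.0, "Cafe C": 35.0}
df = pd.DataFrame(
    {"breakdown": [dict(long_dict) for _ in index], "total": [274.0] * len(index)},
    index=index,
)

# Same shape of query as in the post.
result = df.loc[(slice("07/2023", "08/2023"), "restaurants"), "breakdown"]

# With a 50-character column width, long cell values are cut off with "...".
with pd.option_context("display.max_colwidth", 50):
    truncated_text = str(result)

# Option 1: lift the width limit while rendering the result as text.
with pd.option_context("display.max_colwidth", None):
    full_text = str(result)

# Option 2: skip the text rendering and work with plain Python objects,
# or expand the dicts into real columns.
records = result.to_dict()  # keys are (month, category) tuples
expanded = pd.DataFrame(result.tolist(), index=result.index)

print(truncated_text)
print(full_text)
print(expanded)
```

If the string rendering is out of your hands, setting `pd.set_option("display.max_colwidth", None)` once at startup should have the same effect as the `option_context` block. Expanding the dicts into columns may also make the frame easier for the LLM to write queries against.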

9 comments

andreas

Anyone have experience with generating graphs / charts on the fly with agents? My first thought was to build it a separate tool for each type of graph I want to include (pie chart, bar chart, line chart) and give the agent access to those tools. Thoughts? Anyone have experience with this?
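
For what it's worth, the one-tool-per-chart-type idea could be sketched roughly like this. It assumes matplotlib is available; the function names, the `CHART_TOOLS` registry and the PNG-bytes return type are all hypothetical choices for illustration, not an established pattern:

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt


def _to_png(fig) -> bytes:
    """Serialize a figure to PNG bytes and release it."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    return buf.getvalue()


def pie_chart(labels: list[str], values: list[float], title: str = "") -> bytes:
    """Render a pie chart of `values` by `labels` and return PNG bytes."""
    fig, ax = plt.subplots()
    ax.pie(values, labels=labels)
    ax.set_title(title)
    return _to_png(fig)


def bar_chart(labels: list[str], values: list[float], title: str = "") -> bytes:
    """Render a bar chart of `values` by `labels` and return PNG bytes."""
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_title(title)
    return _to_png(fig)


def line_chart(labels: list[str], values: list[float], title: str = "") -> bytes:
    """Render a line chart of `values` over `labels` and return PNG bytes."""
    fig, ax = plt.subplots()
    ax.plot(labels, values, marker="o")
    ax.set_title(title)
    return _to_png(fig)


# Hypothetical registry: one entry per tool the agent is allowed to call.
CHART_TOOLS = {"pie": pie_chart, "bar": bar_chart, "line": line_chart}

png = CHART_TOOLS["bar"](["groceries", "home"], [499.0, 5811.0], title="01/2023")
```

Each function has a small typed signature and a docstring, which is what an agent needs to pick a tool and fill in its arguments. In LlamaIndex they could presumably be wrapped with something like `FunctionTool.from_defaults(fn=bar_chart)`, but check that against the version you're on.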

1 comment

andreas

Hi, I'm trying to build an AI financial advisor. I have a bunch of mostly numerical / quantitative data, and I'm trying to get it to answer the question "How much has my savings account grown by in the last 3 months?"

Currently, when I ask this question, it passes a whole lot of context: some daily info, some weekly info and all the monthly info, when it should really only be passing monthly info. Even with the unnecessary/wasteful context, it looks like OpenAI is returning the context correctly:

The balance of your current account for the last 3 months was as follows:\\n\\n- 2023-08-01: $8741.37\\n- 2023-09-01: $9732.43\\n- 2023-10-01: $10569.80

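That context is then injected into the next prompt correctly as well, but the output is wrong (the difference between the first and last balance is 1828.43, not 828.43, and the currency has switched from $ to €):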

  "content": "Your savings account has grown by €828.43 over the last 3 months, from €8741.37 to €10569.80. That\'s a growth of approximately 9.48%. Well done!"

This is the data I'm working with:

{
    "account_number": "1234567890",
    "type": "savings",
    "frequency": "daily",
    "data": [
        {
            "date": "2023-01-01",
            "amount": 411.97,
            "changeFromYesterday": 11.97
        },
        {
            "date": "2023-01-02",
            "amount": 425.99,
            "changeFromYesterday": 14.02
        },
        {
            "date": "2023-01-03",
            "amount": 435.67,
            "changeFromYesterday": 9.68
        },

and

{
    "account_number": "1234567890",
    "type": "savings",
    "frequency": "monthly",
    "data": [
        {
            "date": "2023-01-01",
            "amount": 411.97,
            "changeFromLastMonth": 0
        },
        {
            "date": "2023-02-01",
            "amount": 753.52,
            "changeFromLastMonth": 341.55
        },
        {
            "date": "2023-03-01",
            "amount": 1028.29,
            "changeFromLastMonth": 274.77
        },

...and a similar file for the weekly data.

So basically 3 JSON files: one containing the account balance at a monthly interval, one at a weekly interval, and one at a daily interval.

I'm just a bit confused as to how to approach structuring this. I'm currently using a VectorStoreIndex over the JSON documents, queried by a QueryEngineTool that's controlled by a GPT-4 agent.

1) Is there any better way to structure the quantitative data other than JSON?
2) Am I going overkill by giving it monthly and weekly data as well? Should I just give it daily and let it handle everything?
3) Is Agent+QueryEngine the right approach here?
4) Is a VectorStoreIndex the best approach for time-series data?
5) Why is it passing so much unnecessary context?
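
One angle on 1), 2) and 4): a vector index retrieves chunks by text similarity, which may be part of why daily and weekly chunks come along with the monthly ones, and LLMs are unreliable at arithmetic, as the 828.43 vs 1828.43 slip shows. A possible alternative is to keep only the finest-grained series and do the date filtering and subtraction in code, exposing the result to the agent as a tool. A minimal sketch, assuming documents shaped like the JSON above; `to_series` and `growth` are hypothetical helper names:

```python
import pandas as pd


def to_series(payload: dict) -> pd.Series:
    """Turn one JSON document (shape as in the post) into a date-indexed balance Series."""
    frame = pd.DataFrame(payload["data"])
    frame["date"] = pd.to_datetime(frame["date"])
    return frame.set_index("date")["amount"].sort_index()


def growth(balances: pd.Series, months: int = 3) -> tuple[float, float]:
    """Absolute and percentage change across the last `months` month-start balances."""
    # "MS" buckets by calendar month; first() takes the earliest balance in each,
    # so daily, weekly and monthly inputs all reduce to one value per month.
    monthly = balances.resample("MS").first().dropna()
    window = monthly.iloc[-months:]
    start, end = float(window.iloc[0]), float(window.iloc[-1])
    return round(end - start, 2), round((end - start) / start * 100, 2)


# The three month-start balances quoted in the post.
quoted = {
    "data": [
        {"date": "2023-08-01", "amount": 8741.37},
        {"date": "2023-09-01", "amount": 9732.43},
        {"date": "2023-10-01", "amount": 10569.80},
    ]
}
absolute, percent = growth(to_series(quoted))
print(absolute, percent)  # 1828.43 20.92
```

Because `growth` resamples whatever it is given, the daily file alone would be enough here, and the weekly and monthly files become derived views rather than separate documents. Whether "the last 3 months" should mean three month-start points (as above) or a full three-month span is a definition you'd want to pin down in the tool's docstring.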

13 comments
