Hi, I'm trying to build an AI financial advisor. I have a bunch of mostly numerical/quantitative data, and I'm trying to get it to answer the question "How much has my savings account grown by in the last 3 months?"
Currently, when I ask this question, it passes a whole lot of context: some daily info, some weekly info, and all of the monthly info, when it should really only be passing monthly info. Even with the unnecessary/wasteful context, it looks like OpenAI is returning the context correctly:
The balance of your current account for the last 3 months was as follows:\\n\\n- 2023-08-01: $8741.37\\n- 2023-09-01: $9732.43\\n- 2023-10-01: $10569.80
That context is then injected into the next prompt correctly as well, but the final output gets the arithmetic wrong:
"content": "Your savings account has grown by €828.43 over the last 3 months, from €8741.37 to €10569.80. That\'s a growth of approximately 9.48%. Well done!"
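For reference, this is the arithmetic I'd expect from the first and last balances in that context. It's just a quick sketch in plain Python; the `growth` helper is my own illustrative name, not part of any library:

```python
from decimal import Decimal

def growth(start: str, end: str) -> tuple[Decimal, Decimal]:
    """Return (absolute change, percent change rounded to 2 decimal places)."""
    s, e = Decimal(start), Decimal(end)
    change = e - s
    pct = (change / s * 100).quantize(Decimal("0.01"))
    return change, pct

# Balances on 2023-08-01 and 2023-10-01, taken from the context above
change, pct = growth("8741.37", "10569.80")
print(change, pct)  # 1828.43 20.92
```

So the difference should be 1828.43 (roughly 20.92%), not 828.43 (9.48%).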
This is the data I'm working with:
{
"account_number": "1234567890",
"type": "savings",
"frequency": "daily",
"data": [
{
"date": "2023-01-01",
"amount": 411.97,
"changeFromYesterday": 11.97
},
{
"date": "2023-01-02",
"amount": 425.99,
"changeFromYesterday": 14.02
},
{
"date": "2023-01-03",
"amount": 435.67,
"changeFromYesterday": 9.68
},
and
{
"account_number": "1234567890",
"type": "savings",
"frequency": "monthly",
"data": [
{
"date": "2023-01-01",
"amount": 411.97,
"changeFromLastMonth": 0
},
{
"date": "2023-02-01",
"amount": 753.52,
"changeFromLastMonth": 341.55
},
{
"date": "2023-03-01",
"amount": 1028.29,
"changeFromLastMonth": 274.77
},
...and a similar file for weekly data.
So basically there are 3 JSON files: one containing the account balance at a monthly interval, one at a weekly interval, and one at a daily interval.
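As far as I can tell, the monthly file is derivable from the daily one, since the month-start balances appear in both. Here's a minimal sketch of that, assuming the daily file always has an entry dated the 1st of each month. The helper names are mine and hypothetical, and the sample records just reuse values from the snippets above:

```python
from datetime import date

def month_start_balances(daily: list[dict]) -> dict[str, float]:
    """Keep only the records dated the 1st of a month, keyed by ISO date."""
    return {
        rec["date"]: rec["amount"]
        for rec in daily
        if date.fromisoformat(rec["date"]).day == 1
    }

def growth_over_last_n(monthly: dict[str, float], n: int) -> float:
    """Difference between the last and the first of the n most recent snapshots."""
    dates = sorted(monthly)[-n:]  # ISO dates sort chronologically as strings
    return round(monthly[dates[-1]] - monthly[dates[0]], 2)

# Assumed sample: a few daily records, including the month-start values
daily_records = [
    {"date": "2023-01-01", "amount": 411.97},
    {"date": "2023-01-02", "amount": 425.99},
    {"date": "2023-01-03", "amount": 435.67},
    {"date": "2023-02-01", "amount": 753.52},
    {"date": "2023-03-01", "amount": 1028.29},
]

monthly = month_start_balances(daily_records)
result = growth_over_last_n(monthly, 3)
print(result)  # 616.32
```

Doing the subtraction in code like this would also mean the model never has to do the arithmetic itself.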
I'm just a bit confused as to how to approach structuring this. I'm currently using a VectorStoreIndex over the JSON documents, queried by a QueryEngineTool that's controlled by a GPT-4 agent.
1) Is there any better way to structure the quantitative data other than JSON?
2) Is it overkill to give it monthly and weekly data as well? Should I just give it daily data and let it handle everything?
3) Is Agent+QueryEngine the right approach here?
4) Is a VectorStoreIndex the best approach for time-series data?
5) Why is it passing so much unnecessary context?