I'm seeing the pandas query engine behave differently depending on the environment it runs in. The setup: I have a simple script that uses the pandas query engine, and my dataframe has a column named "Operation Name".
In Jupyter Notebook: The code works perfectly. When I submit the query "What was the first activity?", the engine correctly interprets "activity" as the "Operation Name" column, searches that column, and returns the expected result.
In a .py file, or when exposed as an API: The same code doesn't work as expected. With the same query, "What was the first activity?", the query engine looks for a column literally named "Activity". Since my dataframe has no "Activity" column, it raises a KeyError. The error message is: "> Pandas Output: There was an error running the output as Python code. Error message: 'Activity'"
It seems the query engine's mapping from natural language to the correct column name works in Jupyter but fails in the other environments. Do you have any insight into why this might be happening, or how to resolve it?
Below is my code:

    from llama_index.llms.azure_openai import AzureOpenAI
    from llama_index.experimental.query_engine.pandas.pandas_query_engine import PandasQueryEngine
    import pandas as pd

    # llm = AzureOpenAI()  - I have given the details in my code
    df = pd.read_csv("sample.csv")
    query_engine = PandasQueryEngine(df=df, llm=llm, verbose=True, synthesize_response=True)
    response = query_engine.query("What was the first activity")
    print(response.response)
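
Since the same code behaves differently per environment, one way to narrow this down might be to compare what each environment actually loads. The following is only a diagnostic sketch, not a fix: `environment_report` is a hypothetical helper, and the distribution names in `PACKAGES` are assumptions about what is installed. Running it in both Jupyter and the .py/API process and diffing the output would show whether the interpreter, package versions, or column headers differ.

```python
import sys
from importlib import metadata

# Assumed distribution names; adjust to whatever is actually installed.
PACKAGES = [
    "pandas",
    "openai",
    "llama-index-core",
    "llama-index-experimental",
    "llama-index-llms-azure-openai",
]


def environment_report(columns):
    """Collect details that may differ between Jupyter and a script/API run."""
    versions = {}
    for name in PACKAGES:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this interpreter
    return {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "versions": versions,
        # repr() exposes stray whitespace or casing differences in headers
        "columns": [repr(c) for c in columns],
    }


if __name__ == "__main__":
    # In the real script this would be list(df.columns)
    report = environment_report(["Operation Name"])
    for key, value in report.items():
        print(key, value)
```

If the two reports differ in `executable` or `versions`, the notebook and the script are probably not using the same installation, which would be the first thing to rule out.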