I am trying to replicate the PandasQueryEngine example from the docs (with non-default LLMs and embeddings, if it matters):
import pandas as pd
from llama_index.query_engine import PandasQueryEngine
# Test on some sample data
df = pd.DataFrame(
    {
        "city": ["Toronto", "Tokyo", "Berlin"],
        "population": [2930000, 13960000, 3645000],
    }
)
query_engine = PandasQueryEngine(df=df, verbose=True)
response = query_engine.query(
    "What is the city with the highest population?",
)
But I keep getting this error:
> Pandas Instructions:
df['city'][df['population'].idxmax()]
Traceback (most recent call last):
  File "C:\Users\Stephane\miniconda3\envs\llamaindex\lib\site-packages\llama_index\query_engine\pandas_query_engine.py", line 61, in default_output_processor
    tree = ast.parse(output)
  File "C:\Users\Stephane\miniconda3\envs\llamaindex\lib\ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 2
    df['city'][df['population'].idxmax()]
IndentationError: unexpected indent
> Pandas Output: There was an error running the output as Python code. Error message: unexpected indent (<unknown>, line 2)
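For what it's worth, I can reproduce the same exception outside llama_index. This is only a sketch: `raw_output` is my guess at what the LLM returned (a leading newline followed by an indented expression), since the traceback points at line 2 and I have not captured the exact string.

```python
import ast

# Hypothetical raw LLM output (an assumption, not captured from the real run):
# an empty first line, then the expression indented by one space.
raw_output = "\n df['city'][df['population'].idxmax()]"

try:
    ast.parse(raw_output)
    parse_error = None
except IndentationError as exc:
    # Same exception type and line number as in the traceback above.
    parse_error = exc

print(type(parse_error).__name__, parse_error)

# Stripping the surrounding whitespace lets the same expression parse.
tree = ast.parse(raw_output.strip())
print(ast.dump(tree.body[0])[:40])
```

So, if my guess about the output is right, the generated code itself is valid and only the leading whitespace breaks `ast.parse`.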
Any idea why?
Side question: is PandasQueryEngine the best way to build a RAG app when the information to retrieve is stored in a table?