Find answers from the community

Updated 2 years ago

Hi all I got a very simple case text to

At a glance

The community member is having trouble moving from an in-memory SQLite database to a local PostgreSQL database for a text-to-SQL project. They are encountering issues with the GPTSQLStructStoreIndex and the get_single_table_info function, specifically related to the postgis extension. Other community members suggest creating a new database, adding the postgis extension, and trying a different approach with a fresh virtual environment. They provide sample code and SQL commands to help troubleshoot the issue, but there is no explicitly marked answer.

Useful resources
Hi all, I got a very simple case: text to sql. I did the tutorial using in memory sqlite and works great. I am trying to move to a local postgres, but I am having lots of troubles. Is this the right channel to ask for help?
s
L
s
99 comments
and add postgis extension
CREATE EXTENSION postgis;
psql -f db_sample
but in your example
can you translate a question to sql ?
because that is the bit failing
the problem is here
index = GPTSQLStructStoreIndex([], sql_database=sql_database)
right, but it's failing on a specific function according to the traceback

Plain Text
File "/Users/sdicola/PycharmProjects/llama_index/llama_index/langchain_helpers/sql_wrapper.py", line 52, in get_single_table_info
I have no problems with this
sql_database = SQLDatabase(db.sql_database.engine, schema="uk_data")
Plain Text
# table_name = "epc_certificates"
sql_database = SQLDatabase(db.sql_database.engine, schema="uk_data")
print(sql_database.get_single_table_info('epc_certificates'))
/Users/sdicola/PycharmProjects/llama_index/venv/bin/python /Users/sdicola/PycharmProjects/llama_index/examples/text_to_sql/demo.py
/Users/sdicola/PycharmProjects/llama_index/venv/lib/python3.9/site-packages/langchain/sql_database.pyπŸ’― SAWarning: Did not recognize type 'geography' of column 'geolocation'
self._metadata.reflect(
Traceback (most recent call last):
File "/Users/sdicola/PycharmProjects/llama_index/examples/text_to_sql/demo.py", line 20, in <module>
print(sql_database.get_single_table_info('epc_certificates'))
File "/Users/sdicola/PycharmProjects/llama_index/llama_index/langchain_helpers/sql_wrapper.py", line 52, in get_single_table_info
for column in self._inspector.get_columns(table_name):
File "/Users/sdicola/PycharmProjects/llama_index/venv/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 860, in get_columns
col_defs = self.dialect.get_columns(
File "<string>", line 2, in get_columns
File "/Users/sdicola/PycharmProjects/llama_index/venv/lib/python3.9/site-packages/sqlalchemy/engine/reflection.py", line 97, in cache
ret = fn(self, con, *args, **kw)
File "/Users/sdicola/PycharmProjects/llama_index/venv/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3361, in get_columns
return self._value_or_raise(data, table_name, schema)
File "/Users/sdicola/PycharmProjects/llama_index/venv/lib/python3.9/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3329, in _value_or_raise
raise exc.NoSuchTableError(
sqlalchemy.exc.NoSuchTableError: epc_certificates
let me try and install this postgis thing and see what happens lol
can you please cut and paste your script?
let me see if my enviroment is fucked
I am not familiar with python
I would start with a fresh venv, and don't install from source for now

Do this in a new directory

Plain Text
python -m venv venv
source venv/bin/activate
pip install llama-index


And here's my script that seems to work
Plain Text
from sqlalchemy import create_engine
import os
from llama_index import SQLDatabase, GPTSQLStructStoreIndex
from llama_index.readers.database import DatabaseReader
os.environ["OPENAI_API_KEY"] = 'no'
engine = create_engine('postgresql+psycopg2://postgres:mark90@localhost:5432/postgres')
sql_database = SQLDatabase(engine, schema="demo", include_tables=['accounts'])
index = GPTSQLStructStoreIndex([], sql_database=sql_database, table_name='accounts')
this is interesting now
Plain Text
db = DatabaseReader(
    scheme="postgresql",
    host="localhost",
    port="5432",
    user="demo",
    password="demo",
    dbname="demo",
    schema="uk_data"
)

# table_name = "epc_certificates"
sql_database = SQLDatabase(db.sql_database.engine, schema="demo")
print(sql_database.get_single_table_info('accounts'))
I have added all that stuff in the db in sql
Plain Text
db = DatabaseReader(
    scheme="postgresql",
    host="localhost",
    port="5432",
    user="demo",
    password="demo",
    dbname="demo",
    schema="uk_data"
)

# table_name = "epc_certificates"
sql_database = SQLDatabase(db.sql_database.engine, schema="demo")
print(sql_database.get_single_table_info('accounts'))

index = GPTSQLStructStoreIndex([], sql_database=sql_database)

query_engine = index.as_query_engine()
response = query_engine.query("How accounts there are?")
print(response.extra_info['sql_query'])
print(response)
Plain Text
Table 'accounts' has columns: user_id (INTEGER), username (VARCHAR(50)), password (VARCHAR(50)), email (VARCHAR(255)), created_on (TIMESTAMP), last_login (TIMESTAMP) and foreign keys: .
SELECT COUNT(*) FROM accounts;
 There are 5 accounts.

Process finished with exit code 0
so the problem must come from the extension
πŸ€” hmmm you might be right!
so you can create a dummy table as well
let me give you the sql
CREATE EXTENSION IF NOT EXISTS postgis;

CREATE SCHEMA IF NOT EXISTS demo;

CREATE TABLE demo.test_geo (
id SERIAL PRIMARY KEY,
name TEXT,
geom GEOMETRY(Point, 4326)
);

INSERT INTO demo.test_geo (name, geom)
VALUES ('Point 1', ST_SetSRID(ST_MakePoint(-110, 30), 4326)),
('Point 2', ST_SetSRID(ST_MakePoint(-111, 31), 4326)),
('Point 3', ST_SetSRID(ST_MakePoint(-112, 32), 4326)),
('Point 4', ST_SetSRID(ST_MakePoint(-113, 33), 4326)),
('Point 5', ST_SetSRID(ST_MakePoint(-114, 34), 4326));


```
there error is with postgid
Plain Text
db = DatabaseReader(
    scheme="postgresql",
    host="localhost",
    port="5432",
    user="demo",
    password="demo",
    dbname="demo",
    schema="uk_data"
)

# table_name = "epc_certificates"
sql_database = SQLDatabase(db.sql_database.engine, schema="demo")
print(sql_database.get_single_table_info('accounts'))

index = GPTSQLStructStoreIndex([], sql_database=sql_database)

query_engine = index.as_query_engine()
response = query_engine.query("How many points there are within lat -113 and long 34 with a 100km distance?")
print(response.extra_info['sql_query'])
print(response)
wait, but it worked for me lol
Plain Text
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "points" does not exist
LINE 2: FROM points 
             ^

[SQL: SELECT COUNT(*) 
FROM points 
WHERE ST_DWithin(geom, ST_MakePoint(-113, 34), 100000);]
(Background on this error at: https://sqlalche.me/e/20/f405)
have you added the table to the demo schema?
Yea, I installed postgis and ran the code you gave

Plain Text
>>> sql_database = SQLDatabase(engine, schema="demo", include_tables=['test_geo'])
>>> sql_database.get_single_table_info('test_geo')
"Table 'test_geo' has columns: id (INTEGER), name (TEXT), geom (NULL) and foreign keys: ."
and if you make the question
this is fucked up
Plain Text
sqlalchemy.exc.NoSuchTableError: demo.test_geo
Yea the query works for me
Attachment
image.png
how do you run your postgres db?
maybe the problem is that I am running that db locally using brew
and postgis is installed with brew as well
hmm I'm on windows. I was using pgAdmin lol
install postgis with "stack builder" (no clue what that is haha)
if you ask "what are the points within lat X and lang Y with a range of 100km?"
what do you get?
replace X and Y with any number
I am interested in the query
hmmm it tried its best lol
psycopg2.errors.InternalError_: LWGEOM_dwithin: Operation on mixed SRID geometries (Point, 4326) != (Point, 0)
Wrote this query but it has an error somewhere it seems

SELECT name, ST_Distance(geom, ST_MakePoint(-111, 30)) AS distance FROM test_geo WHERE ST_DWithin(geom, ST_MakePoint(-111, 30), 100000) ORDER BY distance;
honestly that's more impressive than I expected haha
You can probably add extra context description to the table to help it write more correct queries (again, I'm a noob with gis data, no idea what's going on here LOL)
thanks a lot πŸ™‚
so I destroyed everything and created a new db
Plain Text
db = DatabaseReader(
    scheme="postgresql",
    host="localhost",
    port="5432",
    user="postgres",
    password="demo",
    dbname="postgres",
)

# table_name = "epc_certificates"
sql_database = SQLDatabase(db.sql_database.engine, schema="demo")
print(sql_database.get_table_info(table_names=["test_geo"]))
Plain Text
sqlalchemy.exc.CompileError: (in table 'test_geo', column 'geom'): Can't generate DDL for NullType(); did you forget to specify a type on this Column?

Process finished with exit code 1
Plain Text
print(sql_database.get_usable_table_names())
{'test_geo', 'spatial_ref_sys'}
so this time the table can be seen
but the moment one tries this
index = GPTSQLStructStoreIndex([], sql_database=sql_database, table_name="test_geo")
so there is a problem with postgis
ok this is even more interesting now
I deleted the schema demo
Plain Text
db = DatabaseReader(
    scheme="postgresql",
    host="localhost",
    port="5432",
    user="postgres",
    password="demo",
    dbname="postgres",
)

# table_name = "epc_certificates"
sql_database = SQLDatabase(db.sql_database.engine, schema="demo")

index = GPTSQLStructStoreIndex([], sql_database=sql_database, table_name="test_geo")
this does not even error
Bruh what is going on 😡 it's so weird because with postgis I was able to connect and query without issue πŸ˜…
it is very strange
I am mocing to vscode
Add a reply
Sign up and join the conversation on Discord