The community member has created metadata using SQLAlchemy and is trying to use the LlamaIndex library to create an index for a table named "bug_flow". However, they are encountering a KeyError when trying to create the index. The community members have tried various approaches, such as including specific tables or leaving the include_tables parameter empty, but the issue persists.
The community members have also noticed that the SQLDatabase object is only capturing the tables they have created using SQLAlchemy, and not the tables that are already stored in the specified schema of the PostgreSQL database. They are wondering why the SQLDatabase is not selecting the existing tables.
The community members have not found a clear solution to the problem yet, and the discussion is ongoing in the comments.
Hey, I have created metadata with the following code, and it gives a dictionary of all the tables in the schema:

    metadata_obj = MetaData(bind=pg_engine, schema='schema_name')
    metadata_obj.reflect()

    # get the list of table names in the database
    table_names = metadata_obj.tables.keys()
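As an aside for anyone reproducing this without a Postgres instance: a minimal sketch (assuming an in-memory SQLite database with an ATTACHed database standing in for the Postgres schema, and the SQLAlchemy 1.4+ style where `bind=` goes on `reflect()` rather than on `MetaData()`) shows that when a schema is set, the keys of `metadata_obj.tables` come back schema-qualified:

```python
import sqlalchemy as sa

# Stand-in for the Postgres engine: an in-memory SQLite database with an
# attached database acting as the named schema (an assumption for this sketch).
engine = sa.create_engine("sqlite://")
with engine.connect() as conn:
    # pysqlite runs ATTACH and DDL outside a transaction by default,
    # so no explicit commit is needed here.
    conn.execute(sa.text("ATTACH DATABASE ':memory:' AS schema_name"))
    conn.execute(sa.text(
        "CREATE TABLE schema_name.bug_flow (id INTEGER PRIMARY KEY, note TEXT)"
    ))

# Reflect every table in the schema, as in the thread.
metadata_obj = sa.MetaData(schema="schema_name")
metadata_obj.reflect(bind=engine)

# The dictionary keys are schema-qualified, so the bare name is absent;
# .keys() holds the qualified names, .values() holds the Table objects.
print(list(metadata_obj.tables.keys()))               # ['schema_name.bug_flow']
print("bug_flow" in metadata_obj.tables)              # False
print("schema_name.bug_flow" in metadata_obj.tables)  # True
```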
But on using LlamaIndex with the following code:

    from llama_index import SQLDatabase, GPTSQLStructStoreIndex

    sql_database = SQLDatabase(pg_engine, include_tables=["city_stats"], schema='schema_name')
    index = GPTSQLStructStoreIndex(
        [],
        sql_database=sql_database,
        table_name="table_name"
    )

it is giving me this error:

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-49-1268fb2a8c42> in <module>
          1 from llama_index import GPTSQLStructStoreIndex
    ----> 2 index = GPTSQLStructStoreIndex(
          3     [],
          4     sql_database=sql_database,
          5     table_name="bug_flow",

    /usr/local/lib/python3.8/dist-packages/llama_index/indices/struct_store/sql.py in __init__(self, documents, index_struct, llm_predictor, sql_database, table_name, table, ref_doc_id_column, sql_context_container, **kwargs)
         71         self.sql_database = sql_database
         72         if table is None:
    ---> 73             table = self.sql_database.metadata_obj.tables[table_name]
         74         # if ref_doc_id_column is specified, then we need to check that
         75         # it is a valid column in the table

    KeyError: 'bug_flow'

Could anyone please help me with this? Thanks.
Hi @Nandini! This is because when you initialize the SQLDatabase object, you pass include_tables=["city_stats"]. If you leave that argument out, it will include all tables in your schema; otherwise it limits the index to only the tables specified.
Hi, I also tried giving an empty list to the include_tables parameter, but it made no difference. Also, sql_database only contains the tables I have created in Postgres using SQLAlchemy. I am wondering why SQLDatabase is not selecting the tables that are already stored in the specified schema.
Hi, so I tried this, but it is only selecting the tables that I have created through SQLAlchemy, not the tables that are already stored in the specified schema of Postgres. When I use sql_database.get_table_names(), it shows the names of the tables I have created, but creating the index still gives the same error as before. I am using this code to create sql_database:

    sql_database = SQLDatabase(pg_engine, schema='schema_name')

Same issue in creating the metadata:

    metadata_obj = MetaData(bind=pg_engine, schema='schema_name')
    metadata_obj.reflect()
Yes. Also, the index is not working even for the tables that sql_database is capturing. The code is:

    index = GPTSQLStructStoreIndex(
        [],
        sql_database=sql_database,
        table_name='city_stats'
    )
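A closing observation from the SQLAlchemy side (an assumption, not something confirmed in the thread): because the metadata was reflected with schema='schema_name', the failing line table = self.sql_database.metadata_obj.tables[table_name] needs a schema-qualified key, so a bare 'bug_flow' or 'city_stats' raises KeyError. A sketch of the difference (the lookup helper here is hypothetical and just mirrors that dictionary access):

```python
import sqlalchemy as sa
from typing import Optional

# Build a Table under an explicit schema, mirroring what reflection with
# schema='schema_name' produces (the table itself is hypothetical).
metadata_obj = sa.MetaData(schema="schema_name")
sa.Table("bug_flow", metadata_obj, sa.Column("id", sa.Integer, primary_key=True))

def lookup(tables, name: str) -> Optional[sa.Table]:
    """Mirror the failing line: metadata_obj.tables[table_name]."""
    try:
        return tables[name]
    except KeyError:
        return None

print(lookup(metadata_obj.tables, "bug_flow"))              # None: the KeyError path
print(lookup(metadata_obj.tables, "schema_name.bug_flow"))  # the Table object
```

If this is the cause, passing the schema-qualified name (e.g. table_name='schema_name.city_stats') to GPTSQLStructStoreIndex may be worth trying; that is a guess from SQLAlchemy's key-naming behavior, not a verified LlamaIndex fix.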