Find answers from the community

Updated 2 years ago

Another question is it possible to load

At a glance
Another question, is it possible to load existing sqlalchemy metadata from an existing DB (using reflections https://docs.sqlalchemy.org/en/20/core/reflection.html#reflecting-all-tables-at-once) into a GPTSQLStructStoreIndex? Use case is wanting to run text to SQL on an already existing SQL database
j
s
h
28 comments
i think so! see this thread https://github.com/jerryjliu/gpt_index/issues/204 - let me know if using a blank document list would help
Will take a look, thanks!
@jerryjliu0 if its just text to sql and not ingestion, can probably just use the langchain chain right? didnt you say it was just doing that under the hood?
def possible to use the langchain one. honestly either one works if you're just looking for text-to-sql (we also reuse the sql alchemy wrapper from langchain) - the index leaves open the possibility that you'd want to add more unstructured docs later
thanks @hwchase17 I'll try that out too. Was hoping to use gptindex to help reduce prompt size since some of my db schemas are quite large
ah got it, i didnt know gptindex already supported that! we've got some ideas for using chains to reduce that pain, but probably wont be in for a few days
tbf i'm not sure if gptindex supports that but was planning on testing out to see if the indexes reduce prompt size
@hwchase17 are you a contributor to langchain?
@scuba.steve.0 oh interesting, you're looking to create a more condensed representation of the table schema?
gpt index doesn't explicitly support that atm, but i can help you brainstorm a few solutions
Currently we just use the table schema directly, but I can add a wrapper function for you to use a compressed representation! And you can then use gpt index or langchain to synthesize the compressed representation
@jerryjliu0 ooo amazing! How is the table schema currently represented in the prompt when using GPTSQLStructStoreIndex?
Actually, this is interesting. GPT index data structures themselves can be used over your table schema if your schema is very big. I can think about adding a "composability" feature to the SQL index where you can 1) define a gpt index data structure (e.g. the list index) over your table schema, 2) use that data structure in the SQL index to extract fields from unstructured data
yea I like that a lot. Another question (sorry to bombard you here, just very excited about you've built), are there plans to support passing multiple tables into a GPTSQLStructStoreIndex? For example I have multiple tables in my db schema and want to run text to sql that can potentially write joins with several tables.
maybe overkill for gptindex, looking at langchain docs seems like I can do something like that in langchain. Would be great to combine that feature with gptindex's indexing feature
@scuba.steve.0 yes! Definitely plans here as well. I want to expand GPT Index's support for structured data, and doing joins + being able to infer data for multiple tables are all potential options.
Would love your feedback in the process if you use this feature more
@jerryjliu0 absolutely!
also to help me better understand, what is being passed in as schema in the prompt when using GPTSQLStructStoreIndex? Referencing this line:
https://github.com/jerryjliu/gpt_index/blob/e685bbf6d04aed0b1f17c07bf0bc9a6a19b9973e/gpt_index/prompts/default_prompts.py#L176

Is it the actual metadata table representation eg something like:
Plain Text
Table(
    table_name,
    metadata_obj,
    Column("city_name", String(16), primary_key=True),
    Column("population", Integer),
    Column("country", String(16), nullable=False)

?
yep! it's a formatted string πŸ™‚
let me know if you have feedback in this area
@scuba.steve.0 in the vein of running over multiple tables, added a new chain in langchain which first selects relevant tables, then uses only those: https://langchain.readthedocs.io/en/latest/modules/chains/examples/sqlite.html#sqldatabasesequentialchain. should help in cases where you have a lot of tables and putting their schemas into context would be too large
@jerryjliu0 if gpt index inherits from the sqldb wrapper under the hood (whcih it sounds like it does? although maybe im misunderstanding?) is it possible to start using these chians in gptindex for free?
@hwchase17 re: the first point, your sqlalchemy wrapper already has an include_tables param right? that would probably be the initial place to start with, by manually specifying the set of tables. but being able to determine the set of tables is also a cool additional step
re "Using these chains in gpt index for free", this might be a better approach: https://github.com/jerryjliu/gpt_index/pull/236 - showcasing how you can use either gpt index or langchain to query the same underlying store
Add a reply
Sign up and join the conversation on Discord