Updated 2 months ago

Using a data warehouse eg BigQuery/Snowflake with LlamaIndex

I've been experimenting with this and have hit roadblocks in both cases. The SQLAlchemy dialects for BigQuery and Snowflake don't look suitable at the moment, as both require SQLAlchemy v1.4 whereas LlamaIndex requires v2. I have had some positive results with both, even with this dependency conflict, but I'm still seeing LlamaIndex errors which are most likely a direct result of the incompatibility.
So: I would not at this point advise using SQLAlchemy with BigQuery or Snowflake. I wondered what conclusions other people have landed on?
Perhaps there's an earlier release of LlamaIndex which supports SQLAlchemy 1.4, but given the amazing pace of maintenance releases, new features etc, pinning to an earlier release of LlamaIndex is perhaps not a great premise.
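For anyone wanting to check what their own environment has resolved to, here is a minimal sketch using only the standard library. The distribution names in the loop are my assumption of the usual PyPI names for these packages; adjust them if your install uses different ones.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Assumed distribution names; change these to match your environment.
for name in ("SQLAlchemy", "llama-index", "sqlalchemy-bigquery", "snowflake-sqlalchemy"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

This only reports what is installed. It does not tell you whether a given dialect actually works against that SQLAlchemy version.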
Or perhaps we should consider whether we need to use BQ/Snowflake data sources at all. It's certainly possible to migrate to eg Postgres for some smaller datasets, but that won't be suitable for other use cases.
The features I feel are important:

  1. Being able to retrieve constraints from the DB (even if they're not enforced, as in BQ). This ensures the context contains these relationships, which surely is key to the LLM's ability to join sensibly...
  2. Being able to obtain metadata eg column comments/descriptions. Again, including this in the context makes a lot of sense. Straightforward to achieve in BQ at present, not sure about Snowflake.
  3. As we're limited to SQLAlchemy v2, a SQLAlchemy dialect that's mature!
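To make the first two points concrete, here is a rough sketch of turning constraint and comment metadata into text that could go into the LLM context. The dict shapes mirror what SQLAlchemy's `Inspector.get_columns()` and `Inspector.get_foreign_keys()` return, but `describe_table` and the sample `orders` data are hypothetical, and whether a particular BigQuery or Snowflake dialect populates these fields is exactly the open question here.

```python
from typing import Any, Mapping, Sequence


def describe_table(
    table: str,
    columns: Sequence[Mapping[str, Any]],
    foreign_keys: Sequence[Mapping[str, Any]],
) -> str:
    """Render one table's columns, comments, and FK relationships as text."""
    lines = [f"Table {table}:"]
    for col in columns:
        line = f"  - {col['name']} ({col['type']})"
        comment = col.get("comment")  # missing or None if the dialect has no comments
        if comment:
            line += f": {comment}"
        lines.append(line)
    for fk in foreign_keys:
        local = ", ".join(fk["constrained_columns"])
        remote = ", ".join(fk["referred_columns"])
        lines.append(f"  FK ({local}) -> {fk['referred_table']}({remote})")
    return "\n".join(lines)


# Hypothetical metadata for illustration; in practice it might come from
# sqlalchemy.inspect(engine).get_columns(...) and .get_foreign_keys(...).
orders_columns = [
    {"name": "order_id", "type": "INTEGER", "comment": "Primary key"},
    {"name": "customer_id", "type": "INTEGER", "comment": "Buyer placing the order"},
    {"name": "total", "type": "NUMERIC"},
]
orders_fks = [
    {
        "constrained_columns": ["customer_id"],
        "referred_table": "customers",
        "referred_columns": ["id"],
    }
]

context = describe_table("orders", orders_columns, orders_fks)
print(context)
```

Keeping the formatting step separate from the reflection step means the metadata could also be pulled by other means (for example the warehouse's own information schema) if the dialect turns out to be the weak link.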
I do realize that data warehouse systems are not perhaps the ideal data source for this sort of application, but given the simple sorts of SQL statements we're having the LLM build, I can't see what's wrong with trying.
So perhaps BQ and Snowflake aren't suitable right now... That might change. What mileage are you getting? Are you seeing results with an alternative data warehouse, or is your experience limited to relational DBs like Postgres?
Does anyone have any thoughts/advice on this please?