The features I feel are important:
- Being able to retrieve constraints from the DB (even if they're not enforced, as per BQ). This ensures the context contains these relationships which surely is key to the LLM's ability to join sensibly...
- Being able to obtain metadata eg column comments/descriptions. Again including this in context makes a lot of sense. Straightforward to achieve in BQ presently, not sure about Snowflake.
- As we're limited to Sqlalchemy v2, an sqlalchemy dialect that's mature!
I do realize that data warehouse systems are not perhaps the ideal data source for this sort of application, but given the simple sorts of SQL statements we're having the LLM build, I can't see what's wrong with trying.