Yes, I believe so. I'm also working on something like this.
Right now, I believe it will require agents, pipelines, fine-tuning, and/or breaking the mold a bit. My plan is to,,,
- Use one LLM / agent setup decide if that is what the user wants (or would be helpful in the reply)
- A RAG like system to decide what dataset / type of code agent is needed
- Another (code specialist) LLM to write the SQL or Python
- A number of LLMs along the way to determine if the answer / query was correct, or to try again