NL2SQL for multiple queries

pprathamesh4585

Hello guys,
I am working on a project where user will ask natural language queries and this llama-index based engine will convert that natural language to sql query and execute it on my database and give answer in natural language to the user. Problem is it is only able to execute one query per question so comparison quetions are not possible to answer and also if a question does not require querying the database it will still query the database. How can I solve this. Please help me with your suggesting.
Thanks in advance.

15 comments

pprathamesh4585

@kapa.ai please answer if you can

kkapa.ai

Let me check the knowledge base for you. One moment! :books:

kkapa.ai

----------------------
I'm sorry, but your question is too vague. Could you please provide more details or context? I'm here to provide information about LlamaIndex.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

pprathamesh4585

@kapa.ai I want to execute multiple queries with natural language to sql query engine if user question requires multiple queries to be answerted

kkapa.ai

Let me check the knowledge base for you. One moment! :books:

kkapa.ai

----------------------
LlamaIndex provides a way to execute multiple sub-queries using the query engine. This is especially useful for compare/contrast queries across documents as well as queries pertaining to a specific document.

For example, if you have a SQL database and you want to use natural language queries that are synthesized into SQL queries, you can use the NLSQLTableQueryEngine. You need to specify the tables you want to use with this query engine. If you don't, the query engine will pull all the schema context, which could overflow the context window of the LLM.

Here is an example of how you can use it:

Plain Text

from llama_index.indices.struct_store import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["city_stats"],
)
query_str = "Which city has the highest population?"
response = query_engine.query(query_str)

This query engine should be used in any case where you can specify the tables you want to query over beforehand, or the total size of all the table schema plus the rest of the prompt fits your context window.

For more complex queries, you can use the SQLJoinQueryEngine which can handle natural language queries that require joining multiple tables in a SQL database.

For more information, you can refer to the LlamaIndex documentation () and examples ().
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

pprathamesh4585

@kapa.ai but this will still now allowing me execute multiple queries. Please tell me how

kkapa.ai

Let me check the knowledge base for you. One moment! :books:

kkapa.ai

----------------------
To execute multiple queries, you can use the SubQuestionQueryEngine class in LlamaIndex. This class allows you to generate a "query plan" containing sub-queries against sub-documents before synthesizing the final answer.

Here is an example of how you can use it:

First, define an index for each document/data source, and wrap it with a QueryEngineTool:

Plain Text

from llama_index.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=sept_engine,
        metadata=ToolMetadata(
            name="sept_22",
            description="Provides information about Uber quarterly financials ending September 2022",
        ),
    ),
    QueryEngineTool(
        query_engine=june_engine,
        metadata=ToolMetadata(
            name="june_22",
            description="Provides information about Uber quarterly financials ending June 2022",
        ),
    ),
    QueryEngineTool(
        query_engine=march_engine,
        metadata=ToolMetadata(
            name="march_22",
            description="Provides information about Uber quarterly financials ending March 2022",
        ),
    ),
]

Then, define a SubQuestionQueryEngine over these tools:

Plain Text

from llama_index.query_engine import SubQuestionQueryEngine

query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)

This will allow you to execute multiple queries against different data sources. For more information, you can refer to the LlamaIndex documentation ().
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

pprathamesh4585

@kapa.ai but not just queries. I want to execute multiple sql queries

kkapa.ai

Let me check the knowledge base for you. One moment! :books:

kkapa.ai

----------------------
To execute multiple SQL queries using natural language with LlamaIndex, you can use the SQLTableRetrieverQueryEngine. This engine retrieves the relevant table schema and synthesizes a SQL query and a response from the results of that query.

Here is an example of how you can use it:

Plain Text

from llama_index.indices.struct_store import SQLTableRetrieverQueryEngine

query_engine = SQLTableRetrieverQueryEngine(
    sql_database, obj_index.as_retriever(similarity_top_k=1)
)
response = query_engine.query("Which city has the highest population?")
print(response)

However, the provided extracts do not contain information on how to execute multiple SQL queries at once. You might need to execute each query separately and then combine the results as needed.

For more information, you can refer to the LlamaIndex documentation ().
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------

pprathamesh4585

hello @Logan M please help me here If you can. Ignore whatever kapa.ai said. Just look at my question please. Your help is much appreciated.

LLogan M

You can give the text2sql query engine to a sub-question query engine or an agent

pprathamesh4585

@Logan M Thank you very much for help. I have one more question. sorry for bothering you.

While using SubQuestionQueryEngine with sql_engine. I want to give context on how to join tables and what user means by general words like - performance. I also want to pass what enums mean sql tables I am using.
Before using SubQuestionQueryEngine I was doing this with another instance of OpenAI class which would generate context based on user question.
But with SubQuestionQueryEngine even though I pass context in prompt its forgetting it and now making relevant sub-questions. how can I takle this?

Thanks a lot for your help really appriciate it.

Add a reply

Find answers from the community

NL2SQL for multiple queries