If I am creating a RAG application, is there a framework or good guidance on turning a prompt into a better query for embeddings?

At a glance

The community member is creating a RAG (Retrieval-Augmented Generation) application and is seeking guidance on how to turn a prompt into a better query for searching over unstructured documents in a vector database. The questions are reasoning-based, and the documents are of different types.

In the comments, one community member suggests using an LLM (Large Language Model) to rewrite the query and provides an example code snippet. Another suggests a ReAct agent and describes a process of generating questions from a data model, having an LLM answer those questions, and hydrating the model with the data.

There is no explicitly marked answer in the provided information.

If I am creating a RAG application, is there either a framework or good guidance on taking a prompt we've got now and turning it into a better query (for embeddings) to search over the documents that are in the vector DB? The questions are reasoning based, and the documents are very unstructured and of all different types. Reasoning-based example: "What is an unmet need that is trying to be solved?" I had read about decomposable queries but I'm not sure if that would work for this application.
2 comments
I think any step to get the LLM to rewrite the query makes sense. It's as simple as prompting the LLM with some additional context and asking it to rewrite the query.

Then you can run retrieval with the rewritten query and respond using the original.

Something like:

Plain Text
from llama_index.core import get_response_synthesizer

# Assumes an existing index (e.g. a VectorStoreIndex) and an llm.
retriever = index.as_retriever(...)
synthesizer = get_response_synthesizer(llm=llm)

# Ask the LLM to rewrite the user's query for semantic search.
rewritten = str(llm.complete(
    f"Hey, I have data about X, rewrite the following query "
    f"to make it more useful for semantic search: {user_query}"
))

# Retrieve with the rewritten query, but answer the original one.
nodes = retriever.retrieve(rewritten)
response = synthesizer.synthesize(user_query, nodes=nodes)
The decomposable query approach seems reasonable. How about using a ReAct agent?
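As a rough sketch of that idea with llama_index's ReActAgent, assuming the same `index` and `llm` as in the snippet above (the tool name and description are placeholders; describe your own corpus):

Plain Text
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool, ToolMetadata

# Wrap the existing index as a tool the agent can call. The name and
# description are placeholders for your own document collection.
query_tool = QueryEngineTool(
    query_engine=index.as_query_engine(llm=llm),
    metadata=ToolMetadata(
        name="docs",
        description="Searches the unstructured document collection.",
    ),
)

# The ReAct loop lets the LLM reason about the question, break it into
# one or more search calls, and combine the retrieved evidence.
agent = ReActAgent.from_tools([query_tool], llm=llm, verbose=True)
response = agent.chat("What is an unmet need that is trying to be solved?")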

I've done something similar manually. I defined an arbitrary data model using SQLModel, then walked through the model to have the LLM auto-populate instances using a vector store index. It generates JSON schema-compliant results stored in Postgres, essentially acting as an LLM-based database generator. You provide the data model and input data, and it generates an SQL database. Pretty cool.
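For illustration, a minimal sketch of the kind of data model this describes, using SQLModel (the field names and descriptions here are invented; the real model would come from your domain):

Plain Text
from typing import Optional
from sqlmodel import Field, SQLModel

# Illustrative model only; the per-field descriptions are what the
# question generator reads to craft its questions.
class Business(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    name: str = Field(description="Legal name of the business.")
    unmet_need: str = Field(
        description="The unmet customer need the business is trying to solve."
    )
    target_market: str = Field(
        description="The customer segment the business primarily serves."
    )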

Since I don't know the fields at runtime, I inspect the model and use field descriptions to ask the LLM to generate questions based on the schema. I then have another LLM answer those questions with my context.

Here's the base prompt I'm using:

Plain Text
base_prompt = (
    "Your ROLE is a senior research analyst on a team providing detailed information about businesses. "
    "You will be given the business's name, a specific field name, and a JSON schema containing a description of that field "
    "and all relevant sub-fields for comprehensive understanding.\n"
    "Your TASK is to generate a thorough question that fully elucidates the field. Please adhere to the following guidelines: \n"
    " - Include all subfields in a single, comprehensive question.\n"
    " - Exclude unique identifier fields like 'id'.\n"
    " - Use examples in the schema to craft precise and informative questions.\n"
    " - Do not reference database terminology.\n"
    "Emit only the question without any conversational elements or prefaces.\n"
)
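A sketch of how the schema walk might drive that prompt, assuming a pydantic-v2-era SQLModel so fields are exposed via model_fields (the Business model and `llm` are from the sketches above, and the business name is a placeholder):

Plain Text
import json

# Walk the model's fields and ask the LLM for one comprehensive
# question per field, skipping unique identifiers per the guidelines.
questions = {}
for field_name, field_info in Business.model_fields.items():
    if field_name == "id":
        continue
    schema = json.dumps(
        {"field": field_name, "description": field_info.description}
    )
    prompt = (
        base_prompt
        + f"\nBusiness name: Acme Corp\n"
        + f"Field name: {field_name}\nJSON schema: {schema}"
    )
    questions[field_name] = str(llm.complete(prompt))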


I also have a QA agent check if the generated answer satisfies the question. If it does, the model is hydrated with the data. If not, the failed question/answer pair is stored, and the question generator tries again. There's even a backup strategy to reduce strictness and increase recall on failed attempts.
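One way that check-and-retry loop could look (a sketch, not the poster's actual code; the judge prompt and the YES/NO convention are assumptions):

Plain Text
# Answer each generated question from the index, have a judge LLM
# verify the answer, and set failures aside for a looser retry pass.
query_engine = index.as_query_engine(llm=llm)
hydrated, failed = {}, []

for field_name, question in questions.items():
    answer = str(query_engine.query(question))
    verdict = str(llm.complete(
        f"Does this answer fully satisfy the question? Reply YES or NO.\n"
        f"Question: {question}\nAnswer: {answer}"
    ))
    if "YES" in verdict.upper():
        hydrated[field_name] = answer      # hydrate the model field
    else:
        failed.append((question, answer))  # retry with reduced strictness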