Can someone explain the difference between text_qa_template and system_prompt?

Can someone explain the difference between setting the text_qa_template in a response synthesizer and the system_prompt in the LLM used by the response synthesizer?

The behavior I'm looking to achieve is to tell my LLM what it is and then set an example prompt and example answer. I am using the CitationQueryEngine because I also want to know what the citations for the query are.
Which LLM are you using? Where do you see system prompt?

The citation query engine uses a custom text_qa_template. When you call CitationQueryEngine.from_defaults(), it sets the text QA template to this: https://github.com/jerryjliu/llama_index/blob/ae3e0bb5ca7811e579da39bbfac8c217dc818cfc/llama_index/query_engine/citation_query_engine.py#L22

You could override that to have different instructions, or add a system prompt there using a chat template
Plain Text
from llama_index.llms.base import ChatMessage, MessageRole
from llama_index.prompts.base import ChatPromptTemplate

text_qa_messages = [
    ChatMessage(role=MessageRole.SYSTEM, content="Some system prompt"),
    ChatMessage(
        content="some template string. A good default is the one I linked above in the code base",
        role=MessageRole.USER,
    ),
]

text_qa_template = ChatPromptTemplate(message_templates=text_qa_messages)
If you look at the documentation here, you can see how to add a system prompt to a HuggingFace LLM, for example: https://gpt-index.readthedocs.io/en/stable/core_modules/model_modules/llms/usage_custom.html#example-using-a-huggingface-llm
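Roughly, that docs pattern looks like the sketch below (import paths are for the 0.8.x-era llama_index used in this thread; the model name, prompt strings, and kwargs are just placeholders):
Plain Text
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts.prompts import SimpleInputPrompt

# system_prompt gets prepended to every prompt the model sees;
# query_wrapper_prompt wraps the query in the model's expected chat format.
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    system_prompt="You are a helpful assistant that answers using the given context.",
    query_wrapper_prompt=SimpleInputPrompt("<|USER|>{query_str}<|ASSISTANT|>"),
    tokenizer_name="StabilityAI/stablelm-tuned-alpha-3b",  # placeholder model
    model_name="StabilityAI/stablelm-tuned-alpha-3b",
    device_map="auto",
)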
I can either do this or use text_qa_messages, but I don't know what the difference is.
For more context, this is sort of what my code looks like:
Plain Text
self.model = OpenAI(
    model=self.modelName,
    temperature=self.temperature,
    max_tokens=self.contextBuffer,
    stream=True,
)  # WHY NOT SET SYSTEM PROMPT HERE?????

self.serviceContext = ServiceContext.from_defaults(
    llm=self.model,
    embed_model=self.embedModel,
)

self.index = VectorStoreIndex.from_vector_store(
    service_context=self.serviceContext,
    vector_store=WeaviateVectorStore(
        weaviate_client=self.client, index_name="index_name"
    ),
)

self.queryEngine = CitationQueryEngine.from_args(
    self.index,
    streaming=self.streaming,
    citation_qa_template=ChatPromptTemplate(
        message_templates=self.chatHistory,
    ),  # VS USE CITATION_QA_TEMPLATE HERE?????
    service_context=self.serviceContext,
    similarity_top_k=self.topK,
    citation_chunk_size=self.citationSize,
)
Yea, that's specific to the HuggingFace LLM 😅 It's a little unreliable to do it this way (it can cause token issues), but it's needed for open-source LLMs


For OpenAI, use the approach above tbh
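i.e. wiring the chat template from above into your engine would look roughly like this (a sketch reusing the self.* attributes from your snippet and the text_qa_template variable from mine):
Plain Text
# Sketch: pass the ChatPromptTemplate (which carries the system message)
# as the citation QA template, so every synthesis call includes it.
self.queryEngine = CitationQueryEngine.from_args(
    self.index,
    streaming=self.streaming,
    citation_qa_template=text_qa_template,  # template built earlier
    service_context=self.serviceContext,
    similarity_top_k=self.topK,
    citation_chunk_size=self.citationSize,
)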
OR, slightly sneakier, I just remembered this got added:

Plain Text
from llama_index import LLMPredictor, ServiceContext

llm_predictor = LLMPredictor(llm=self.model, system_prompt="Talk like a pirate")

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor, embed_model=self.embedModel,
)
What's "better" though? Is there a difference?
Thanks btw!
No difference, the first method of defining the template just allows for more flexibility (you can customize every part of the overall prompt)
If you are just worried about the system prompt, they should be the same 👀
Last question - does the CitationQueryEngine do anything special? Do the other engines also return the source nodes of the query?
The only thing special the citation query engine does is break nodes into smaller, citable chunks and ask the LLM to write in-text citations

All other engines also return source nodes though 👍
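So grabbing them looks the same as with any other engine, e.g. (sketch only; the query string is a placeholder and this assumes streaming=False):
Plain Text
response = self.queryEngine.query("What are the key findings?")  # placeholder query

print(response.response)  # answer text with [1], [2] style in-text citations

# the smaller citation chunks the answer was synthesized from
for source in response.source_nodes:
    print(source.node.get_text()[:200])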