How can I change the standard system and other prompts of the condense_plus_question chat mode?

Please stand by @fuckAllTechCompanies as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To change the standard system and other prompts of the condense_plus_context chat mode, you can use the from_defaults method of the CondensePlusContextChatEngine class. This method allows you to specify custom prompts when initializing the chat engine.

Here is an example of how you can do it:

Plain Text
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.chat_engine import CondensePlusContextChatEngine
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.llms.openai import OpenAI

# Build (or load) an index over your documents and use its retriever
data = SimpleDirectoryReader(input_dir="./data").load_data()
index = VectorStoreIndex.from_documents(data)

# Initialize your custom prompts
context_prompt = "Your custom context prompt"
condense_prompt = "Your custom condense prompt"

# Initialize the chat engine with custom prompts
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=index.as_retriever(),
    llm=OpenAI(model="gpt-4"),
    memory=ChatMemoryBuffer.from_defaults(token_limit=3000),
    system_prompt="Your custom system prompt",
    context_prompt=context_prompt,
    condense_prompt=condense_prompt,
    verbose=True,
)


Please note that the data directory, prompts, LLM, and memory settings in the example are placeholders; replace them with your own index or retriever, model, and prompt text.
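As a quick usage sketch (assuming the engine above was created successfully), you can then chat with it directly:

Plain Text
# Ask a question; the engine condenses the conversation, retrieves context, and answers
response = chat_engine.chat("What does the indexed data say about X?")
print(response)

# Reset the conversation history when you want to start over
chat_engine.reset()
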

For more details, you can refer to the CondensePlusContextChatEngine documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai Do I have to take care of the special tokens like EOT in the prompts?
Let me check the knowledge base for you. One moment! :books:
----------------------
No, you don't have to manually add special tokens like EOT (End Of Text) in the prompts. The LlamaIndex library handles the addition of these special tokens internally.

When you provide your custom prompts, you should focus on the content that you want the model to consider. The library will take care of formatting the prompts correctly for the underlying language model.
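For example, a custom condense prompt is just plain text plus template variables; no model-specific tokens are required (a minimal sketch, with variable names following the default condense prompt):

Plain Text
condense_prompt = (
    "Given the following conversation and a follow up message, "
    "rephrase the message as a standalone question.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Message: {question}\n"
    "Standalone question: "
)
# No <s>, <|eot_id|> or similar tokens here; the LLM integration applies
# the model's own chat formatting when the prompt is sent.
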

For more details, you can refer to the CondensePlusContextChatEngine documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai Which chat mode should I use if I want to create a chatbot for searching information in a bunch of PDFs?
Let me check the knowledge base for you. One moment! :books:
----------------------
If you want to create a chatbot for searching information in a bunch of PDFs, you can use the best chat mode. This mode uses the openai mode if you are using an OpenAI model that supports the latest function calling API; otherwise, it uses the react mode.

Here is an example of how you can set up the chat engine with the best mode:

Plain Text
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4")
data = SimpleDirectoryReader(input_dir="./data/your_pdf_directory/").load_data()
index = VectorStoreIndex.from_documents(data)

chat_engine = index.as_chat_engine(chat_mode="best", llm=llm, verbose=True)


Please replace "./data/your_pdf_directory/" with the path to your directory containing the PDF data.
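Once the engine is built, you can query the PDFs conversationally, for example with the streaming API (a short sketch, continuing from the code above):

Plain Text
streaming_response = chat_engine.stream_chat("Summarize the main topics covered in the PDFs.")
for token in streaming_response.response_gen:
    print(token, end="")
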

However, as mentioned in the LlamaIndex documentation, the standard RAG stack (top-k retrieval + basic text splitting) might not perform well as the number of documents scales up. In such cases, you may want to use structured information to help with more precise retrieval.

For more details, you can refer to the Chat Engine - Best Mode documentation and the Building Performant RAG Applications for Production documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai I use a Llama model; should I still use best and why?
Let me check the knowledge base for you. One moment! :books:
----------------------
Yes, you can still use the best chat mode even if you're using a Llama model. The best chat mode in LlamaIndex selects the most suitable chat engine based on the current LLM (Large Language Model).

According to the LlamaIndex documentation, the best chat mode corresponds to OpenAIAgent if you're using an OpenAI model that supports the function calling API; otherwise, it corresponds to ReActAgent.

Since a Llama model is not an OpenAI model with the function calling API, the best chat mode will fall back to the ReActAgent, which works with any LLM. Using best is still convenient because it always picks the most suitable chat engine for your model without you having to choose one manually.
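For example, with a locally served Llama model (here via Ollama, as one possible setup, and reusing the index from the previous example), the call looks the same and the fallback happens automatically:

Plain Text
from llama_index.llms.ollama import Ollama

llm = Ollama(model="llama3", request_timeout=120.0)

# With a non-OpenAI LLM, chat_mode="best" resolves to the ReAct agent under the hood
chat_engine = index.as_chat_engine(chat_mode="best", llm=llm, verbose=True)
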
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai What is a React agent?
Let me check the knowledge base for you. One moment! :books:
----------------------
A ReAct agent is a framework that integrates the reasoning capabilities of Large Language Models (LLMs) with the ability to take actionable steps. LlamaIndex provides an implementation of it as a data agent, designed to create a more sophisticated system that can understand and process information, evaluate situations, take appropriate actions, communicate responses, and track ongoing situations.

ReAct is a technique that enables LLMs to reason and perform task-specific actions. It combines chain-of-thought reasoning with action planning, allowing LLMs to create reasoning tracks and task-specific actions, and strengthening the synergy between them using memory.

Key components of a ReAct agent include:

  • Reasoning Loop: This allows data agents to select and interact with tools in response to an input task.
  • Memory: LLMs, with access to memory, can store and retrieve data, which is ideal for apps tracking state or accessing multiple sources. Memory retains past interactions, enabling seamless reference to earlier conversation points. This integration involves allocating memory slots for relevant information and leveraging retrieval mechanisms during conversations. By recalling stored data, LLMs enhance contextual responses and integrate external sources, enriching user experiences.
For more details, you can refer to the ReAct Agent documentation and the ReAct Agent API reference.
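As a rough sketch of what the best mode builds for you in the non-OpenAI case (assuming the index and llm from the earlier examples), you can also construct a ReAct agent explicitly and give it your index as a tool:

Plain Text
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

pdf_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(llm=llm),
    name="pdf_search",
    description="Searches the indexed PDF documents.",
)

agent = ReActAgent.from_tools([pdf_tool], llm=llm, verbose=True)
response = agent.chat("What do the documents say about X?")
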
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai How would I change the prompts if I use chat_engine = index.as_chat_engine()?
Let me check the knowledge base for you. One moment! :books:
----------------------
If you're using index.as_chat_engine(), you can change the prompts by passing them as arguments to the function. Here's an example:

Plain Text
from llama_index.core import PromptTemplate

# Define your custom prompts
custom_prompt = PromptTemplate(
    """\
Given a conversation (between Human and Assistant) and a follow up message from Human, \
rewrite the message to be a standalone question that captures all relevant context \
from the conversation.

<Chat History>
{chat_history}

<Follow Up Message>
{question}

<Standalone question>
"""
)

# Use the custom prompt when creating the chat engine
chat_engine = index.as_chat_engine(chat_mode="condense_question", condense_question_prompt=custom_prompt)


In this example, custom_prompt is a PromptTemplate object that defines the new prompt. You can replace the string inside PromptTemplate with your own custom prompt.

Please note that the exact keyword arguments you need to pass to as_chat_engine() depend on the chat mode and underlying chat engine you're using, since they are forwarded to that engine's constructor. For more details, you can refer to the LlamaIndex documentation.
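For instance, the prompts from the beginning of this thread can also be passed through as_chat_engine() by selecting the condense_plus_context mode; the keyword arguments are forwarded to that engine's from_defaults() (a sketch assuming an existing index and llm):

Plain Text
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    llm=llm,
    system_prompt="Your custom system prompt",
    context_prompt="Your custom context prompt",
    condense_prompt="Your custom condense prompt",
    verbose=True,
)
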
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai Why do I have to use PromptTemplate in this case, while for the other chat engine it's just from_defaults()?
Let me check the knowledge base for you. One moment! :books:
----------------------
Both approaches let you customize the prompts; the difference is the level at which you pass them. index.as_chat_engine() is a higher-level convenience method: you pick a chat_mode and it forwards your keyword arguments (such as condense_question_prompt) to the from_defaults() of the chat engine it builds for you. Wrapping the prompt string in a PromptTemplate makes its template variables, such as {chat_history} and {question}, explicit so they can be filled in at query time.

from_defaults() is the lower-level class method on a specific chat engine. Calling it yourself gives you more control, but you also have to supply the other components (query engine or retriever, LLM, memory, and so on), and depending on the engine the prompts are accepted either as plain strings or as prompt templates.

For more details, you can refer to the LlamaIndex documentation and the PromptTemplate documentation.
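To make the connection concrete, the same PromptTemplate from the earlier example can be handed to a chat engine's from_defaults() directly (a sketch assuming the index and custom_prompt from above):

Plain Text
from llama_index.core.chat_engine import CondenseQuestionChatEngine

chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=index.as_query_engine(),
    condense_question_prompt=custom_prompt,
    verbose=True,
)
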
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai How do I set a custom system message?
Let me check the knowledge base for you. One moment! :books:
----------------------
To set a custom system message, you can use the get_system_message() method in the Neo4jQueryToolSpec class if you're using the Neo4j tool, or you can define a ChatMessage with the role set to MessageRole.SYSTEM if you're creating a chat prompt template.

Here's an example of setting a custom system message with the Neo4jQueryToolSpec class:

Plain Text
from llama_index.tools.neo4j import Neo4jQueryToolSpec
from llama_index.llms.openai import OpenAI

class CustomNeo4jQueryToolSpec(Neo4jQueryToolSpec):
    def get_system_message(self):
        return "Your custom system message"

# The subclass is constructed the same way as Neo4jQueryToolSpec,
# i.e. with your Neo4j connection details and an LLM
tool_spec = CustomNeo4jQueryToolSpec(
    url="bolt://localhost:7687", user="neo4j", password="your_password",
    database="neo4j", llm=OpenAI(model="gpt-4"),
)
system_message = tool_spec.get_system_message()


In this example, CustomNeo4jQueryToolSpec is a subclass of Neo4jQueryToolSpec that overrides the get_system_message() method to return a custom system message; the connection details and LLM in the constructor are placeholders for your own.

And here's an example of setting a custom system message with a ChatMessage:

Plain Text
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core import ChatPromptTemplate

chat_messages = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content="Your custom system message",
    ),
    ChatMessage(role=MessageRole.USER, content="Your user message"),
]

chat_prompt_template = ChatPromptTemplate(chat_messages)


In this example, the first ChatMessage has the role set to MessageRole.SYSTEM, which means it will be used as the system message.
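For chat engines specifically, the simplest route is usually to pass the system message when building the engine, as discussed earlier in this thread (a sketch assuming an existing index):

Plain Text
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    system_prompt="Your custom system message",
)
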

For more details, you can refer to the Neo4jQueryToolSpec documentation and the ChatPromptTemplate documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------