what does the CallbackManager([finetuning_handler]) do

@kapa.ai what does the CallbackManager([finetuning_handler]) do
Please stand by @DangFutures as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
The CallbackManager in LlamaIndex is a tool that allows you to add as many callbacks as needed for debugging, tracking, and tracing the inner workings of the library. It logs data related to events, tracks the duration and number of occurrences of each event, and records a trace map of events.

When you initialize the callback manager as CallbackManager([finetuning_handler]), you're setting it up with a single handler dedicated to fine-tuning: every event that passes through the manager is forwarded to this handler, which records and manages the data needed for the fine-tuning process.

For example, in the context of fine-tuning with OpenAI, the OpenAIFineTuningHandler records all LLM inputs and outputs and provides a function save_finetuning_events() to save inputs and outputs in a format suitable for fine-tuning with OpenAI.

Here is an example of how it's used in the code:

Plain Text
from llama_index.callbacks import CallbackManager, OpenAIFineTuningHandler
from llama_index import ServiceContext
from llama_index.llms import OpenAI

finetuning_handler = OpenAIFineTuningHandler()
callback_manager = CallbackManager([finetuning_handler])

chunk_size = 1024  # example value; set this to whatever your pipeline uses

gpt_4_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-4-0613", temperature=0.3),
    callback_manager=callback_manager,
    chunk_size=chunk_size,
)


In this example, the CallbackManager is initialized with the OpenAIFineTuningHandler and then used in the ServiceContext for the GPT-4 model.
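
After running queries through an engine built with this service context, the handler will have captured every LLM input and output. As a minimal sketch (assuming an index and a list of questions you have built elsewhere), the captured events can then be saved with save_finetuning_events():

Plain Text
query_engine = index.as_query_engine(service_context=gpt_4_context)

# Each query is recorded by finetuning_handler via the callback manager
for question in questions:
    query_engine.query(question)

# Write the recorded LLM inputs/outputs in OpenAI's fine-tuning format
finetuning_handler.save_finetuning_events("finetuning_events.jsonl")
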
Sources:
  • GitHub - jerryjliu/llama_index
  • LlamaIndex Documentation
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
what other handlers can I use @kapa.ai for agents
Let me check the knowledge base for you. One moment! :books:
----------------------
Based on the provided extracts, there isn't specific information about other handlers that can be used for agents in LlamaIndex. The extracts mention the use of CustomSimpleAgentWorker for building custom agents and the use of AgentRunner for orchestrating the agent's tasks, but they do not provide details about other specific handlers that can be used with agents.

Here's an example of how to use a CustomSimpleAgentWorker subclass (the RetryAgentWorker from that guide) together with AgentRunner:

Plain Text
from llama_index.agent import AgentRunner
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-4")
callback_manager = llm.callback_manager

# sql_tool and vector_tools are query-engine tools built earlier in the guide;
# RetryAgentWorker is the custom worker defined there (a sketch follows below)
query_engine_tools = [sql_tool] + vector_tools
agent_worker = RetryAgentWorker.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    callback_manager=callback_manager,
)
agent = AgentRunner(agent_worker, callback_manager=callback_manager)

(Source: Building a Custom Agent)
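
Since RetryAgentWorker itself is not a built-in class, here is a hedged skeleton of what such a CustomSimpleAgentWorker subclass might look like; the import paths and method bodies are assumptions based on the structure of that guide, not its actual retry logic:

Plain Text
from typing import Any, Dict, Optional, Tuple

from llama_index.agent import CustomSimpleAgentWorker, Task
from llama_index.chat_engine.types import AgentChatResponse


class RetryAgentWorker(CustomSimpleAgentWorker):
    """Sketch of a custom agent worker; see the guide for the real retry logic."""

    def _initialize_state(self, task: Task, **kwargs: Any) -> Dict[str, Any]:
        # Per-task state, e.g. how many attempts have been made so far
        return {"count": 0}

    def _run_step(
        self, state: Dict[str, Any], task: Task, input: Optional[str] = None
    ) -> Tuple[AgentChatResponse, bool]:
        # Run one step of the agent; the bool signals whether the task is complete
        state["count"] += 1
        response = AgentChatResponse(response="placeholder answer")
        return response, True

    def _finalize_task(self, state: Dict[str, Any], **kwargs: Any) -> None:
        # Clean up per-task state once the task is complete
        pass
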

For more detailed information about other handlers that can be used with agents in LlamaIndex, I recommend referring to the LlamaIndex Documentation.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai write me a custom function tool that would be good for dataset generation
Let me check the knowledge base for you. One moment! :books:
----------------------
Based on the provided extracts, here's an example of how you can create a custom function tool for dataset generation using LlamaIndex's DatasetGenerator:

Plain Text
from llama_index.evaluation import DatasetGenerator, QueryResponseDataset
from llama_index import ServiceContext
from llama_index.llms import OpenAI

async def generate_custom_dataset(
    nodes,
    num_questions_per_chunk=3,
    num=60,
    filename="data/custom_eval_qr_dataset.json",
):
    # Initialize the service context with the desired LLM
    eval_service_context = ServiceContext.from_defaults(
        llm=OpenAI(model="gpt-4-1106-preview")
    )

    # Initialize the dataset generator with the nodes and service context
    dataset_generator = DatasetGenerator(
        nodes,
        service_context=eval_service_context,
        show_progress=True,
        num_questions_per_chunk=num_questions_per_chunk,
    )

    # Generate the dataset
    eval_dataset = await dataset_generator.agenerate_dataset_from_nodes(num=num)

    # Save the dataset to a JSON file
    eval_dataset.save_json(filename)

    return eval_dataset


This function, generate_custom_dataset, takes as input a list of nodes, the number of questions per chunk, the total number of questions, and a filename. It initializes a ServiceContext with the OpenAI model "gpt-4-1106-preview", creates a DatasetGenerator with the provided nodes and service context, generates a dataset, and saves it to a JSON file.

Please note that this function is defined with async def because it awaits agenerate_dataset_from_nodes(), so it must be called from an asynchronous context (see the usage sketch below).
This function is based on the code provided in the Benchmarking OpenAI Retrieval API (through Assistant Agent) example from the LlamaIndex documentation.
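
As a usage sketch (assuming nodes is a list of nodes you have already parsed), the coroutine can be driven from a plain script with asyncio.run(); inside a Jupyter notebook you can simply await it:

Plain Text
import asyncio

# From a regular script
eval_dataset = asyncio.run(generate_custom_dataset(nodes))

# In a Jupyter notebook, use instead:
#   eval_dataset = await generate_custom_dataset(nodes)
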
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------