`CallbackManager` in LlamaIndex is a tool that allows you to add as many callbacks as needed for debugging, tracking, and tracing the inner workings of the library. It logs data related to events, tracks the duration and number of occurrences of each event, and records a trace map of events.

When you initialize a `CallbackManager` with a fine-tuning handler, as in `CallbackManager([finetuning_handler])`, you are setting up a callback manager with a specific handler for fine-tuning. This handler records and manages events related to the fine-tuning process. `OpenAIFineTuningHandler` records all LLM inputs and outputs and provides a function, `save_finetuning_events()`, that saves them in a format suitable for fine-tuning with OpenAI.

```python
from llama_index.callbacks import CallbackManager, OpenAIFineTuningHandler
from llama_index import ServiceContext
from llama_index.llms import OpenAI

finetuning_handler = OpenAIFineTuningHandler()
callback_manager = CallbackManager([finetuning_handler])

chunk_size = 1024  # assumed value; set this to match your indexing pipeline

gpt_4_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-4-0613", temperature=0.3),
    callback_manager=callback_manager,
    chunk_size=chunk_size,
)
```
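After running queries through an engine built with this `gpt_4_context`, the handler's recorded events can be written out for fine-tuning. A minimal sketch, where the query engine and the question are illustrative placeholders:

```python
# query_engine is assumed to be built from an index using gpt_4_context
response = query_engine.query("What did the author do growing up?")

# Dump every recorded LLM input/output pair in OpenAI's fine-tuning format
finetuning_handler.save_finetuning_events("finetuning_events.jsonl")
```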
Here the `CallbackManager` is initialized with the `OpenAIFineTuningHandler` and then used in the `ServiceContext` for the GPT-4 model.

The docs also cover `CustomSimpleAgentWorker` for building custom agents and `AgentRunner` for orchestrating the agent's tasks, but they do not provide details about other specific handlers that can be used with agents. Here is an example that wires a `CustomSimpleAgentWorker` subclass into an `AgentRunner`:

```python
from llama_index.agent import AgentRunner
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-4")
callback_manager = llm.callback_manager

# sql_tool and vector_tools are assumed to be defined earlier;
# RetryAgentWorker is a CustomSimpleAgentWorker subclass defined elsewhere
query_engine_tools = [sql_tool] + vector_tools

agent_worker = RetryAgentWorker.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    callback_manager=callback_manager,
)
agent = AgentRunner(agent_worker, callback_manager=callback_manager)
```
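Once constructed, the runner is used like any other agent; a minimal sketch with an illustrative question:

```python
# Events from this call flow through the shared callback_manager
response = agent.chat("Which city has the highest population?")
print(str(response))
```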
To build an evaluation dataset, you can use `DatasetGenerator`:

```python
from llama_index.evaluation import DatasetGenerator, QueryResponseDataset
from llama_index import ServiceContext
from llama_index.llms import OpenAI


async def generate_custom_dataset(
    nodes,
    num_questions_per_chunk=3,
    num=60,
    filename="data/custom_eval_qr_dataset.json",
):
    # Initialize the service context with the desired LLM
    eval_service_context = ServiceContext.from_defaults(
        llm=OpenAI(model="gpt-4-1106-preview")
    )

    # Initialize the dataset generator with the nodes and service context
    dataset_generator = DatasetGenerator(
        nodes,
        service_context=eval_service_context,
        show_progress=True,
        num_questions_per_chunk=num_questions_per_chunk,
    )

    # Generate the dataset (async call, hence the await)
    eval_dataset = await dataset_generator.agenerate_dataset_from_nodes(num=num)

    # Save the dataset to a JSON file
    eval_dataset.save_json(filename)
    return eval_dataset
```
This function, `generate_custom_dataset`, takes as input a list of nodes, the number of questions per chunk, the total number of questions, and a filename. It initializes a `ServiceContext` with the OpenAI model "gpt-4-1106-preview", creates a `DatasetGenerator` with the provided nodes and service context, generates a dataset, and saves it to a JSON file. Because it uses the `await` keyword, it must be declared `async` and called from an asynchronous context.
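For reference, a minimal sketch of calling it from a plain script, assuming `nodes` has already been built from your documents:

```python
import asyncio

# nodes is assumed to be a list of nodes parsed from your documents
eval_dataset = asyncio.run(
    generate_custom_dataset(nodes, num_questions_per_chunk=3, num=60)
)
```

In a notebook, where an event loop is already running, you would instead `await generate_custom_dataset(...)` directly (applying `nest_asyncio` if needed).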