Seeking Advice on Optimizing Index Creation Time for Ingestion Pipeline
I'm developing an ingestion pipeline using llama_index to process Paul Graham's essays. Data loading and node creation complete without issue, but the indexing phase with VectorStoreIndex is extremely slow: it has been running for over two hours and still hasn't finished.
Setup Overview:
- Data Loading:
from llama_index.core import SimpleDirectoryReader
reader = SimpleDirectoryReader(input_files=["data/paul_graham_essays.txt"])
docs = reader.load_data()
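Loading itself is not the slow part; as a quick sanity check on input size I print the document count and text length (pure diagnostics, not part of the pipeline):
# Quick check: a single input file should come back as one Document
print(len(docs), "docs;", len(docs[0].text), "chars in the first")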
- Embedding Configuration:
- LLM: llama3:instruct from Ollama.
- Embedding: BAAI/bge-small-en-v1.5 with batch size 50.
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
Settings.llm = Ollama(model="llama3:instruct", request_timeout=90.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5", embed_batch_size=50)
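To check whether the embedding model alone is the bottleneck, I'm thinking of timing one batch in isolation; a minimal sketch (the dummy texts are arbitrary placeholders):
import time
texts = ["a short test sentence"] * 50  # one batch at my configured batch size
start = time.perf_counter()
Settings.embed_model.get_text_embedding_batch(texts)
print(f"one batch of 50: {time.perf_counter() - start:.2f}s")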
- Ingestion Pipeline:
- Features: Sentence splitting and title extraction.
import nest_asyncio
nest_asyncio.apply()
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.extractors import TitleExtractor
# 25-token chunks produce a very large number of nodes for a full essay collection
pipeline = IngestionPipeline(transformations=[SentenceSplitter(chunk_size=25), TitleExtractor()])
nodes = pipeline.run(documents=docs)
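For visibility into where the time goes, I plan to re-run the pipeline with a progress bar and parallel workers (both are stock IngestionPipeline.run options, though I haven't confirmed num_workers helps when the extractor is calling Ollama):
nodes = pipeline.run(documents=docs, show_progress=True, num_workers=4)
print(len(nodes), "nodes produced")  # with 25-token chunks this count gets very large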
- Index Creation:
from llama_index.core import VectorStoreIndex
index = VectorStoreIndex(nodes)
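A variant I intend to try so the build at least reports progress while embedding (show_progress is a standard kwarg on the index constructor):
# Same build, but with a progress bar over the embedding batches
index = VectorStoreIndex(nodes, show_progress=True)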
Does anyone have insights into speeding up this indexing step? Any advice or experience with similar setups would be greatly appreciated.