
Agent

Plain Text
return OpenAIAgent.from_tools(
        tools=[query_engine_tool],
        llm=get_default_llm(),
        chat_history=history,
    )

If I define the LLM here in the tool, will it use it for the reasoning, but can I use another LLM for the final output?

I find coding models and GPT-4 do great at tool usage and such, but sometimes I want the final generation done by a different model.
The llm defined here is responsible for figuring out which tools to use, interpreting tool output, and responding to the user.

What llm each tool uses depends on how you defined it. For example, here the query engine tool will use the llm attached to that index.
Hope that makes some sense lol
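For example, here's a rough sketch of mixing two models, using the same legacy (pre-0.10) llama_index API as your snippet (the OpenAI model names are just placeholders): the LLM attached to the index drives the query engine tool, while the llm passed to the agent handles the reasoning and the final reply.

Plain Text
from llama_index import ServiceContext, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms import OpenAI
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.agent import OpenAIAgent

# LLM used inside the tool: attached to the index via its ServiceContext
tool_llm = OpenAI(model="gpt-3.5-turbo")
tool_context = ServiceContext.from_defaults(llm=tool_llm)

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=tool_context)

query_engine_tool = QueryEngineTool(
    query_engine=index.as_query_engine(similarity_top_k=3),
    metadata=ToolMetadata(
        name="repair_jobs",
        description="Provides detailed information about repair jobs",
    ),
)

# LLM used by the agent: tool selection, interpreting tool output, final reply
agent = OpenAIAgent.from_tools(
    tools=[query_engine_tool],
    llm=OpenAI(model="gpt-4"),
)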
Plain Text
 from llama_index.llms import Replicate

llama2_7b_chat = "meta/llama-2-7b-chat:8e6975e5ed6174911a6ff3d60540dfd4844201974602551e10e9e87ab143d81e"
llm = Replicate(
    model=llama2_7b_chat,
    temperature=0.01,
    additional_kwargs={"top_p": 1, "max_new_tokens": 300},
)


# set tokenizer to match LLM
from llama_index import set_global_tokenizer
from transformers import AutoTokenizer

set_global_tokenizer(AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf").encode)

from llama_index.embeddings import HuggingFaceEmbedding
from llama_index import ServiceContext

embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)

from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

repair_job_engine = index.as_query_engine(similarity_top_k=3)

from llama_index.tools import QueryEngineTool
from llama_index.tools import ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=repair_job_engine,
        metadata=ToolMetadata(
            name="query_specified_repair_jobs",
            description="Provides information about repair jobs in detail "
        )
    )
]


from llama_index.agent import ReActAgent

agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    verbose=True,
    # context=context
)

response = agent.chat("What are the main failure modes in the file? ")
print(str(response))


I get:

Plain Text
Thought: I need to use a tool to help me answer the question.
Action: tool
Action Input: {'input': 'hello world', 'num_beams': 5}
KeyError: 'tool'

I think I'm following the guide closely and trying to use the agent to answer some questions about the loaded file. I also specify the tool for the ReActAgent, but it seems the chat function doesn't recognize the tool I specified. Can someone help?
Open-source LLMs are very bad at agentic tasks. Here, you can see it's completely hallucinating the tool name and the input.
Even Llama 2, which is almost the best open-source LLM, isn't capable of making this work?
llama2 isn't actually that great -- but in any case, yea, agent tasks are very hard for open-source LLMs

Look at the performance gap on benchmarks
[Attachment: image.png (benchmark comparison chart)]
The gap between gpt-4 and everything else is kind of wild
Yes, I do see the gap, and I'm fine with the content quality. But correct me if I'm wrong: I think the error here is more of a functional bug, where the LlamaIndex agent fails to find the Tool I defined and passed explicitly to it, as the KeyError shows.
The key error is because the LLM completely hallucinated. There is no tool called "tool", and the input it generated {'input': 'hello world', 'num_beams': 5} does not seem helpful or correct given the input query
Your tool name was query_specified_repair_jobs
A correct react output might look like

Plain Text
Thought: I need to use a tool to help me answer the question.
Action: query_specified_repair_jobs
Action Input: {'input': 'What are the main failure modes for the repair jobs?'}
Yes, it should look like this
Is there anything else I can try? If it doesn't work this way, how did people measure agent performance for AgentBench? This task should be a very common use case for many people who don't want to use OpenAI.
The performance on agent bench is likely as low as it is because of errors just like this.

My suggestion is to use a newer model (Zephyr, Mistral). You can also try changing the ReAct prompts, but tbh it's quite tricky -- a rough sketch of both is below.
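For reference, a minimal sketch of both ideas against the same legacy llama_index API as above; the Zephyr model name, the extra header text, and the ReActChatFormatter / react_chat_formatter imports and argument are assumptions about your installed version, so double-check them:

Plain Text
from llama_index.llms import HuggingFaceLLM
from llama_index.agent import ReActAgent
from llama_index.agent.react.formatter import ReActChatFormatter
from llama_index.agent.react.prompts import REACT_CHAT_SYSTEM_HEADER

# Swap in a newer open-source chat model (Zephyr) in place of llama-2-7b-chat
llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    tokenizer_name="HuggingFaceH4/zephyr-7b-beta",
    context_window=3900,
    max_new_tokens=300,
)

# Nudge the ReAct prompt: keep the default header (it contains the
# {tool_desc}/{tool_names} placeholders) and append a stricter instruction
custom_header = REACT_CHAT_SYSTEM_HEADER + (
    "\nOnly use tool names exactly as listed above; never invent a tool name.\n"
)
formatter = ReActChatFormatter.from_defaults(system_header=custom_header)

agent = ReActAgent.from_tools(
    query_engine_tools,
    llm=llm,
    react_chat_formatter=formatter,
    verbose=True,
)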
I've been doing this for months now and trust me, using open source models for agents is just not ready yet. Pretty unreliable right now.

I'd be surprised if anyone is using open-source models for agents in production. And if they are, they are likely fine-tuned by the company and/or using highly custom code
Alright, thank you @Logan M for the good advice! Will play around with other models.