Find answers from the community

I am using CTransformers and trying to run it in a LlamaIndex query pipeline (response = qp.run(query_str="What is the correlation between survival and age?")), but I am getting the following error:

AttributeError: 'CTransformers' object has no attribute 'set_callback_manager'

Can someone help me understand how to get a query pipeline working with my CTransformers local model?

import os

from langchain_community.llms import CTransformers
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from llama_index.core import Settings
from llama_index.llms.langchain import LangChainLLM

# Set TRANSFORMERS_OFFLINE to 1 so nothing is fetched from the Hugging Face Hub
os.environ["TRANSFORMERS_OFFLINE"] = "1"

LLM_MODEL_NAME = ".cache/models/llama-2-7b-chat.Q5_K_M.gguf"
callbacks = [StreamingStdOutCallbackHandler()]

config = {"temperature": 0.0, "context_length": 4096, "stream": True}
llm = CTransformers(
    model=LLM_MODEL_NAME,
    model_type="llama",  # architecture name; "gguf" is the file format, not a model type
    callbacks=callbacks,
    config=config,
)

Settings.llm = LangChainLLM(llm=llm)
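
Update, in case it helps: the query pipeline appears to call set_callback_manager on each module, and that method only exists on LlamaIndex LLM objects, so the raw CTransformers instance cannot go into the pipeline directly. Below is a minimal sketch, assuming the LangChainLLM wrapper is the object handed to QueryPipeline; the prompt template text is just a placeholder.

from llama_index.core import PromptTemplate
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.llms.langchain import LangChainLLM

# Wrap the LangChain CTransformers model so it exposes the LlamaIndex LLM
# interface (including set_callback_manager) that the pipeline expects.
wrapped_llm = LangChainLLM(llm=llm)

prompt_tmpl = PromptTemplate("Answer the question: {query_str}")  # placeholder prompt
qp = QueryPipeline(chain=[prompt_tmpl, wrapped_llm], verbose=True)
response = qp.run(query_str="What is the correlation between survival and age?")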
14 comments
ReAct Agent Evaluation

I am following the guides below. I first set up the top_agent and then try to build a rag_dataset with it. The problem is that it gets to a certain point and then gives me the error message below. Any ideas on how to resolve it?

https://docs.llamaindex.ai/en/stable/examples/agent/multi_document_agents/
https://docs.llamaindex.ai/en/stable/examples/evaluation/answer_and_context_relevancy/


---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[17], line 1
----> 1 prediction_dataset = await qas.amake_predictions_with(
2 predictor=top_agent, batch_size=100, show_progress=True
3 )

File ~/anaconda3/envs/nat_gpu/lib/python3.11/site-packages/llama_index/core/agent/react/step.py:583, in ReActAgentWorker._arun_step(self, step, task)
579 reasoning_steps, is_done = await self._aprocess_actions(
580 task, tools, output=chat_response
581 )
582 task.extra_state["current_reasoning"].extend(reasoning_steps)
--> 583 agent_response = self._get_response(
584 task.extra_state["current_reasoning"], task.extra_state["sources"]
585 )
586 if is_done:
587 task.extra_state["new_memory"].put(
588 ChatMessage(content=agent_response.response, role=MessageRole.ASSISTANT)
589 )

File ~/anaconda3/envs/nat_gpu/lib/python3.11/site-packages/llama_index/core/agent/react/step.py:413, in ReActAgentWorker._get_response(self, current_reasoning, sources)
411 raise ValueError("No reasoning steps were taken.")
412 elif len(current_reasoning) == self._max_iterations:
--> 413 raise ValueError("Reached max iterations.")
415 if isinstance(current_reasoning[-1], ResponseReasoningStep):
416 response_step = cast(ResponseReasoningStep, current_reasoning[-1])

ValueError: Reached max iterations.
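
Update, for anyone hitting the same thing: the traceback is the agent exhausting its reasoning-step budget, not a bug in the dataset code. A hedged workaround is to raise max_iterations when constructing the ReAct agent (the default is 10); lowering batch_size so fewer queries run at once may also help. The all_tools and llm names below stand in for whatever the multi-document-agents guide produced.

from llama_index.core.agent import ReActAgent

# all_tools and llm are assumed to come from the multi-document-agents guide
top_agent = ReActAgent.from_tools(
    tools=all_tools,
    llm=llm,
    max_iterations=20,  # default is 10; raise it before retrying the predictions
    verbose=True,
)

prediction_dataset = await qas.amake_predictions_with(
    predictor=top_agent, batch_size=20, show_progress=True
)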
7 comments
Prompting with Mixtral: I am following the docs for evaluation and see that the default template is the one below. Should I add [INST] to this prompt? Or am I supposed to modify the prompt when I set the LLM? I am using vLLM (see below). Any help or ideas are appreciated. Thank you.

DEFAULT_EVAL_TEMPLATE = PromptTemplate(
    "Your task is to evaluate if the response is relevant to the query.\n"
    "The evaluation should be performed in a step-by-step manner by answering the following questions:\n"
    "1. Does the provided response match the subject matter of the user's query?\n"
    "2. Does the provided response attempt to address the focus or perspective "
    "on the subject matter taken on by the user's query?\n"
    "Each question above is worth 1 point. Provide detailed feedback on response according to the criteria questions above "
    "After your feedback provide a final result by strictly following this format: '[RESULT] followed by the integer number representing the total score assigned to the response'\n\n"
    "Query: \n {query}\n"
    "Response: \n {response}\n"
    "Feedback:"
)

How I set up Mixtral:

from llama_index.llms.vllm import Vllm

local_llm = Vllm(
    model="models/Mixtral-8x7B-Instruct-v0.1-GPTQ",
    dtype="half",
    tensor_parallel_size=2,
    temperature=0,
    max_new_tokens=250,
    vllm_kwargs={
        "swap_space": 1,
        "gpu_memory_utilization": 0.70,
        "max_model_len": 8000,
    },
)
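
Update with what I am leaning toward, hedged: rather than editing [INST] into every evaluation template, the LlamaIndex Vllm integration (like most of its LLM integrations) appears to accept a completion_to_prompt hook, so the finished prompt can be wrapped in Mixtral's instruction tags once, at the LLM level, and every evaluator picks it up automatically. A minimal sketch under that assumption:

from llama_index.llms.vllm import Vllm

def completion_to_prompt(completion: str) -> str:
    # Wrap each fully formatted prompt in Mixtral's [INST] tags
    return f"<s>[INST] {completion} [/INST]"

local_llm = Vllm(
    model="models/Mixtral-8x7B-Instruct-v0.1-GPTQ",
    dtype="half",
    tensor_parallel_size=2,
    temperature=0,
    max_new_tokens=250,
    completion_to_prompt=completion_to_prompt,
    vllm_kwargs={"swap_space": 1, "gpu_memory_utilization": 0.70, "max_model_len": 8000},
)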
3 comments