Updated 2 weeks ago

Implementing Function Calling Program with Structured Extraction

I'm trying to implement FunctionCallingProgram for structured extraction using vLLM, Qwen-2.5, and Pydantic. However, I quickly get this error:
ValueError: Model name Qwen/Qwen2.5-3B does not support function calling API.
8 comments
Qwen2.5 definitely supports function calling—with vLLM, no less. https://qwen.readthedocs.io/en/latest/framework/function_call.html
Plain Text
llm = VllmServer(
    model='Qwen/Qwen2.5-3B',
    api_url='http://localhost:8000/v1',
    tensor_parallel_size=4,
    max_new_tokens=256,
    temperature=0.0,
    dtype='bfloat16',
    vllm_kwargs={
        'max_model_len': 32_750,
    },
)

program = FunctionCallingProgram.from_defaults(
    output_cls=output_class,
    prompt_template_str=prompt,
    llm=llm,
    verbose=True,
)

program(markdown)
Ohhh, maybe I need vLLM command-line parameters like --enable-auto-tool-choice --tool-call-parser hermes?
Nope. No dice.
The VllmServer class does not implement any function-calling handling.
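That explains the error message itself. A minimal sketch of the kind of capability check FunctionCallingProgram performs before building any tool calls (the class names below are stand-ins, not the actual llama-index source):

```python
# Sketch of why the ValueError is raised: the program checks the LLM's
# metadata flag before attempting any tool calls. Stub classes only.
class StubMetadata:
    is_function_calling_model = False  # VllmServer reports no tool support
    model_name = "Qwen/Qwen2.5-3B"

class StubLLM:
    metadata = StubMetadata()

def require_function_calling(llm):
    meta = llm.metadata
    if not meta.is_function_calling_model:
        raise ValueError(
            f"Model name {meta.model_name} does not support function calling API."
        )

try:
    require_function_calling(StubLLM())
except ValueError as err:
    print(err)  # Model name Qwen/Qwen2.5-3B does not support function calling API.
```

So the check is on the client wrapper, not the model: no server flag will help until the wrapper advertises function-calling support.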
Launch it in OpenAI-compatible server mode and use OpenAILike instead.
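Something along these lines to start the server (a sketch; exact flags depend on your vLLM version, and the hermes parser is the one Qwen's docs recommend for tool calls):

```shell
# Serves the model behind an OpenAI-compatible API on port 8000.
# The tool-call flags make /v1/chat/completions return structured
# tool calls, which OpenAILike can consume. max_model_len and dtype
# are server-side settings, so they belong here, not in the client.
vllm serve Qwen/Qwen2.5-3B \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --max-model-len 32750 \
    --dtype bfloat16
```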
Plain Text
pip install llama-index-llms-openai-like


Plain Text
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
  model="Qwen/Qwen2.5-3B",
  api_base="http://localhost:8000/v1",
  api_key="fake",  # vLLM doesn't check the key, but the client requires one
  max_tokens=256,
  temperature=0.0,
  context_window=32750,  # enforced client-side; set max_model_len on the server
  is_chat_model=True,
  is_function_calling_model=True,
)
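Putting the pieces together, the original program should then work unchanged against the OpenAILike client. A sketch, assuming a vLLM server is already running on localhost:8000 and using a hypothetical Invoice schema as the output class:

```python
from pydantic import BaseModel
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.openai_like import OpenAILike

class Invoice(BaseModel):
    """Hypothetical output schema; substitute your own output_cls."""
    vendor: str
    total: float

llm = OpenAILike(
    model="Qwen/Qwen2.5-3B",
    api_base="http://localhost:8000/v1",
    api_key="fake",
    is_chat_model=True,
    is_function_calling_model=True,
)

program = FunctionCallingProgram.from_defaults(
    output_cls=Invoice,
    prompt_template_str="Extract the invoice fields from:\n{markdown}",
    llm=llm,
    verbose=True,
)

# Template variables are passed as keyword arguments; the result is
# an Invoice instance populated via the model's tool call.
result = program(markdown="ACME Corp invoice, total due: $1,234.56")
print(result)
```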
Ah, thank you!! 😃 👍