Updated 2 weeks ago

Implementing Function Calling Program with Structured Extraction

I'm trying to implement FunctionCallingProgram for structured extraction using vLLM, Qwen-2.5, and Pydantic. However, I quickly get this error:
ValueError: Model name Qwen/Qwen2.5-3B does not support function calling API.
8 comments
Qwen2.5 definitely supports function calling—with vLLM, no less. https://qwen.readthedocs.io/en/latest/framework/function_call.html
Plain Text
llm = VllmServer(
    model='Qwen/Qwen2.5-3B',
    api_url='http://localhost:8000/v1',
    tensor_parallel_size=4,
    max_new_tokens=256,
    temperature=0.0,
    dtype='bfloat16',
    vllm_kwargs={
        'max_model_len': 32_750,
    },
)

program = FunctionCallingProgram.from_defaults(
    output_cls=output_class,
    prompt_template_str=prompt,
    llm=llm,
    verbose=True,
)

program(markdown)
Ohhh, maybe I need vLLM command-line parameters like --enable-auto-tool-choice --tool-call-parser hermes?
Nope. No dice.
The VllmServer class does not implement any function-calling handling.
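That explains the error message itself. A minimal sketch of the kind of capability check FunctionCallingProgram performs before building any tool calls (the class names below are stand-ins, not the actual llama-index source):

```python
# Sketch of why the ValueError is raised: the program checks the LLM's
# metadata flag before attempting any tool calls. Stub classes only.
class StubMetadata:
    is_function_calling_model = False  # VllmServer reports no tool support
    model_name = "Qwen/Qwen2.5-3B"

class StubLLM:
    metadata = StubMetadata()

def require_function_calling(llm):
    meta = llm.metadata
    if not meta.is_function_calling_model:
        raise ValueError(
            f"Model name {meta.model_name} does not support function calling API."
        )

try:
    require_function_calling(StubLLM())
except ValueError as err:
    print(err)  # Model name Qwen/Qwen2.5-3B does not support function calling API.
```

So the check is on the client wrapper, not the model: no server flag will help until the wrapper advertises function-calling support.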
Launch it in OpenAI-compatible server mode and use OpenAILike instead.
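Something along these lines to start the server (a sketch; exact flags depend on your vLLM version, and the hermes parser is the one Qwen's docs recommend for tool calls):

```shell
# Serves the model behind an OpenAI-compatible API on port 8000.
# The tool-call flags make /v1/chat/completions return structured
# tool calls, which OpenAILike can consume. max_model_len and dtype
# are server-side settings, so they belong here, not in the client.
vllm serve Qwen/Qwen2.5-3B \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --max-model-len 32750 \
    --dtype bfloat16
```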
Plain Text
pip install llama-index-llms-openai-like


Plain Text
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
  model="Qwen/Qwen2.5-3B",
  api_base="http://localhost:8000/v1",
  api_key="fake",  # vLLM doesn't check the key, but the client requires one
  max_tokens=256,
  temperature=0.0,
  context_window=32750,  # enforced client-side; set max_model_len on the server
  is_chat_model=True,
  is_function_calling_model=True,
)
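Putting the pieces together, the original program should then work unchanged against the OpenAILike client. A sketch, assuming a vLLM server is already running on localhost:8000 and using a hypothetical Invoice schema as the output class:

```python
from pydantic import BaseModel
from llama_index.core.program import FunctionCallingProgram
from llama_index.llms.openai_like import OpenAILike

class Invoice(BaseModel):
    """Hypothetical output schema; substitute your own output_cls."""
    vendor: str
    total: float

llm = OpenAILike(
    model="Qwen/Qwen2.5-3B",
    api_base="http://localhost:8000/v1",
    api_key="fake",
    is_chat_model=True,
    is_function_calling_model=True,
)

program = FunctionCallingProgram.from_defaults(
    output_cls=Invoice,
    prompt_template_str="Extract the invoice fields from:\n{markdown}",
    llm=llm,
    verbose=True,
)

# Template variables are passed as keyword arguments; the result is
# an Invoice instance populated via the model's tool call.
result = program(markdown="ACME Corp invoice, total due: $1,234.56")
print(result)
```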
Ah, thank you!! 😃 👍