How to use the llama.cpp server in LlamaIndex. I launched an OpenAPI-compatible llama.cpp server.

@kapa.ai How do I use the llama.cpp server in LlamaIndex? I launched an OpenAPI-compatible llama.cpp server. How can I use this URL?
Please stand by @Tech explorer as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To use llama.cpp models with LlamaIndex, you initialize the LlamaCPP class with the appropriate parameters. Here is an example of how to do it:

Plain Text
from llama_index.llms.llama_cpp import LlamaCPP

model_url = "<https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin">

llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    # llama2 has a context window of 4096 tokens, but we set it lower to allow for some wiggle room
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": 1},
    verbose=True,
)


In this example, model_url is the URL of a GGML model file that will be downloaded automatically. Note that LlamaCPP runs the model in-process: model_url (or model_path) must point to a model file, not to a running server.
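Once the model has loaded, a quick way to check that it works is a plain completion call; the prompt below is only illustrative:

Plain Text
# assumes `llm` was constructed as in the block above
response = llm.complete("Hello! Can you write me a short poem about cats?")
print(str(response))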

For more details, you can refer to the LlamaIndex documentation and the GitHub example.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai But my question is how to use an OpenAPI-compatible server in LlamaIndex.
Let me check the knowledge base for you. One moment! :books:
----------------------
To use an OpenAPI-compatible server with LlamaIndex, you can use the OpenAPIToolSpec and RequestsToolSpec classes. Here is an example of how to do it:

Plain Text
from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.openapi.base import OpenAPIToolSpec
from llama_index.tools.requests.base import RequestsToolSpec
from llama_index.tools.tool_spec.load_and_search.base import LoadAndSearchToolSpec

# Load the OpenAPI spec for your server
open_api_spec = ...  # Load your OpenAPI spec here

open_spec = OpenAPIToolSpec(open_api_spec)

requests_spec = RequestsToolSpec(
    {
        "your-server-url": {
            "Authorization": "Bearer your-key",
            "Content-Type": "application/json",
        }
    }
)

# The OpenAPI spec is too large to fit in the context window, so wrap the tool to separate loading and searching
wrapped_tools = LoadAndSearchToolSpec.from_defaults(
    open_spec.to_tool_list()[0],
).to_tool_list()

agent = OpenAIAgent.from_tools(
    [*wrapped_tools, *requests_spec.to_tool_list()], verbose=True
)


In this example, replace "your-server-url" with the URL of your OpenAPI-compatible server and "Bearer your-key" with the appropriate authorization for your server.
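From there, you would query the agent as usual; the question below is only illustrative:

Plain Text
# assumes `agent` was built as in the block above
response = agent.chat("What endpoints does this API expose?")
print(str(response))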

For more details, you can refer to the LlamaIndex documentation and the GitHub example.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai I am using the OpenAILike library and passing a localhost IP for api_base, but it's giving a validation error for the data source.
Let me check the knowledge base for you. One moment! :books:
----------------------
The OpenAILike class in LlamaIndex is a thin wrapper around the OpenAI LLM class that makes it work with third-party services exposing an OpenAI-compatible API. If you're getting a validation error, it might be due to incorrect setup or usage.

Here's a basic example of how to use OpenAILike:

Plain Text
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="my model", api_base="http://localhost:port", api_key="fake")

response = llm.complete("Hello World!")
print(str(response))


In this example, replace "my model" with the name of your model, "http://localhost:port" with your localhost IP and port, and "fake" with your API key if one is required.

If you're still encountering issues, please ensure the following:

  1. Your local server is running and accessible at the provided api_base.
  2. The model you're trying to use is available and correctly specified.
  3. If your server requires an API key, make sure you're providing the correct one.
For more details, you can refer to the LlamaIndex documentation.
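Since the backend here is a llama.cpp server, one likely culprit is the api_base: the server's OpenAI-compatible routes usually live under a /v1 suffix, and the model name should match what the server reports. A minimal sketch, assuming the server listens on port 8080 (both the port and the model name below are placeholders):

Plain Text
from llama_index.llms.openai_like import OpenAILike

# The port, the /v1 suffix, and the model name are assumptions; adjust them to
# match how the llama.cpp server was launched. Use any non-empty api_key if the
# server does not require one.
llm = OpenAILike(
    model="llama-2-13b-chat",
    api_base="http://localhost:8080/v1",
    api_key="fake",
    is_chat_model=True,  # treat the endpoint as a chat-completions API
)
print(str(llm.complete("Hello World!")))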
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai Still getting the error "subclass of BaseComponent expected".
Let me check the knowledge base for you. One moment! :books:
----------------------
The error message "subclass of Base component expected" typically occurs when you're trying to use a class or object that doesn't inherit from the BaseComponent class in a context where a BaseComponent subclass is expected.

In LlamaIndex, many classes like QueryComponent, BaseLLMComponent, LLMChatComponent, CustomLLM, and others inherit from BaseComponent. These classes have specific methods and properties that LlamaIndex expects to be present when working with components.

If you're trying to use a custom class or object as a component, make sure it inherits from BaseComponent or one of its subclasses. Here's a basic example of how to create a custom component:

Plain Text
from llama_index.core.schema import BaseComponent

class MyCustomComponent(BaseComponent):
    # Your custom code here
    pass


If you're still encountering issues, please refer to the relevant parts of the LlamaIndex documentation for more information on how to correctly use and create components:

  • BaseComponent
  • QueryComponent
  • BaseLLMComponent
  • LLMChatComponent
  • CustomLLM
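Given the earlier OpenAILike setup, one common cause of this message (an assumption here, since the full traceback isn't shown) is that something other than an LLM instance is handed to LlamaIndex where it expects one, for example a string, a class, or a module instead of the constructed object. A minimal sketch of wiring the LLM in explicitly through Settings, with placeholder model name and URL:

Plain Text
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike

# Placeholders: adjust model, api_base, and api_key to your server.
Settings.llm = OpenAILike(
    model="my-model",
    api_base="http://localhost:8080/v1",
    api_key="fake",
)
# Downstream index and query-engine code then picks up Settings.llm automatically.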
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------