Find answers from the community

Updated 4 months ago

hey how can I use my local url, which is a llamacpp python server hosted locally on port 8000

At a glance

The community member is asking how to use their local LlamaCPP Python server, hosted on port 8000, with LlamaIndex for LLM inference. The comments suggest following the LlamaIndex documentation for using a local model and give an example based on the Ollama LLM wrapper. However, since the community member is running a LlamaCPP server, the comments suggest that they may need to create a custom LLM class to interact with their server, or use a specific LLM wrapper if one exists in LlamaIndex. There is no explicitly marked answer in the comments.

hey, how can I use my local url, which is a llamacpp python server hosted locally on port 8000, in llamaindex for llm inference?
5 comments
Please stand by @Tech explorer as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To use a local LLM with LlamaIndex, you can follow the example provided in the LlamaIndex documentation for using a local model. In your case, if you have a local server running at port 8000, you would replace the example's Ollama model with your own local model.

Here's an example of how you might do this:

Plain Text
from llama_index.llms.ollama import Ollama
from llama_index.core import Settings

# Point LlamaIndex at the locally running server; swap in your own model name and URL
Settings.llm = Ollama(model="llama2", base_url="http://localhost:8000", request_timeout=60.0)


In this example, replace "llama2" with the name of your local model and the base_url value with the URL of your local server. The request_timeout parameter is optional and can be adjusted based on your needs.

Please note that this is a general guide and the exact code may vary depending on the specifics of your local model and server. If your local model has a specific LLM wrapper in LlamaIndex, you should use that instead of the Ollama example.
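
As a quick sanity check (not from the knowledge base extracts), once Settings.llm is configured you can send a one-off prompt straight through it; complete() is the generic LlamaIndex LLM call, and the prompt text below is only an illustration:

Plain Text
# Assumes Settings.llm was set as shown above; sends a single test prompt to the local model
response = Settings.llm.complete("Reply with a one-sentence greeting.")
print(response)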
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai I am running a llamacpp python server, not Ollama. How do I use my llamacpp python server's OpenAI-compatible API?
Let me check the knowledge base for you. One moment! :books:
----------------------
Based on the information provided in the extracts, there isn't a specific example of using a local LlamaCPP server with LlamaIndex. However, you can generally use a local language model (LLM) with LlamaIndex as shown in the LlamaIndex documentation.

Here's an example of how you might do this:

Plain Text
from llama_index.core.llms import CustomLLM
from llama_index.core import Settings

# Placeholder: CustomLLM is an abstract base class, so a concrete LLM class (or your own
# subclass of CustomLLM) that knows how to call http://localhost:8000 is needed here
Settings.llm = CustomLLM(model="http://localhost:8000", request_timeout=60.0)


In this example, replace "CustomLLM" with the appropriate class for your LlamaCPP server if one exists. If not, you may need to create a custom LLM class that can interact with your LlamaCPP server (see the sketch below). The request_timeout parameter is optional and can be adjusted based on your needs.
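
To make the "create a custom LLM class" route concrete, here is a minimal sketch (not taken from the knowledge base extracts). It assumes the llama-cpp-python server exposes an OpenAI-style /v1/completions endpoint on port 8000; the class name LlamaCppServerLLM, the field defaults, and the requests-based call are all illustrative:

Plain Text
from typing import Any

import requests

from llama_index.core import Settings
from llama_index.core.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback


class LlamaCppServerLLM(CustomLLM):
    """Sketch of a custom LLM that forwards prompts to a local llama-cpp-python server."""

    base_url: str = "http://localhost:8000"  # the locally hosted server from the question
    model_name: str = "local-llamacpp"       # cosmetic label only
    context_window: int = 3900               # set to your model's real context size
    num_output: int = 256

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # Assumes the server exposes an OpenAI-style /v1/completions endpoint
        resp = requests.post(
            f"{self.base_url}/v1/completions",
            json={"prompt": prompt, "max_tokens": self.num_output},
            timeout=60,
        )
        resp.raise_for_status()
        text = resp.json()["choices"][0]["text"]
        return CompletionResponse(text=text)

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Non-streaming fallback: fetch the full completion and yield it once
        full = self.complete(prompt, **kwargs)
        yield CompletionResponse(text=full.text, delta=full.text)


Settings.llm = LlamaCppServerLLM()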

Please note that this is a general guide and the exact code may vary depending on the specifics of your LlamaCPP server and how it's set up to interact with LlamaIndex. If your LlamaCPP server has a specific LLM wrapper in LlamaIndex, you should use that instead of the CustomLLM example.
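
Since the server in question is OpenAI-compatible, one wrapper worth checking is LlamaIndex's OpenAILike class (shipped in the llama-index-llms-openai-like package). The following is a sketch, not something confirmed by the extracts above; the model name and api_key value are placeholders your server may not require:

Plain Text
from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike

# Point LlamaIndex's OpenAI-compatible client at the local llama-cpp-python server
Settings.llm = OpenAILike(
    model="local-model",                  # placeholder; use whatever model name your server expects
    api_base="http://localhost:8000/v1",
    api_key="not-needed",                 # llama-cpp-python servers usually ignore the key
    is_chat_model=True,                   # set to False if you only serve the completions endpoint
)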
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------