Hey I'm using llama index but have launched llama-cpp-python using the docker image

Hey... I'm using llama-index but have launched llama-cpp-python using the Docker image. Now I have the LLM API endpoints, but how do I use them in llama-index?
Hey @tonyalapatt, it uses llama-cpp by default with any index iirc.
The sample query works on the server where the model bin file is, but I want to use the API endpoint provided by the llama-cpp-python Docker container.
check out the first code example
maybe model_path is what you're looking for?
I've been staring at the documents for hours now.

llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    model_url="https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_0.bin",
    model_path=None,
    temperature=0.1,
)
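(For reference, a minimal sketch of how a local LlamaCPP instance is typically handed to an index through a service context — the data directory and query string below are placeholder assumptions:)

from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# Route both the LLM and embeddings away from the OpenAI defaults;
# embed_model="local" selects a local HuggingFace embedding model.
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

documents = SimpleDirectoryReader("./data").load_data()  # placeholder path
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

print(index.as_query_engine().query("What do the documents say?"))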
None of the LlamaCPP parameters take the API endpoint
It's only the actual bin file path
I'm not super familiar with the API endpoint for llama-cpp; let's see if @Logan M has some insight.
ahh, you are running the API server for llama-cpp @tonyalapatt?

We haven't gotten around to implementing that yet lol
Would love a PR for it though ❤️
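(In the meantime, one possible workaround, untested here: the llama-cpp-python server exposes OpenAI-compatible routes under /v1, so if your llama-index version's OpenAI wrapper accepts api_base, you may be able to point it at the container. The host, port, and key below are assumptions about the setup:)

from llama_index.llms import OpenAI

# Assumed address of the llama-cpp-python Docker container; adjust host/port.
# The local server ignores the API key, but the client wants a non-empty one.
llm = OpenAI(
    api_base="http://localhost:8000/v1",
    api_key="sk-no-key-required",
    temperature=0.1,
)

Whether this works depends on how strictly the wrapper validates model names, so treat it as an experiment rather than supported behavior.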
Aaaah. Thought as much. At least now I can stop googling for it... lol.
Let me see if I can do it. Kinda in the middle of an implementation.
Hey... another quick question. Is it also not possible to use the sub-question query engine with LlamaCPP? Even though llama2-13b-chat is initialised, I'm still getting an OpenAI key error.
Hmm, it shouldn't be throwing an error.

Although when you set up the sub-question engine, try passing in the service context as well, roughly like the sketch below:

https://github.com/jerryjliu/llama_index/blob/644c034a249fa359181f8ebe988b8c2b93401814/llama_index/query_engine/sub_question_query_engine.py#L91
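(A minimal sketch of that suggestion — the tool name, description, and index variable are placeholders; the point is that from_defaults accepts service_context, so the question generator is built from the local LlamaCPP LLM instead of falling back to OpenAI:)

from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.tools import QueryEngineTool, ToolMetadata

# Placeholder tool wrapping an existing index's query engine.
query_engine_tools = [
    QueryEngineTool(
        query_engine=index.as_query_engine(),
        metadata=ToolMetadata(name="docs", description="Local document index"),
    ),
]

# service_context is the one built with LlamaCPP earlier, which keeps
# sub-question generation on the local model.
sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    service_context=service_context,
)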
Got it. Thanks.