A community member asks how to connect llama_index to a locally hosted Llama.cpp server API. Another community member suggests either using the openai_like code from the llama_index repository or implementing a custom LLM that sends the requests manually. The asker is trying to reach a locally hosted Llama.cpp API running on a different machine, which is serving a Llama 2 based model. The community members discuss whether replacing llm = LlamaCPP with the openai_like code would work, but no definitive answer is provided.
I'm trying to connect to my llama.cpp API that I have running locally on a different machine. I compiled the llama.cpp source with CLBlast to run across 6 GPUs. My llama.cpp instance is serving a Llama 2 based model.
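A minimal sketch of the openai_like route, assuming a llama_index version that ships the OpenAILike integration and a llama.cpp server exposing its OpenAI-compatible /v1 endpoint. The host, port, and model name below are placeholders for your setup, not values from the thread:

```python
from llama_index.llms.openai_like import OpenAILike

# Point the client at the llama.cpp server's OpenAI-compatible endpoint.
# 192.168.1.50:8080 is a placeholder; substitute your server's host and port.
llm = OpenAILike(
    model="llama-2",                         # informational; llama.cpp serves whatever model it loaded
    api_base="http://192.168.1.50:8080/v1",  # llama.cpp's OpenAI-compatible base URL
    api_key="not-needed",                    # llama.cpp ignores the key, but the client requires one
    is_chat_model=True,                      # route requests through the chat completions endpoint
    context_window=4096,                     # Llama 2's default context size
)

print(llm.complete("Say hello."))
```

The custom-LLM route mentioned in the thread could look something like the sketch below, which posts to llama.cpp's native /completion endpoint. Again, the server URL is a placeholder, and the import paths assume a llama_index 0.10+ layout:

```python
import requests

from llama_index.core.llms import (
    CompletionResponse,
    CompletionResponseGen,
    CustomLLM,
    LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback


class LlamaCppServer(CustomLLM):
    """Sends prompts to a remote llama.cpp server's native /completion endpoint."""

    server_url: str = "http://192.168.1.50:8080"  # placeholder address

    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(context_window=4096, num_output=256, model_name="llama-2")

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs) -> CompletionResponse:
        r = requests.post(
            f"{self.server_url}/completion",
            json={"prompt": prompt, "n_predict": 256},
            timeout=300,
        )
        r.raise_for_status()
        return CompletionResponse(text=r.json()["content"])

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs) -> CompletionResponseGen:
        # Simple non-streaming fallback: yield the whole completion at once.
        yield self.complete(prompt, **kwargs)
```

Either object can then be passed wherever llm = LlamaCPP(...) was used before, e.g. as the llm argument to a service context or query engine.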