Custom LLM

Hello. I apologize if this is a trivial question, but I'm having difficulty with it. Is it possible to create a ServiceContext that can access a remote LLM (I am using LlamaCPP with the built-in CustomLLM implementation)?

Currently, I am working in a standalone environment where the index and model are in the same process. However, I now want to run the LLM server on a separate PC. Are there any pre-existing adapters available, or should I develop this adapter myself?
Okay, it seems this shouldn't be difficult. I was hoping to find a ready-made solution.
Yes, it's pretty much ready-made. You just have to add the interaction part of the code below.

Python
    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # `pipeline` and `num_output` are assumed to be defined elsewhere
        # (e.g. a HuggingFace pipeline and the output token budget).
        prompt_length = len(prompt)
        response = pipeline(prompt, max_new_tokens=num_output)[0]["generated_text"]

        # only return newly generated tokens
        text = response[prompt_length:]
        return CompletionResponse(text=text)

And voila! πŸ’ͺ
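For completeness, here's a rough sketch of the class that method sits in and how it plugs into a ServiceContext. This assumes a pre-0.10 llama_index (where ServiceContext still exists); the import paths, class name, and field values are illustrative and may differ in your version:

Python
    from typing import Any

    from llama_index import ServiceContext
    from llama_index.llms import (
        CompletionResponse,
        CompletionResponseGen,
        CustomLLM,
        LLMMetadata,
    )
    from llama_index.llms.base import llm_completion_callback


    class MyLLM(CustomLLM):
        context_window: int = 3900
        num_output: int = 256
        model_name: str = "custom"

        @property
        def metadata(self) -> LLMMetadata:
            # Advertise the model's limits so prompt helpers can size inputs.
            return LLMMetadata(
                context_window=self.context_window,
                num_output=self.num_output,
                model_name=self.model_name,
            )

        @llm_completion_callback()
        def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
            # Put the generation code from the snippet above here.
            return CompletionResponse(text="...")

        @llm_completion_callback()
        def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
            # The interface requires a streaming method; a one-shot stub is enough here.
            yield self.complete(prompt, **kwargs)


    # Depending on your setup, an embed_model may also need to be configured here.
    service_context = ServiceContext.from_defaults(llm=MyLLM())
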
Yes, thank you. I have wrapped the LLM in a Flask REST server and I am sending requests to it using this 'complete' method.
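In case it helps anyone later, here's a minimal sketch of that kind of setup, with the server and client as two separate files. The `/complete` route, host, port, and JSON shape are made up for illustration (not any official adapter), and the import paths again depend on your llama_index version:

Python
    # server.py -- runs on the PC that hosts the model.
    from flask import Flask, jsonify, request
    from llama_index.llms import LlamaCPP

    llm = LlamaCPP(model_path="/path/to/model.gguf")  # placeholder path

    app = Flask(__name__)

    @app.route("/complete", methods=["POST"])
    def complete_endpoint():
        prompt = request.get_json()["prompt"]
        return jsonify({"text": llm.complete(prompt).text})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)


    # client side -- drop this `complete` into the CustomLLM subclass sketched above.
    import requests

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        resp = requests.post(
            "http://<llm-server-ip>:5000/complete",  # address of the remote PC
            json={"prompt": prompt},
            timeout=300,
        )
        resp.raise_for_status()
        return CompletionResponse(text=resp.json()["text"])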