@llm_completion_callback()
def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
    prompt_length = len(prompt)

    # run the HuggingFace pipeline; "generated_text" includes the prompt
    response = pipeline(prompt, max_new_tokens=num_output)[0]["generated_text"]

    # only return newly generated tokens
    text = response[prompt_length:]
    return CompletionResponse(text=text)
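
# A minimal sketch of the context this method assumes: `pipeline` and
# `num_output` are expected to exist at module scope before `complete` is
# called, and the imports below supply the decorator and response type
# (llama-index >= 0.10 import paths). The model name is a hypothetical
# placeholder; substitute your own.
from typing import Any

from transformers import pipeline as make_pipeline

from llama_index.core.llms import CompletionResponse
from llama_index.core.llms.callbacks import llm_completion_callback

num_output = 256  # cap on newly generated tokens per call
pipeline = make_pipeline("text-generation", model="gpt2")  # hypothetical model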