
how to use pretrained llm from hugging face

Please stand by @balanp as I search the knowledge base. I normally take about 20 seconds or so. :wave:
----------------------
To use a pre-trained LLM (large language model) from Hugging Face with LlamaIndex, you can follow these steps:

  1. Install the necessary packages. For this, you will need the transformers package for HuggingFaceLLM and huggingface_hub[inference] for HuggingFaceInferenceAPI. You can install these packages using pip:
Plain Text
!pip install "transformers[torch]" "huggingface_hub[inference]"


  2. Install LlamaIndex:
Plain Text
!pip install llama-index


  3. Import the necessary modules and set up your Hugging Face token:
Plain Text
import os
from typing import Optional

from llama_index.llms import HuggingFaceInferenceAPI, HuggingFaceLLM

HF_TOKEN: Optional[str] = os.getenv("HUGGING_FACE_TOKEN")


  4. Use the HuggingFaceLLM class to run the model locally:
Plain Text
locally_run = HuggingFaceLLM(model_name="HuggingFaceH4/zephyr-7b-alpha")


  5. Alternatively, you can use the HuggingFaceInferenceAPI class to run the model remotely on Hugging Face's servers:
Plain Text
remotely_run = HuggingFaceInferenceAPI(
    model_name="HuggingFaceH4/zephyr-7b-alpha", token=HF_TOKEN
)


  6. If you want to use the model anonymously, you can skip providing a token:
Plain Text
remotely_run_anon = HuggingFaceInferenceAPI(
    model_name="HuggingFaceH4/zephyr-7b-alpha"
)


  7. If you don't provide a model_name to the HuggingFaceInferenceAPI, Hugging Face's recommended model gets used:
Plain Text
remotely_run_recommended = HuggingFaceInferenceAPI(token=HF_TOKEN)


  8. You can then use the complete method to generate text:
Plain Text
completion_response = remotely_run_recommended.complete("To infinity, and")
print(completion_response)


Remember to change the global tokenizer to match your LLM:

```python
from llama_index import set_global_tokenizer
from transformers import AutoTokenizer

set_global_tokenizer(
    AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha").encode
)
```
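
To actually query your data with one of these LLMs, you can pass it into a ServiceContext. The following is a rough sketch, assuming the pre-0.10 llama_index API used above, a ./data folder of documents (a placeholder path), and sentence-transformers installed so the "local" embedding model can load:

```python
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex

# Use one of the Hugging Face LLMs defined above; "local" selects a local embedding model
service_context = ServiceContext.from_defaults(
    llm=remotely_run,  # or locally_run
    embed_model="local",
)

# ./data is a placeholder folder containing your documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

response = index.as_query_engine().query("What does this document say?")
print(response)
```

The same service_context can be reused across indexes, so the Hugging Face LLM only needs to be configured once.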

For more details, you can refer to the source.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai how to connect LlamaIndex to the API of a predictive model?
Let me check the knowledge base for you. One moment! :books:
----------------------
To connect LlamaIndex to the API of a predictive model, you would typically use a large language model (LLM) integration. However, the specific steps vary depending on the API and the model you're using.

For instance, LlamaIndex supports integrations with various models and APIs such as OpenAI, Hugging Face, PaLM, and more. It also supports running local models. Here's an example of how you can use a local LLM such as Llama 2:

Plain Text
from llama_index import ServiceContext

service_context = ServiceContext.from_defaults(llm="local")


This will use llama2-chat-13B with LlamaCPP, and assumes you have llama-cpp-python installed. A full LlamaCPP usage guide is available in the LlamaIndex documentation.
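
If llama-cpp-python isn't installed yet, it can be added with pip (a minimal sketch; GPU-accelerated builds may need extra CMAKE build flags):

```python
# Install the LlamaCPP Python bindings used by llm="local"
!pip install llama-cpp-python
```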

For more specific integrations, LlamaIndex provides guides for integrating with Monster API, Ollama - Llama 2 7B, Predibase, and Replicate - Llama 2 13B, among others.

Please refer to the specific guide that matches your predictive model's API for detailed instructions. If your model's API is not listed, you may need to create a custom LLM integration. For more details on this, you can refer to the custom LLM How-To guide.
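
As an illustration, a custom integration can subclass CustomLLM and forward each completion call to your model's HTTP API. This is a rough sketch, assuming the pre-0.10 llama_index import paths used above; the https://example.com/predict endpoint, its JSON response shape, and MY_API_KEY are placeholders for your own service:

```python
from typing import Any

import requests

from llama_index.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.llms.base import llm_completion_callback

API_URL = "https://example.com/predict"  # placeholder endpoint
API_KEY = "MY_API_KEY"  # placeholder credential


class PredictiveModelLLM(CustomLLM):
    context_window: int = 2048
    num_output: int = 256
    model_name: str = "my-predictive-model"  # placeholder name

    @property
    def metadata(self) -> LLMMetadata:
        # Advertise the model's limits to LlamaIndex
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # Forward the prompt to the remote API and wrap its reply
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"prompt": prompt},
            timeout=60,
        )
        resp.raise_for_status()
        return CompletionResponse(text=resp.json()["text"])

    @llm_completion_callback()
    def stream_complete(
        self, prompt: str, **kwargs: Any
    ) -> CompletionResponseGen:
        # Simplest possible "stream": yield the full completion once
        yield self.complete(prompt, **kwargs)
```

A PredictiveModelLLM instance can then be passed as llm= to ServiceContext.from_defaults, just like the Hugging Face LLMs above.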
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
@kapa.ai How to connect LlamaIndex to the API of a predictive image model?
Let me check the knowledge base for you. One moment! :books:
----------------------
To connect LlamaIndex to the API of a predictive image model, you would typically use a multi-modal model. LlamaIndex supports the use of LLaVA (Large Language and Vision Assistant) for tasks such as Retrieval Augmented Image Captioning and Multi-Modal Retrieval-Augmented Generation (RAG) using Llava-13b.

Here's an example of how you can use LLaVA with LlamaIndex:

  1. Install the necessary packages:
Plain Text
%pip install unstructured replicate
%pip install llama_index ftfy regex tqdm
%pip install git+https://github.com/openai/CLIP.git
%pip install torch torchvision
%pip install matplotlib scikit-image
%pip install -U qdrant_client


  2. Set up your Replicate API token:
Plain Text
import os

REPLICATE_API_TOKEN = "..."  # Your Replicate API token here
os.environ["REPLICATE_API_TOKEN"] = REPLICATE_API_TOKEN


Please note that this is a general guide and the specific steps can vary depending on the API and the model you're using. For more details, you can refer to the source.
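
With the token set, a rough sketch of sending an image and a prompt to LLaVA through LlamaIndex's Replicate multi-modal integration might look like the following; import paths and the model-registry constant assume a pre-0.10 llama_index and may differ between versions, and ./image.jpg is a placeholder path:

```python
from llama_index.multi_modal_llms.replicate_multi_modal import (
    ReplicateMultiModal,
    REPLICATE_MULTI_MODAL_LLM_MODELS,
)
from llama_index.schema import ImageDocument

# Pick the LLaVA-13B model hosted on Replicate
llava = ReplicateMultiModal(
    model=REPLICATE_MULTI_MODAL_LLM_MODELS["llava-13b"],
    max_new_tokens=200,
    temperature=0.1,
)

# ./image.jpg is a placeholder path to a local image
response = llava.complete(
    prompt="What is shown in this image?",
    image_documents=[ImageDocument(image_path="./image.jpg")],
)
print(response)
```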
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------