Working with huggingface models in property graph index example

Hello there!

I'm working off the following example code from the docs for creating and displaying a Property Graph Index: https://docs.llamaindex.ai/en/stable/examples/property_graph/property_graph_basic/

It works fine with the OpenAI LLM/embeddings, but I can't get it to work with Hugging Face and other models.

It will successfully save the HTML containing the graph, but the graph is mostly empty.

Here's a snippet of the kind of thing I've been trying in order to replace the LLM and embeddings:
Plain Text
%pip install -q llama-index-llms-huggingface llama-index-embeddings-huggingface
%pip install -Uq bitsandbytes

import torch
from llama_index.core import set_global_tokenizer
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
from transformers import AutoTokenizer

LLAMA2_7B = "meta-llama/Llama-2-7b-hf"
selected_model = LLAMA2_7B

# match the global tokenizer to the selected model
set_global_tokenizer(AutoTokenizer.from_pretrained(selected_model).encode)

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=2048,
    generate_kwargs={"temperature": 0.0, "do_sample": False},
    # query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=selected_model,
    model_name=selected_model,
    device_map="auto",
    # change these settings below depending on your GPU
    model_kwargs={"torch_dtype": torch.float16, "load_in_8bit": True},
)

BGE_SMALL = "BAAI/bge-small-en-v1.5"
selected_embed_model = BGE_SMALL
embed_model = HuggingFaceEmbedding(model_name=selected_embed_model)

from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = embed_model

from llama_index.core import PropertyGraphIndex

index = PropertyGraphIndex.from_documents(
    documents,  # `documents` loaded as in the linked example
    llm=llm,
    embed_model=embed_model,
    show_progress=True,
)

# pyvis is needed for save_networkx_graph
%pip install -q pyvis

index.property_graph_store.save_networkx_graph(name="./kg.html")

Is there a way to do this?
3 comments
Open source LLMs are going to have a VERY tough time doing this. It relies on the LLM following instructions and outputting something that can be parsed.
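One thing that can help weaker models is constraining the extraction with a schema, so the LLM fills in a template rather than producing free-form output. A minimal sketch using SchemaLLMPathExtractor (the entity/relation types below are just placeholders, swap in ones that fit your data):

Plain Text
from typing import Literal
from llama_index.core import PropertyGraphIndex
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

# allowed entity and relation types (placeholders for illustration)
entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["PART_OF", "HAS", "WORKED_AT"]

# which relations are valid for which entity types
schema = {
    "PERSON": ["PART_OF", "WORKED_AT"],
    "PLACE": ["PART_OF"],
    "ORGANIZATION": ["HAS"],
}

kg_extractor = SchemaLLMPathExtractor(
    llm=llm,
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=schema,
    strict=True,  # drop any triplet that doesn't match the schema
)

index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    embed_model=embed_model,
    show_progress=True,
)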

llama2 is VERY bad at this actually lol
Try using the latest llama model (I would also use Ollama if you don't have a decent GPU, but that's just me, it's easier to set up).
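Swapping the LLM out for Ollama is only a couple of lines. Something like this, assuming you have an Ollama server running locally and have already pulled a recent model (llama3.1 here is just an example):

Plain Text
%pip install -q llama-index-llms-ollama

from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

# assumes `ollama pull llama3.1` has been run and the server is up
Settings.llm = Ollama(model="llama3.1", request_timeout=360.0)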
Thanks! I did suspect this might be the case. I tried with Llama 3 but got the same result. I wasn't sure if it was an embeddings/LLM mismatch or something. This is on Google Colab, so I'm using one of their free-tier T4 GPUs.
Attached are the non-working results I get from open-source models. The graph just contains some of the metadata the fully working versions have, but not much else.
Also tried Groq, but hit rate limits. (Free tier again πŸ™ƒ)
Attachment: 2024-10-29_16-56.png