It will work, but maybe not well; it really depends on the quality and abilities of the LLM you are using. Personally, nothing open source I've tried has worked well (FLAN, OPT, OPT-IML, GPT-J), but definitely don't let that stop you!
There's an example here for using huggingface:
https://github.com/jerryjliu/gpt_index/issues/544
Awesome, thank you! Will do. Do you know what type of issues you were facing when using the open source ones? Was it the reliability aspect or something else?
Yea definitely reliability. As in, it was hard to get them to "follow instructions". I suspect the internal prompts need to be customized with llama index, but I didn't try
OPT and GPT-J also like to ramble on and on; maybe it was a temperature or top-p/top-k setting though
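If you want to experiment, those knobs can be passed straight through the HF pipeline call, something like this (the model id and values here are just illustrative, not recommendations):

from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-j-6B", device=0)
result = generator(
    "Summarize the following: ...",
    max_new_tokens=128,  # cap the output so it can't ramble forever
    do_sample=True,
    temperature=0.7,     # lower temperature = more focused output
    top_p=0.9,
    top_k=50,
)[0]["generated_text"]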
ok I see, awesome thank you!
@Logan M hey sorry to bug you again, but when I run the following code
from langchain.llms.base import LLM
from transformers import pipeline
from llama_index import GPTSimpleVectorIndex, LLMPredictor, SimpleDirectoryReader

# Wrap a Hugging Face pipeline in a LangChain-compatible LLM
class FlanLLM(LLM):
    model_name = "google/flan-t5-small"
    pipeline = pipeline("text2text-generation", model=model_name, device="cuda")

    def _call(self, prompt, stop=None):
        return self.pipeline(prompt, max_length=9999)[0]["generated_text"]

    @property
    def _identifying_params(self):
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self):
        return "custom"

documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleVectorIndex(documents)
llm_predictor = LLMPredictor(llm=FlanLLM())
response = index.query("What is the text about?", llm_predictor=llm_predictor)
print(response)
And I get this error
Did not find openai_api_key, please add an environment variable OPENAI_API_KEY
which contains it, or pass openai_api_key
as a named parameter. (type=value_error)
Why is it asking for an OpenAI API key when I'm passing it a custom Flan model?
Hmm, try passing llm_predictor=llm_predictor into the index constructor as well
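i.e. something like this (assuming the gpt_index/llama_index version you're on still accepts llm_predictor in the constructor):

index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)
response = index.query("What is the text about?")
print(response)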
You also need an embedding model for a vector index, and it uses OpenAI by default. You can try a list index to avoid embeddings
But llama index also supports custom embedding models too
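A rough sketch of both options; HuggingFaceEmbeddings() below just loads whatever default sentence-transformers model langchain ships with:

# Option 1: a list index, which doesn't embed anything at build time
from llama_index import GPTListIndex
index = GPTListIndex(documents)

# Option 2: a vector index with a local embedding model instead of OpenAI
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import GPTSimpleVectorIndex, LangchainEmbedding

embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
index = GPTSimpleVectorIndex(documents, embed_model=embed_model)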
Ah I see !!!! ok thank you!
Weird though, as I was passing the example HF one and I was getting the same error. I'll try that link! Thank you
Got it working but you are right, the results are absolutely horrible lol... Thanks for the help though!!!!!!
Hahaha yeaaa, glad it works though!
I hope open source models catch up soon. Hopefully some time this year!
Keep an eye on Open Assistant
Hey @Logan M, sorry for chiming in. From your experience, do you think open source sentence-transformer models are good enough for the embedding part? I know that for synthesizing the final answer, GPT-3/GPT-3.5 are the best ones available. But if I want to save costs by using open source models for vectorizing and similarity search, it should be fine, right? Some of the sentence-transformer models are fine-tuned exclusively for such tasks.
I've only tested on smaller datasets, but it worked pretty well from what I saw. Just keep in mind that sentence-transformers have a shorter max input size (OpenAI is over 8000 tokens for embeddings lol), so you may need to adjust your chunk sizes as needed
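If you hit that limit, you can shrink the chunks when building the index, e.g. like this (512 is just an example value, tune it to your embedding model):

index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)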
A good way to debug is to set response_mode="no_text". Then when you do response = index.query(...), you can check response.source_nodes to see if the closest matching nodes make sense
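Roughly like this (the exact attributes on the source nodes may vary a bit between versions):

response = index.query("What is the text about?", response_mode="no_text")
for node in response.source_nodes:
    # print a snippet of each retrieved chunk to eyeball relevance
    print(node.source_text[:200])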
@herruser check out this thread for using any LLM (your experience may vary)
Has anyone attempted using LLaMA for this? Will it deliver good responses?