It will work, but maybe not well; it really depends on the quality and abilities of the LLM you are using. Personally, nothing open source I've tried has worked well (FLAN, OPT, OPT-IML, GPT-J), but definitely don't let that stop you!
There's an example here for using huggingface:
https://github.com/jerryjliu/gpt_index/issues/544
Awesome, thank you! Will do. Do you know what type of issues you were facing when using the open source ones? Was it the reliability aspect or something else?
Yea definitely reliability. As in, it was hard to get them to "follow instructions". I suspect the internal prompts need to be customized with llama index, but I didn't try
OPT and GPT-J also like to ramble on and on; maybe it was a temperature or top-p/top-k setting though
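If you want to experiment, those knobs can be passed straight through the HF pipeline call, something like this (the model id and values here are just illustrative, not recommendations):

from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-j-6B", device=0)
result = generator(
    "Summarize the following: ...",
    max_new_tokens=128,  # cap the output so it can't ramble forever
    do_sample=True,
    temperature=0.7,     # lower temperature = more focused output
    top_p=0.9,
    top_k=50,
)[0]["generated_text"]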
ok I see, awesome thank you!
@Logan M hey sorry to bug you again, but when I run the following code
from langchain.llms.base import LLM
from transformers import pipeline
from llama_index import GPTSimpleVectorIndex, LLMPredictor, SimpleDirectoryReader

# Wrap a Hugging Face pipeline in a LangChain-compatible LLM
class FlanLLM(LLM):
    model_name = "google/flan-t5-small"
    pipeline = pipeline("text2text-generation", model=model_name, device="cuda")

    def _call(self, prompt, stop=None):
        return self.pipeline(prompt, max_length=9999)[0]["generated_text"]

    @property
    def _identifying_params(self):
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self):
        return "custom"

documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleVectorIndex(documents)
llm_predictor = LLMPredictor(llm=FlanLLM())
response = index.query("What is the text about?", llm_predictor=llm_predictor)
print(response)
And I get this error
Did not find openai_api_key, please add an environment variable OPENAI_API_KEY
which contains it, or pass openai_api_key
as a named parameter. (type=value_error)
Why is it asking for an OpenAI API key when I'm passing it a custom Flan model?
Hmm, try passing llm_predictor=llm_predictor into the index constructor as well
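i.e. something like this (assuming the gpt_index/llama_index version you're on still accepts llm_predictor in the constructor):

index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)
response = index.query("What is the text about?")
print(response)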
You also need an embedding model for a vector index, and it uses OpenAI by default. You can try a list index to avoid embeddings
But llama index also supports custom embedding models too
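A rough sketch of both options; HuggingFaceEmbeddings() below just loads whatever default sentence-transformers model langchain ships with:

# Option 1: a list index, which doesn't embed anything at build time
from llama_index import GPTListIndex
index = GPTListIndex(documents)

# Option 2: a vector index with a local embedding model instead of OpenAI
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import GPTSimpleVectorIndex, LangchainEmbedding

embed_model = LangchainEmbedding(HuggingFaceEmbeddings())
index = GPTSimpleVectorIndex(documents, embed_model=embed_model)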
Ah I see !!!! ok thank you!
Weird though, as I was passing the example HF one and I was getting the same error. I'll try that link! Thank you
Got it working but you are right, the results are absolutely horrible lol... Thanks for the help though!!!!!!
Hahaha yeaaa, glad it works though!
I hope open source models catch up soon. Hopefully some time this year!
Keep an eye on Open Assistant
Hey @Logan M, sorry for chiming in. From your experience, do you think open source sentence-transformer models are good enough for the embedding part? I know that for synthesizing the final answer, GPT-3/GPT-3.5 are the best ones available. But if I want to save costs by using open source models for vectorizing and similarity search, it should be fine, right? Some of the sentence-transformer models are fine-tuned exclusively for such tasks.
I've only tested on smaller datasets, but it worked pretty well from what I saw. Just keep in mind that sentence-transformers have a shorter max input size (OpenAI is over 8000 tokens for embeddings lol), so you may need to adjust your chunk sizes as needed
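If you hit that limit, you can shrink the chunks when building the index, e.g. like this (512 is just an example value, tune it to your embedding model):

index = GPTSimpleVectorIndex(documents, chunk_size_limit=512)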
A good way to debug is to set response_mode="no_text". Then when you do response = index.query(...), you can check response.source_nodes to see if the closest matching nodes make sense
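Roughly like this (the exact attributes on the source nodes may vary a bit between versions):

response = index.query("What is the text about?", response_mode="no_text")
for node in response.source_nodes:
    # print a snippet of each retrieved chunk to eyeball relevance
    print(node.source_text[:200])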
@herruser check out this thread for using any LLM (your experience may vary)
Has anyone attempted using LLaMA for this? Will it deliver good responses?