Custom LLMs

At a glance

The community members discuss how to load and use GPT4All with the GPTVectorStoreIndex from the llama_index library. They suggest wrapping the GPT4All model with an LLMPredictor and setting it in the ServiceContext. However, they note that custom LLMs may require adjusting prompt helper settings to account for different max_input_sizes, and that custom LLMs have not had great quality so far.

The community members provide sample code for loading GPT4All and setting up the ServiceContext. They also clarify that the GPTVectorStoreIndex itself is not OpenAI-only, but its embedding model defaults to OpenAI (hence the OpenAI RetryError), and suggest using local Hugging Face embeddings instead.

The community members confirm that this approach of using local Hugging Face embeddings works, and one member notes that they now need to focus on improving performance.

(For more context, I have figured out how to load GPT4All using from langchain.llms import GPT4All and chat with it directly, but it seems index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context) is specifically for OpenAI?)
I thiiiink you can load the model from langchain, and then wrap it with the llm predictor

LLMPredictor(llm=<langchain llm>)

Then, you can set the llm_predictor in the service context

However, be wary that other models may need adjusted prompt helper settings to account for different max_input_sizes
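
For example, a minimal sketch of tightening the prompt helper for a model with a smaller context window (the numbers below are illustrative placeholders, not tuned values, and llm_predictor is the wrapped LangChain LLM described above):

from llama_index import PromptHelper, ServiceContext

# Local GPT4All-style models typically have a much smaller context
# window than OpenAI's; these values are illustrative only.
prompt_helper = PromptHelper(
    max_input_size=2048,   # the model's context window
    num_output=256,        # tokens reserved for the response
    max_chunk_overlap=20,  # overlap when chunking retrieved text
)

service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,  # the wrapped LangChain LLM
    prompt_helper=prompt_helper,
)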

Plus, custom llms so far have not had great quality 😅

You can also implement any LLM using a custom LLM class
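
As a minimal sketch of that route, following the custom-LLM pattern in the llama_index docs of this era (which subclasses LangChain's LLM base class) — the class name and its stub behavior here are hypothetical:

from typing import List, Optional

from langchain.llms.base import LLM
from llama_index import LLMPredictor

class MyLocalLLM(LLM):
    """Hypothetical wrapper; replace _call with real model inference."""

    @property
    def _llm_type(self) -> str:
        return "custom"

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        # Run your model on `prompt` here; this stub just echoes a slice.
        return prompt[-256:]

llm_predictor = LLMPredictor(llm=MyLocalLLM())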

https://github.com/autratec?tab=repositories
from llama_index import (
    GPTVectorStoreIndex,
    SimpleDirectoryReader,
    GPTKeywordTableIndex,
    LLMPredictor,
    ServiceContext,
    StorageContext,
    load_index_from_storage,
)
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Stream tokens to stdout as the model generates
callbacks = [StreamingStdOutCallbackHandler()]

# Load the local GPT4All weights
llm = GPT4All(model="models/ggml-gpt4all-l13b-snoozy.bin", callbacks=callbacks, verbose=True)

# Wrap the LangChain LLM so llama_index can drive it
llm_predictor = LLMPredictor(llm=llm)
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
Sorry for that horrendous formatting
And then:
# Load everything under ./data (including subfolders) as documents
documents = SimpleDirectoryReader('data', recursive=True).load_data()

# Build the vector index with the GPT4All-backed service context
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

# Save the index to disk (defaults to ./storage)
index.storage_context.persist()
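
For completeness, a hedged sketch of reloading the persisted index and querying it later with the same service context (the query text is just an illustration):

# Later: reload the persisted index with the same service context
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context, service_context=service_context)

query_engine = index.as_query_engine()
response = query_engine.query("What is this document about?")  # illustrative question
print(response)
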
It's giving an OpenAI RetryError...
Right, by default you still need OpenAI for the embedding model

You can set the embedding model to run locally from Hugging Face

https://gpt-index.readthedocs.io/en/stable/how_to/customization/embeddings.html#custom-embeddings
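
Following the pattern in those docs, a minimal sketch of swapping in local Hugging Face embeddings (HuggingFaceEmbeddings uses a default sentence-transformers model; llm_predictor is the wrapped GPT4All model from above):

from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from llama_index import LangchainEmbedding, ServiceContext

# Wrap a local sentence-transformers model so embedding calls never hit OpenAI
embed_model = LangchainEmbedding(HuggingFaceEmbeddings())

# Reuse the GPT4All predictor; now both the LLM and embeddings run locally
service_context = ServiceContext.from_defaults(
    llm_predictor=llm_predictor,
    embed_model=embed_model,
)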
Ah OK. Really wish it weren't defaulting to OpenAI but I'll give this a shot. Thanks!
you can still use huggingface local embeddings btw
This worked—thank you!
Now to work on performance 🙂