Custom LLM

I'm taking the LlamaIndex course on deeplearning.ai, and my question is whether it's possible to build a RAG with an open-source LLM like Dolphin Mistral. All the examples I can find so far use OpenAI, and there's no information on building RAGs with LlamaIndex and other models. I'd appreciate any link or suggestion that helps me clarify this. Thanks!
9 comments
Yes, if it's not implemented, you can implement it yourself and use it with LlamaIndex.

We have a base LLM module that facilitates a lot of this:

https://docs.llamaindex.ai/en/stable/module_guides/models/llms/usage_custom.html#example-using-a-custom-llm-model-advanced
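For reference, the skeleton from that guide looks roughly like this. The imports assume a recent `llama_index.core` package layout (older releases imported from `llama_index.llms`), and the dummy response is a placeholder you'd swap for a real call to your model:

```python
from typing import Any

from llama_index.core.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback


class OurLLM(CustomLLM):
    """Skeleton for wrapping a locally hosted model, e.g. dolphin-mistral."""

    context_window: int = 4096   # what the model actually supports
    num_output: int = 256        # max tokens to generate per call
    model_name: str = "dolphin-mistral"
    dummy_response: str = "My response"  # placeholder; call your model instead

    @property
    def metadata(self) -> LLMMetadata:
        # LlamaIndex reads this to know how much text it can pack per call.
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # Replace the dummy with a real call to your model's API.
        return CompletionResponse(text=self.dummy_response)

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Naive streaming: emit the response one character at a time.
        response = ""
        for token in self.dummy_response:
            response += token
            yield CompletionResponse(text=response, delta=token)
```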
I recommend using ollama for the easiest setup for local envs 🙂 vLLM is a great choice for actual production deployment
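A minimal end-to-end sketch of that setup, assuming the `llama-index-llms-ollama` and `llama-index-embeddings-huggingface` packages, a local `ollama serve` running, and a `./data` folder of documents. The embedding override matters because embeddings otherwise default to OpenAI:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Route all LLM calls to a model served by a local `ollama serve`.
Settings.llm = Ollama(model="dolphin-mistral", request_timeout=120.0)
# Use local embeddings too, so nothing falls back to OpenAI.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

response = index.as_query_engine().query("What do these documents cover?")
print(response)
```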
Nope, Ollama has exactly this problem. The context window limitation is a pain in the ass. I've already talked to the creators.
Thanks. Will take a look.
The context window limitation is a problem with all LLMs, not just Ollama's; llama-index works around it though.

Most LLMs from ollama will have 4096-token context windows (as most open-source LLMs do these days)
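If your model genuinely supports a longer window, you can tell both LlamaIndex and ollama about it. A sketch, assuming your `llama-index-llms-ollama` version exposes `context_window` on the wrapper and forwards `additional_kwargs` to ollama's options:

```python
from llama_index.llms.ollama import Ollama

llm = Ollama(
    model="dolphin-mistral",
    # Assumption: tells LlamaIndex how much text to pack into each call.
    context_window=8192,
    # Assumption: forwarded as ollama's num_ctx option, which controls
    # how much context the server actually allocates.
    additional_kwargs={"num_ctx": 8192},
)
```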
Oh, so with LlamaIndex it would work without this context window limitation? That is awesome! It isn't possible with LangChain.
Yeah, llama-index has various ways to sneak around these limitations.

For example, the default response mode of a query engine is "compact and refine" -- pack each LLM call with as much text as possible, and refine an answer to the query across one or more LLM calls. There's a sketch of this below.
Hopefully you don't hit context window errors often with llama-index, and if you do, there's likely a setting that will fix it 🙂
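In code, the response mode is just a query-engine argument. A sketch ("compact" is already the default, shown explicitly here; "refine" and "tree_summarize" are the usual alternatives):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./data").load_data()
)

# "compact" packs each LLM call with as much retrieved text as fits,
# then refines the answer over further calls if it doesn't all fit.
query_engine = index.as_query_engine(response_mode="compact")
print(query_engine.query("Summarize the key points of these documents."))
```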
Great! This restores my hope in humankind.