Custom LLM

I'm taking the LlamaIndex course on deeplearning.ai, and my question is whether it's possible to build a RAG with an open-source LLM like Dolphin Mistral. All the examples I can find so far use OpenAI, and there's no information on building RAGs with LlamaIndex and other models. I'd appreciate any link or suggestion that helps me clarify this. Thanks!
9 comments
Yes, if it's not implemented, you can implement it yourself and use it with LlamaIndex.

We have a base LLM module that facilitates a lot of this:

https://docs.llamaindex.ai/en/stable/module_guides/models/llms/usage_custom.html#example-using-a-custom-llm-model-advanced
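For reference, the skeleton from that guide looks roughly like this. The imports assume a recent `llama_index.core` package layout (older releases imported from `llama_index.llms`), and the dummy response is a placeholder you'd swap for a real call to your model:

```python
from typing import Any

from llama_index.core.llms import (
    CustomLLM,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
)
from llama_index.core.llms.callbacks import llm_completion_callback


class OurLLM(CustomLLM):
    """Skeleton for wrapping a locally hosted model, e.g. dolphin-mistral."""

    context_window: int = 4096   # what the model actually supports
    num_output: int = 256        # max tokens to generate per call
    model_name: str = "dolphin-mistral"
    dummy_response: str = "My response"  # placeholder; call your model instead

    @property
    def metadata(self) -> LLMMetadata:
        # LlamaIndex reads this to know how much text it can pack per call.
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        # Replace the dummy with a real call to your model's API.
        return CompletionResponse(text=self.dummy_response)

    @llm_completion_callback()
    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        # Naive streaming: emit the response one character at a time.
        response = ""
        for token in self.dummy_response:
            response += token
            yield CompletionResponse(text=response, delta=token)
```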
I recommend using ollama for the easiest setup for local envs 🙂 vLLM is a great choice for actual production deployment
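A minimal end-to-end sketch of that setup, assuming the `llama-index-llms-ollama` and `llama-index-embeddings-huggingface` packages, a local `ollama serve` running, and a `./data` folder of documents. The embedding override matters because embeddings otherwise default to OpenAI:

```python
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Route all LLM calls to a model served by a local `ollama serve`.
Settings.llm = Ollama(model="dolphin-mistral", request_timeout=120.0)
# Use local embeddings too, so nothing falls back to OpenAI.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

response = index.as_query_engine().query("What do these documents cover?")
print(response)
```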
Nope, Ollama has exactly this problem. The context window limitation is a pain in the ass. I've already talked to the creators.
Thanks. Will take a look.
The context window limitation is a problem with all LLMs, not just Ollama's; llama-index works around it though.

Most LLMs from ollama will have 4096-token context windows (as most open-source LLMs do these days)
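If your model genuinely supports a longer window, you can tell both LlamaIndex and ollama about it. A sketch, assuming your `llama-index-llms-ollama` version exposes `context_window` on the wrapper and forwards `additional_kwargs` to ollama's options:

```python
from llama_index.llms.ollama import Ollama

llm = Ollama(
    model="dolphin-mistral",
    # Assumption: tells LlamaIndex how much text to pack into each call.
    context_window=8192,
    # Assumption: forwarded as ollama's num_ctx option, which controls
    # how much context the server actually allocates.
    additional_kwargs={"num_ctx": 8192},
)
```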
Oh, so with LlamaIndex it would work without this context window limitation? That is awesome! It isn't possible with LangChain.
Yeah, llama-index has various ways to sneak around these limitations.

For example, the default response mode of a query engine is "compact and refine" -- pack each LLM call with as much text as possible, and refine an answer to the query across one or more LLM calls. There's a sketch of this below.
Hopefully you don't hit context window errors often with llama-index, and if you do, there's likely a setting that will fix it 🙂
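In code, the response mode is just a query-engine argument. A sketch ("compact" is already the default, shown explicitly here; "refine" and "tree_summarize" are the usual alternatives):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./data").load_data()
)

# "compact" packs each LLM call with as much retrieved text as fits,
# then refines the answer over further calls if it doesn't all fit.
query_engine = index.as_query_engine(response_mode="compact")
print(query_engine.query("Summarize the key points of these documents."))
```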
Great! This restores my hope in humankind.