This is probably a dumb question, or at the very least a very beginner question. I've been using Python to play and test with LangChain, RAG, and running an LLM locally (via HuggingFacePipeline/HuggingFaceHub). I'm curious...
  1. How might LlamaIndex relate or fit into this? I mean, is it a competing framework vs. LangChain? I see in the docs it can work with LangChain, so I'm not sure what the purpose of one vs. the other is.
  2. If you use LlamaIndex, can it replace LangChain or is it supposed to work with another framework for the rest of the pipeline? Like, LlamaIndex is focused on connecting to our data sources, but then you pass that info to a LangChain pipeline?
  3. Since I'm looking to run an LLM locally, Ollama has popped onto my radar. In my mental model, I'm not sure where it fits in. I don't think it's a framework like LlamaIndex/LangChain, but so far I haven't needed Ollama, so I'm not sure what "problem" it solves beyond what HuggingFacePipeline/HuggingFaceHub already give me for downloading and using models locally.
Thanks!
9 comments
  1. It's kind of competing? LlamaIndex has a much stronger focus on supporting apps that use RAG/context augmentation. We try to make it as easy as possible to get started, and easy to customize. LangChain (and LlamaIndex) also support the concept of tools. That's usually what people mean when they talk about using them together, i.e. creating a LangChain tool from a LlamaIndex query engine and using that in an agent (there's a sketch of this right after this list).
  2. It can definitely replace LangChain, IMO. But as described above, some people prefer using them together.
  3. Ollama is just another LLM. Both LlamaIndex and LangChain support using Ollama (and many others) as the LLM in your pipeline.
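For reference, here is a rough sketch of what "using them together" tends to look like. The ./data path, tool name, and llama2 model are just placeholders, and the exact imports depend on which llama-index/langchain versions you're on:

```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.llms import Ollama
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build a LlamaIndex query engine over some local documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Wrap the query engine as a LangChain tool
docs_tool = Tool(
    name="local_docs_qa",
    func=lambda q: str(query_engine.query(q)),
    description="Answers questions about the documents in ./data",
)

# Hand the tool to a LangChain agent (here driven by a local Ollama model)
agent = initialize_agent(
    [docs_tool],
    Ollama(model="llama2"),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
agent.run("What do my documents say about X?")
```

The query engine still needs an LLM and embedding model configured (via defaults or a service context), so treat this as the shape of the integration rather than a drop-in script.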
Ollama FEELS more like a Docker that only supports specific LLMs. You actually called it "another LLM" - so am I thinking about Ollama incorrectly? Their list of models https://ollama.ai/library makes me think of it less like a model and more like a platform for running specific LLMs.
That said, so far a coworker has run Ollama, and the responses are MUCH faster than anything I've downloaded via LangChain (regardless of model size). 🀷
Yeah, Ollama is basically an easy way to run several different models locally.

When I said Ollama is an LLM, I meant more in the abstraction sense

For example

```python
from llama_index.llms import Ollama

# Point LlamaIndex at a locally running Ollama server
llm = Ollama(model="llama2", request_timeout=300)
llm.complete("Hello!")
```
Is there a difference between using a model through Ollama vs. using it directly from HF? I'm asking because the Ollama response was so much faster, but you're also more limited in model selection. Plus, outside of the speed, it seems you're adding another service into the chain and relying on them to keep their models up to date, supported, etc.
Ollama is heavily optimized for running without a GPU; Hugging Face is not.

If you had a GPU, Hugging Face would likely be faster.
Ollama is a wrapper around llama.cpp
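For comparison, the direct-from-Hugging Face route in LlamaIndex looks roughly like this. The model name and parameters are just examples, and exact import paths and arguments depend on your llama-index version:

```python
from llama_index.llms import HuggingFaceLLM

# Loads the model with transformers directly; device_map="auto" uses a GPU
# if one is available, otherwise it falls back to (much slower) CPU inference.
llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    tokenizer_name="HuggingFaceH4/zephyr-7b-beta",
    max_new_tokens=256,
    device_map="auto",
)
print(llm.complete("Hello!"))
```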
It is true, it's adding another dependency πŸ™‚ Usually I use Ollama for local testing, and something more prod-ready for deployment (Hugging Face, vLLM, TGI).
Gotcha. Thanks for all your help @Logan M!