Updated last month

What's the benefit of using LlamaIndex?

What's the benefit of using LlamaIndex vs comparing vectors using Cosine Similarity?
11 comments
GPT Index provides a few key functionalities and data structures on top of raw vector lookups, using LLMs to augment the results.

You can store documents in multiple indexes, combine them in different ways, and generally provide a lot of ways to get answers to things using your data.

For example, the VectorIndex finds the most relevant document chunks in an index using vector similarity, then feeds that information to an LLM to generate a natural language response using that context information. Or, you can customize the prompt to get the LLM to do whatever you want!
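Under the hood, the retrieval step being described is just nearest-neighbor search over embedding vectors. A minimal sketch in plain Python — toy 3-d vectors and illustrative function names, not the actual LlamaIndex internals:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(query_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Toy "embeddings" standing in for real model output
chunks = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [0.9, 0.1, 0.0]
print(top_k_chunks(query, chunks))  # → [0, 1]
```

The selected chunks would then be stuffed into the LLM prompt as context for answer synthesis — that synthesis step is the part a bare vector search doesn't give you.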

This page has details on each index and what it does: https://gpt-index.readthedocs.io/en/latest/guides/index_guide.html
Compared to just using vector indexes, I think Llama index provides a lot of useful tools (and Llama index will improve as LLMs get better too!)
@Logan M Thanks for the reply!
I've read the guide and I understand the benefit of the data structure hierarchy. But for the very simple case of creating a single VectorIndex with all of my data in one place, will Llama index perform any better than plain cosine similarity?

Llama index is very tempting to build on, but the main obstacle is that I need a different storage backend: something other than keeping indexes in memory at query time or persisting them to disk as JSON objects.

I'd like to use a SQL database as a backend or make use of Elasticsearch as a backend even if it means less efficient querying.
Since Llama index already uses cosine similarity, the only meaningful differences will be which embedding model you use and how you chunk/divide the documents in your index.
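The chunking side of that can be as simple as fixed-size splits with some overlap, so context isn't lost at chunk boundaries. A rough sketch — the sizes here are illustrative, not LlamaIndex defaults:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size character chunks."""
    step = chunk_size - overlap
    return [
        text[i:i + chunk_size]
        for i in range(0, max(len(text) - overlap, 1), step)
    ]

doc = "x" * 500
print(len(chunk_text(doc)))  # → 3 overlapping chunks
```

Smaller chunks give more precise matches but less context per chunk; the right trade-off depends on your data and embedding model.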

Llama index already supports some pretty efficient vector databases, and I would recommend using one if you are dealing with a lot of data -> https://gpt-index.readthedocs.io/en/latest/how_to/vector_stores.html

Also keep in mind: if all you want is the text chunk with the best similarity, you can set response_mode="no_text" in the query call.
Great! thanks a lot!

Llama index already supports some pretty efficient vector databases, and I would recommend using one if you are dealing with a lot of data

Any specifics you'd recommend? I'm totally out of the world of vector databases so I don't know any reputable ones. Big benefit for me if it's available on AWS!
Tbh it's also pretty new to me haha! But I know pinecone is pretty popular, plus it's built by the same people that made AWS Sagemaker, so it should have nice AWS support
(also sorry for being super lazy here πŸ˜‚) I'm just trying to take a shortcut
Great! thanks a lot for your help
Haha no worries, good luck! πŸ’ͺ
we're using Pinecone in production right now and so far they've been great @walid
Great to hear! will give it a try tomorrow and let you know what I think