
Updated 4 months ago

Poor Performance

At a glance
Same! Over a minute for a response with 1 or 2 nodes passed. I have GBs of data and I have used GPTSimpleVectorIndex so far. Today I initialized a Pinecone index to check performance. I was also wondering if RAM could be an issue, since the index is quite heavy to keep in memory. Glad to hear someone else has the same issues. Let's keep each other updated on improvements!
36 comments
I wonder what other indices you have tried
I think we need some benchmarking on the performance.
I already utilized async and optimization, but it doesn't really help much.
ListIndex and TreeIndex are too expensive for my use case, and the SimpleVectorIndex worked pretty well during testing (i.e. with few docs). I noticed that the top-k similarity setting and prompt engineering have the most impact on response time. To be fast enough, though, it needs to produce just a few tokens, and that means short answers, which are not always the best.
I used async as well, same results as you
What chunk size are you using? 1200 with 200 of overlap is maybe too much 🧐
yeah I'm using the default
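A rough way to reason about the chunk size question above: the sketch below estimates how many chunks a document splits into and how many retrieved tokens end up in the prompt for a given top-k. The function names and the 10,000-token document are hypothetical, and the arithmetic is a simplification, not the actual splitter logic of any library.

```python
import math


def chunk_count(total_tokens: int, chunk_size: int, overlap: int) -> int:
    """Number of overlapping chunks needed to cover total_tokens."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if total_tokens <= chunk_size:
        return 1
    stride = chunk_size - overlap  # new tokens contributed by each extra chunk
    return math.ceil((total_tokens - overlap) / stride)


def context_tokens(chunk_size: int, top_k: int) -> int:
    """Upper bound on retrieved tokens placed in the prompt."""
    return chunk_size * top_k


doc_tokens = 10_000  # hypothetical document length
for size, overlap in [(1200, 200), (512, 50)]:
    print(
        f"chunk_size={size} overlap={overlap} "
        f"chunks={chunk_count(doc_tokens, size, overlap)} "
        f"context_tokens(top_k=2)={context_tokens(size, top_k=2)}"
    )
```

Larger chunks mean fewer embeddings to store, but every retrieved node adds up to a full chunk to the prompt, so chunk size and top-k together bound how much text the LLM has to read per query.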
I'm trying with faiss and would like to try other vector stores to see if they are better
have you tried langchain?
it's super fast
but I need the streaming and it's not friendly
I actually started with langchain but I gave it up for llama index lol
With langchain I used to get poor quality results, but I was just starting, so maybe it's worth giving it a shot
Even though gpt index is very friendly and now I've gotten familiar with it
oh for real?
haha. I started with llama but got to try langchain yesterday
It has many more data loaders
which is a huge plus for my use case
Oh really? I feel like llama index has loaders for nearly everything lol
Hey Logan.

I have a question about design choice.

Say I wanna query about 3 docs. First I need to construct an index, and then query. For construction, is it better to use a directory loader to load all three and output one index, or to build an index for each individually and then use a list index to compose them?
It really just depends. If the three documents are very clearly about different topics, then making 3 indexes and wrapping them with a list/keyword/vector index makes sense


But if all the documents cover similar information, one index would be better
That's just how I would approach it anyways
Yeah. Performance overhead is my primary concern, and from my experiments, composable indices run much slower
If you use a list index at the top level, it will check every sub index, so yea pretty slow

For speed, I would use a keyword or vector index at the top. You just need to generate a summary for each sub index for that to work (either using the LLM, or maybe you have access to one ahead of time)
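To illustrate why the top-level choice matters, here is a toy sketch in plain Python. It does not use the real library; `SubIndex`, `query_list_top`, and `query_keyword_top` are hypothetical names, and the word-overlap scoring is a stand-in for whatever a real keyword or vector index does with the summaries.

```python
import re
from dataclasses import dataclass


@dataclass
class SubIndex:
    """Stand-in for one per-document index with a short summary."""
    name: str
    summary: str
    calls: int = 0

    def query(self, question: str) -> str:
        self.calls += 1  # count how often this sub index is hit
        return f"[{self.name}] answer to: {question}"


def _words(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def query_list_top(subs, question):
    """List-style top level: every sub index is queried."""
    return [s.query(question) for s in subs]


def query_keyword_top(subs, question, top_n=1):
    """Keyword-style top level: route to the best-matching summaries only."""
    q = _words(question)
    ranked = sorted(subs, key=lambda s: len(q & _words(s.summary)), reverse=True)
    return [s.query(question) for s in ranked[:top_n]]


def make_subs():
    return [
        SubIndex("finance", "quarterly revenue and finance report"),
        SubIndex("hr", "employee onboarding and vacation policy"),
        SubIndex("api", "api reference for the billing service"),
    ]


question = "How many vacation days do new employees get?"

subs = make_subs()
query_list_top(subs, question)
print("list top:", {s.name: s.calls for s in subs})

subs = make_subs()
query_keyword_top(subs, question)
print("keyword top:", {s.name: s.calls for s in subs})
```

With a list at the top, the cost grows with the number of sub indexes, since each one triggers its own retrieval and LLM call. Routing by summary keeps it to the few sub indexes that look relevant, at the price of needing good summaries.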
From your experience, is langchain also faster with a large amount of data stored?
it depends. but I think faiss is fast
I switched from simple vec store to faiss right now
Glad to hear that! Out of curiosity, how many seconds does a response take on average?
I'm still benchmarking. I'll update later
Btw, I found that loading PDFs is extremely slow
any API that can handle this extremely fast?
(You might want to benchmark the embeddings and LLM separately. If you use `response_mode="no_text"` in your query, it will only fetch the closest nodes and skip calling the LLM to generate text)
PDF loading is done using PyPDF. There are other packages that might have better performance though. I remember seeing a GitHub repo somewhere that benchmarked all the Python PDF libraries
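A minimal sketch of that kind of split benchmark, using only the standard library. `fake_retrieve` and `fake_full_query` are hypothetical stand-ins that just sleep; with a real index you would pass in a retrieval-only call (such as a query with `response_mode="no_text"`, as mentioned above) and a full query instead.

```python
import time
from statistics import mean


def time_call(fn, *args, repeats=3, **kwargs):
    """Average wall-clock seconds for fn(*args, **kwargs) over several runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Hypothetical stand-ins: replace with calls into your own index.
def fake_retrieve(question):
    time.sleep(0.01)  # pretend: embed the question and fetch nearest nodes
    return ["node-1", "node-2"]


def fake_full_query(question):
    fake_retrieve(question)
    time.sleep(0.05)  # pretend: LLM generates the answer from the nodes
    return "answer"


retrieval_s = time_call(fake_retrieve, "example question")
total_s = time_call(fake_full_query, "example question")
print(f"retrieval: {retrieval_s:.3f}s")
print(f"LLM (approx, total minus retrieval): {total_s - retrieval_s:.3f}s")
```

If retrieval dominates, a different vector store or a smaller index is worth trying. If the LLM portion dominates, the levers are prompt size (chunk size times top-k) and the length of the generated answer.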
Thanks Logan! I'll check it out.
@Ray Li hey, did you find anything in terms of performance? How much time does your index need to respond?