Updated 2 months ago

jeremy analytics 8398 ravitheja 0475

Thanks for your input! 95% of my documents are PowerPoints, so I was planning on chunking slide by slide and generating an embedding per slide. Is that the same concept as using sentence transformers?

My main question, though, is what the GPT-Index index structure should be. Because of the vast amount of data, would I need to go in a multi-level tree direction? Would this hinder performance?
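For reference, the slide-by-slide chunking idea could look roughly like the sketch below. It assumes the slide text has already been extracted into a list of strings (one per slide), and `toy_embed` is a hypothetical stand-in for a real embedding model, not any library's API. `SlideChunk` and `chunk_deck` are made-up names too.

```python
import hashlib
import math
from dataclasses import dataclass
from typing import List


@dataclass
class SlideChunk:
    deck: str               # source file name, kept as metadata
    slide_number: int       # 1-based position in the deck
    text: str               # extracted slide text
    embedding: List[float]  # one vector per slide


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Hypothetical placeholder: hashed bag-of-words, L2-normalised.

    Swap this for a real embedding model in practice.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def chunk_deck(deck: str, slide_texts: List[str]) -> List[SlideChunk]:
    """Create one chunk (and one embedding) per non-empty slide."""
    chunks = []
    for number, text in enumerate(slide_texts, start=1):
        text = text.strip()
        if not text:
            continue  # skip slides with no extractable text
        chunks.append(SlideChunk(deck, number, text, toy_embed(text)))
    return chunks
```

Keeping the deck name and slide number on each chunk means a retrieved result can be traced back to the exact slide it came from.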
7 comments
we can continue discussing here so as not to overcrowd the main channel if you'd like
I replied there itself
no problem, thanks for the reply
but maybe you could simply start with a simple vector index and see how the results are.
yeah i think that's a good plan. i'll start simple and make it more robust until the results are satisfactory
i think ANN is pretty good for these vector stores. you should try the "naive" approach and then refine as necessary. simple is always easier to maintain 🙂
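A minimal sketch of what the "naive" approach could mean here, assuming it refers to an exact scan: score every stored vector against the query with cosine similarity and keep the best k. The function names are illustrative only. An approximate nearest neighbor (ANN) index would replace the full scan once the collection gets too large for it.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Exact search: compare the query to every vector, return (index, score) pairs."""
    scored = [(cosine(query, vec), idx) for idx, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]
```

The cost grows linearly with the number of vectors, so this works best as a baseline: if results from a later ANN setup differ a lot from the exact scan, that points at the index settings rather than the embeddings.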