
Updated 5 months ago

I'm trying to index an entire codebase

At a glance

The community member is trying to index a large codebase (200+ repositories) using LlamaIndex, but parsing the documents into nodes is taking a long time. They believe the splitter is slow because it runs serially and ask whether a parallelized implementation is available. The comments suggest that there is no parallelized implementation yet, but the community member could parse each repository manually in its own thread. Another comment mentions a resource on the llamahub.ai website that may be helpful. One community member notes that reading the documents is faster in a single thread, and that indexing is the step taking most of the time.

I'm trying to index an entire codebase (200+ repositories) with LlamaIndex, and parsing documents into nodes takes forever. I believe the splitter is slow because it runs serially. Is there a parallelised implementation available?
4 comments
Not yet, although that's good feedback. You could parse each repo manually in its own thread, though.
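A minimal sketch of what handling each repo manually could look like, using only the Python standard library. This is not LlamaIndex code: `split_text`, `parse_repo`, and `parse_repos` are hypothetical names, and `split_text` is a stand-in for whatever splitter you actually use. It only shows the fan-out pattern of one task per repository.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable


def split_text(text: str, chunk_lines: int = 40) -> list[str]:
    """Hypothetical stand-in for a real splitter: fixed-size chunks of lines."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + chunk_lines])
        for i in range(0, len(lines), chunk_lines)
    ]


def parse_repo(repo_path: str, pattern: str = "*.py") -> list[str]:
    """Read every matching file in one repository and split it into chunks."""
    chunks: list[str] = []
    for path in sorted(Path(repo_path).rglob(pattern)):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            chunks.extend(split_text(text))
    return chunks


def parse_repos(
    repo_paths: Iterable[str],
    max_workers: int = 8,
    executor_cls=ThreadPoolExecutor,
) -> dict[str, list[str]]:
    """Parse repositories concurrently, one task per repository."""
    paths = list(repo_paths)
    with executor_cls(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with paths.
        return dict(zip(paths, pool.map(parse_repo, paths)))
```

If the splitting itself is CPU-bound, threads may not help much in CPython because of the global interpreter lock. In that case, passing `executor_cls=ProcessPoolExecutor` (also from `concurrent.futures`) and calling `parse_repos` from under an `if __name__ == "__main__":` guard is the usual alternative.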
Yes. What I've seen is that reading the documents is faster in a single thread, but indexing is what takes a lot of time.
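Before parallelising anything, it may be worth timing each stage separately to confirm where the time goes. A rough sketch, assuming a three-stage pipeline; the `timed` helper is hypothetical and the read, split, and index bodies are placeholders to swap for your real calls:

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, timings: dict[str, float]):
    """Accumulate wall-clock seconds spent inside the block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


timings: dict[str, float] = {}

with timed("read", timings):
    # Placeholder: replace with your real document loading.
    documents = ["def f():\n    return 1\n"] * 1000

with timed("split", timings):
    # Placeholder: replace with your real splitter.
    nodes = [line for doc in documents for line in doc.splitlines()]

with timed("index", timings):
    # Placeholder: replace with your real index construction.
    index = {i: node for i, node in enumerate(nodes)}

slowest = max(timings, key=timings.get)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
print(f"slowest stage: {slowest}")
```

Whichever stage dominates is the one worth parallelising or batching first.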
Could you please let me know how this is helpful?