A community member is trying to index a large codebase (200+ repositories) with LlamaIndex, but parsing the documents into nodes is taking a very long time. They believe the splitter is slow because it runs serially and ask whether a parallelized implementation is available. The comments say that no parallelized implementation exists yet, but that the community member could parse each repository manually in its own thread. Another comment points to a resource on the llamahub.ai website that may be helpful. One community member notes that reading the documents is faster in a single thread, but that the indexing step is taking a lot of time.
I'm trying to index an entire codebase (200+ repositories) with LlamaIndex, and parsing documents into nodes takes forever. I believe the splitter is slow because it runs serially. Is there a parallelised implementation available?
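For what it's worth, here is a minimal sketch of the "parse each repository manually in parallel" idea from the comments. It is not a LlamaIndex feature and does not call any LlamaIndex API: `split_into_chunks`, `parse_repo`, and `parse_repos_in_parallel` are hypothetical names, and the fixed-size line splitter is only a stand-in for whatever splitter you actually use. It assumes each repository has already been read into a dict of file path to file text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def split_into_chunks(text: str, chunk_lines: int = 40) -> List[str]:
    """Hypothetical stand-in for a real splitter: fixed-size windows of lines."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + chunk_lines])
        for i in range(0, len(lines), chunk_lines)
    ]


def parse_repo(files: Dict[str, str]) -> List[str]:
    """Split every file of ONE repository; this is the unit of work per worker."""
    chunks: List[str] = []
    for path in sorted(files):  # sorted for a deterministic chunk order
        chunks.extend(split_into_chunks(files[path]))
    return chunks


def parse_repos_in_parallel(
    repos: Dict[str, Dict[str, str]],
    max_workers: int = 4,
    executor_cls=ThreadPoolExecutor,
) -> Dict[str, List[str]]:
    """Run parse_repo over all repositories concurrently, keyed by repo name."""
    names = list(repos)
    with executor_cls(max_workers=max_workers) as pool:
        # map() yields results in the same order as the inputs.
        results = pool.map(parse_repo, (repos[name] for name in names))
        return dict(zip(names, results))
```

If the splitting is CPU-bound pure Python, threads may give little speedup because of the GIL. In that case, passing `concurrent.futures.ProcessPoolExecutor` as `executor_cls` (and calling the function under an `if __name__ == "__main__":` guard) is one option to try, since `parse_repo` is a module-level function and its inputs are plain dicts that can be pickled.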