Updated 3 months ago

Entity

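# Imports assume the legacy llama_index 0.9.x package layout (the one that still has ServiceContext)
from llama_index import ServiceContext, SimpleDirectoryReader, VectorStoreIndex
from llama_index.extractors import EntityExtractor
from llama_index.ingestion import IngestionPipeline
from llama_index.llms import OpenAI
from llama_index.node_parser import SentenceSplitter

# The entity model is pinned to the CPU here
entity_extractor = EntityExtractor(prediction_threshold=0.2, label_entities=False, device="cpu")
node_parser = SentenceSplitter(chunk_overlap=200, chunk_size=2000)
transformations = [node_parser, entity_extractor]

documents = SimpleDirectoryReader(input_dir=r"Text_Files").load_data()

# Split each document into chunks, then run entity extraction on every chunk
pipeline = IngestionPipeline(transformations=transformations)
nodes = pipeline.run(documents=documents)

# embed_model is defined elsewhere (not shown in this post)
service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo", temperature=0), embed_model=embed_model)
index = VectorStoreIndex(nodes, service_context=service_context)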
Can I speed up this entity extraction process? It's very slow. Takes about an hour or so for 300 files.
The entity model is running on CPU (device="cpu"), so that will be the main reason it's slow.
There isn't really any way to speed it up besides running the entity model on a GPU.
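If a GPU is available, one way to act on this is to pick the device at runtime instead of hard-coding "cpu". The helper below is only a sketch: `pick_device` is a made-up name, and it assumes the entity model runs on PyTorch and that a CUDA-enabled PyTorch build is installed. It falls back to "cpu" when either assumption does not hold.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu".

    pick_device is a hypothetical helper, not part of LlamaIndex.
    """
    try:
        import torch  # assumed backend of the entity model
    except ImportError:
        # No PyTorch installed: nothing to accelerate, stay on the CPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"entity model device: {device}")

# Then pass it where the original snippet hard-codes "cpu":
# entity_extractor = EntityExtractor(prediction_threshold=0.2, label_entities=False, device=device)
```

On a machine without a GPU this changes nothing, so the same script can run in both environments.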