
Updated 2 years ago

Hi LlamaIndex team maybe you have an

6 comments
hey! good/timely question, we're working on an integration/demo early next week 🙂
very nice! I'm also exploring https://github.com/go-skynet/LocalAI and would like to deploy it on AWS EC2 or Fargate spot instances; not sure yet how Ray can help here
@jerryjliu0 Hey Jerry, is this doable now?
I was trying to embed a lot of documents and it's very slow, so I was trying to use Ray for batch processing, but I'm having no luck.
https://github.com/amogkam/llama_index_ray/blob/main/create_vector_index.py
So I found this, but I'm running into an issue where it just stops:
Plain Text
2023-08-08 21:00:39,182 WARNING dataset.py:4390 -- The map, flat_map, and filter operations are unvectorized and can be very slow. If you're using a vectorized transformation, consider using .map_batches() instead.
2023-08-08 21:00:39,186 INFO streaming_executor.py:91 -- Executing DAG InputDataBuffer[Input] -> TaskPoolMapOperator[FlatMap->FlatMap] -> ActorPoolMapOperator[MapBatches(EmbedNodes)]
2023-08-08 21:00:39,187 INFO streaming_executor.py:92 -- Execution config: ExecutionOptions(resource_limits=ExecutionResources(cpu=None, gpu=None, object_store_memory=None), locality_with_output=False, preserve_order=False, actor_locality_enabled=True, verbose_progress=False)
2023-08-08 21:00:39,187 INFO streaming_executor.py:94 -- Tip: For detailed progress reporting, run ray.data.DataContext.get_current().execution_options.verbose_progress = True
2023-08-08 21:00:39,205 INFO actor_pool_map_operator.py:114 -- MapBatches(EmbedNodes): Waiting for 4 pool actors to start...
Running: 0.0/48.0 CPU, 0.0/4.0 GPU, 0.0 MiB/3.48 GiB object_store_memory:   0%|          | 0/200 [00:03<?, ?it/s]
2023-08-08 21:00:47,755 INFO streaming_executor.py:149 -- Shutting down <StreamingExecutor(Thread-4, stopped daemon 140500325627648)>.
2023-08-08 21:00:47,756 WARNING actor_pool_map_operator.py:264 -- To ensure full parallelization across an actor pool of size 4, the specified batch size should be at most 0. Your configured batch size for this operator was 100.
Storing Ray Documentation embeddings in vector index.
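
For anyone reading along: the log above shows the script embedding in batches of 100 across a pool of 4 workers. Below is a minimal sketch of that batch-and-parallelize pattern using only Python's standard library. It is not Ray and not the linked script's code. `batched`, `fake_embed_batch`, and `embed_all` are hypothetical names, and the embedding function is a placeholder you would swap for a real model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def fake_embed_batch(texts: List[str]) -> List[Vector]:
    """Hypothetical placeholder: replace with a real embedding model call."""
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]] = fake_embed_batch,
    batch_size: int = 100,
    workers: int = 4,
) -> List[Vector]:
    """Embed `texts` batch by batch, running up to `workers` batches at once."""
    batches = list(batched(texts, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input batches.
        results = pool.map(embed_batch, batches)
        return [vector for batch in results for vector in batch]


if __name__ == "__main__":
    docs = [f"doc {i}" for i in range(250)]
    vectors = embed_all(docs, batch_size=100, workers=4)
    print(len(vectors))
```

Threads only help when the embedding call spends its time waiting on I/O (for example an HTTP request) or in code that releases the GIL. For GPU-bound local models, a process or actor pool such as the one Ray sets up is the closer fit.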