Hello, I am hitting a rate limit when I generate questions with:

```python
from llama_index.llama_dataset.generator import RagDatasetGenerator

dataset_generator = RagDatasetGenerator.from_documents(
    service_context=service_context,
    documents=documents,
    num_questions_per_chunk=2,  # number of questions per node
    show_progress=True,
)
print(dataset_generator.generate_dataset_from_nodes())
```
I am NOT facing that problem when creating the indexes (I changed the batch size). Looking at the logs, the requests seem to go out in parallel and do not wait for each other.
This is actually fixed if you update llama-index. There is a `workers` kwarg in the constructor that limits the number of in-flight async calls at any given time.
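For intuition, the effect of a `workers` cap like that can be sketched with a plain `asyncio.Semaphore`. This is a minimal illustration of the pattern, not llama-index's actual implementation; `run_with_workers` and `fake_api_call` are hypothetical names:

```python
import asyncio

async def run_with_workers(coros, workers=4):
    """Await all coroutines, but allow at most `workers` in flight at once."""
    sem = asyncio.Semaphore(workers)

    async def guarded(coro):
        # Each task must acquire the semaphore before running,
        # so no more than `workers` API calls overlap.
        async with sem:
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))

# Track peak concurrency with a fake "API call" to show the cap works.
state = {"active": 0, "peak": 0}

async def fake_api_call(i):
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # stand-in for network latency
    state["active"] -= 1
    return i

results = asyncio.run(
    run_with_workers([fake_api_call(i) for i in range(8)], workers=2)
)
```

With `workers=2`, the eight calls complete in order but no more than two are ever active simultaneously, which is what keeps you under the provider's rate limit.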