Hello Folks πŸ€—

I have a truly off-topic and random question... πŸ˜“
I know it's a basic inference question, but I'd like to know the best way to run inference in batches.

The overall goal

-> I am using the t5-small model for a summarization task.
-> I have around 10 to 15 different paragraphs to summarize in a single call.

πŸ‘¨β€πŸ’» The code I am using right now

It is a generic loop, but I expect there is room for optimization here:
Plain Text
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

points = [
    "summarize: ABC...",
    "summarize: CBA...",
    "summarize: ERG...",
    "summarize: RAG...",
]

summaries = []
for point in points:
    # Encode one paragraph at a time, truncating to the model's 512-token limit
    input_ids = tokenizer.encode(point, return_tensors="pt", max_length=512, truncation=True)
    # Sample a summary for this single input
    output_ids = model.generate(input_ids, max_length=256, temperature=0.35, do_sample=True)
    summaries.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))

This one takes time, as expected.

😯 I have tried this...

Plain Text
# Passing the input ids in a single batch
# (truncation=True is needed for max_length to actually apply)
ids = tokenizer(points, return_tensors="pt", max_length=512, padding="longest", truncation=True)
response = model.generate(**ids, max_length=256, temperature=0.35, do_sample=True)
summaries = tokenizer.batch_decode(response, skip_special_tokens=True)

But I am worried that the model will connect the paragraphs internally and leak information between them. I am not sure, but this way is significantly faster than the loop.
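
For context, here is a quick sanity check I sketched, assuming greedy decoding (do_sample=False) so the single and batched runs are deterministic and comparable:
Plain Text
# Sketch of a sanity check: summarize one paragraph alone,
# then summarize the same paragraph inside a batch.
single_ids = tokenizer(points[0], return_tensors="pt", max_length=512, truncation=True)
single_out = model.generate(**single_ids, max_length=256, do_sample=False)

batch_ids = tokenizer(points, return_tensors="pt", max_length=512, padding="longest", truncation=True)
batch_out = model.generate(**batch_ids, max_length=256, do_sample=False)

# If nothing leaks between inputs, these should decode to the same text
print(tokenizer.decode(single_out[0], skip_special_tokens=True))
print(tokenizer.decode(batch_out[0], skip_special_tokens=True))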

πŸ€” My Ask

Am I doing it right? How do I perform batch inference so that it is fast and the inputs are NOT talking to each other?

Is there any other way to increase the speed? (Or would I just need to use threading?)

Thanks!
4 comments
The way you tried batching it is correct. The information will not leak; you can think of it as running all the paragraphs through parallel copies of the model.
It will use more memory and will be slower than a single paragraph, but that's the only downside.
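If memory does become a problem, one option is to split the inputs into smaller mini-batches. A rough sketch (the chunk size here is just an example to tune to your hardware):
Plain Text
# Process the inputs in mini-batches to cap peak memory
batch_size = 4  # example value; tune to your hardware
summaries = []
for i in range(0, len(points), batch_size):
    chunk = points[i : i + batch_size]
    ids = tokenizer(chunk, return_tensors="pt", max_length=512, padding="longest", truncation=True)
    out = model.generate(**ids, max_length=256, temperature=0.35, do_sample=True)
    summaries.extend(tokenizer.batch_decode(out, skip_special_tokens=True))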
The most meaningful speedups will come from running on a GPU.
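Something along these lines, as a minimal sketch assuming a CUDA device is available:
Plain Text
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Inputs must live on the same device as the model
ids = tokenizer(points, return_tensors="pt", max_length=512, padding="longest", truncation=True).to(device)
output_ids = model.generate(**ids, max_length=256, temperature=0.35, do_sample=True)
summaries = tokenizer.batch_decode(output_ids, skip_special_tokens=True)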
That's an incredible and complete answer, mate! Thank you so much πŸ€—