

Batch

At a glance

The community members are discussing the possibility of using the LlamaIndex LLM class or other LlamaIndex abstractions to access the OpenAI/Anthropic Message Batching API. One community member suggests that the current abstractions are built around real-time interactions, but another community member believes they have a good use case for it.

The community members then discuss a proposed solution that involves submitting a batch during a pipeline process and then checking the batch status to update the nodes. The advantages mentioned are cost savings and the ability to handle offline/asynchronous scenarios where the computer can be shut off or the code can be interrupted without issue.

The proposed solution would be entirely stateless and use node IDs as batch job IDs to track the processing status of each node. This would allow the user to submit the batch and then check on it later, without needing to keep a Python script running for days.

There is no explicitly marked answer in the comments, but the community members seem to be collaborating on a potential solution to the original question.

Hey all. Is there a way to use the LlamaIndex LLM class (or other LlamaIndex abstractions) to access the OpenAI/Anthropic Message Batching API?
Not really. All the abstractions are built around real-time interactions 🤔
I think I have a pretty good use case for it.
I would just use the raw openai client. Or if you want to make a PR, I can review it 🙂
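For anyone who goes the raw-client route, a minimal sketch with the OpenAI Batch API might look roughly like this (the file name, model, and chunk texts are placeholders, not anything from the thread):
Python
import json
from openai import OpenAI

client = OpenAI()

# One JSONL line per request; custom_id lets you match results back to nodes later.
requests = [
    {
        "custom_id": f"node-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": text}],
        },
    }
    for i, text in enumerate(["chunk one", "chunk two"])
]
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(json.dumps(r) for r in requests))

# Upload the file and submit the batch (24h completion window, ~50% cheaper).
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Later, possibly from a completely fresh process, poll and collect results.
batch = client.batches.retrieve(batch.id)
if batch.status == "completed":
    output = client.files.content(batch.output_file_id).text  # JSONL of responses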
Maybe!! But one PR at a time, eh? Here's my proposed use case/interface:
Python
# Part 1: Submit batch during pipeline.
# Remember: a DocumentContextExtractor can literally take hours to run,
# or days in the extreme case. Batch processing can cut costs by 50%!

extractor = DocumentContextExtractor(
    docstore=docstore,
    llm=llm,
    mode="submit_batch"
)

# Must be last transform in pipeline
index.update_nodes(transforms=[transform_A, transform_B, transform_C, extractor])
index.persist(...)


# Part 2: Check batch status and update nodes

# first load index
index = ...

extractor.set_mode("process_batch")
while not extractor.is_batch_complete():
    num_completed = index.update_nodes(transforms=[extractor])
    # Maybe return number of nodes updated for user feedback
    time.sleep(...)  # User controls polling frequency

# After this, user can do whatever they want with their context-enabled nodes
I guess the advantage here is just cost savings? Otherwise async calls would achieve something similar, right?

Yea. Would need to be added to the LLM interface for LLMs that support that; the concept doesn't quite exist yet in the codebase.
(The above code requires either raw API calls under the hood using the openai library, or an addition to the LLM interface to do batch processing.)
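For what it's worth, a rough sketch of the kind of LLM-interface addition being discussed might look like this. The class and method names here are hypothetical; nothing like this exists in LlamaIndex today:
Python
from typing import Dict, List, Optional, Sequence

from llama_index.core.llms import ChatMessage


class BatchCapableLLM:
    """Hypothetical mixin for LLMs whose providers offer a batch API."""

    def submit_chat_batch(
        self, conversations: Sequence[List[ChatMessage]], custom_ids: Sequence[str]
    ) -> str:
        """Submit many chat requests at once and return a provider batch ID."""
        raise NotImplementedError

    def get_batch_results(self, batch_id: str) -> Optional[Dict[str, str]]:
        """Return {custom_id: completion text} once the batch finishes, else None."""
        raise NotImplementedError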
Yep yep. Cost, and also offline/asynchronous.
The computer can be shut off or the code can crash/get interrupted and it's fine.
My proposed solution above would be entirely stateless too. It can use the node ID as the batch job ID, and just check which nodes are already processed (they have a 'context' key), which ones are currently waiting for processing, and which ones are ready.
With this approach you don't need to keep a Python script running for days. You just submit, then check again in a day or two.
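A minimal sketch of that stateless pass, assuming the batch requests were keyed by node ID and the extracted context is written to a 'context' metadata key (the helper below is hypothetical, not part of DocumentContextExtractor):
Python
from typing import Dict, Sequence

from llama_index.core.schema import BaseNode


def apply_batch_results(nodes: Sequence[BaseNode], results: Dict[str, str]) -> int:
    """Write finished batch results into node metadata; return how many nodes are still pending."""
    pending = 0
    for node in nodes:
        if "context" in node.metadata:
            continue  # already processed on an earlier pass
        if node.node_id in results:
            node.metadata["context"] = results[node.node_id]
        else:
            pending += 1  # still waiting on the batch
    return pending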