
Streaming

Stream responses token by token as they are generated, instead of waiting for the full completion.

Basic Streaming

for chunk in ullm.completion(
    model="gpt-4o-mini",
    messages=[...],
    stream=True,
):
    # delta.content may be None on some chunks (e.g. the final one), so guard it
    print(chunk.choices[0].delta.content or "", end="", flush=True)
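A common pattern is to reassemble the full reply from the stream. The sketch below assumes chunks follow the OpenAI-style shape shown above (`chunk.choices[0].delta.content`); it uses stand-in chunk objects rather than a live call, since the exact objects `ullm` yields may differ:

```python
from types import SimpleNamespace

def collect_text(chunks):
    """Join the delta content of streamed chunks into one string."""
    parts = []
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content:  # skip chunks whose delta has no content (often the last one)
            parts.append(content)
    return "".join(parts)

# Stand-in chunks mimicking the OpenAI-style streaming shape.
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hel", "lo", None]
]
print(collect_text(fake_chunks))  # prints: Hello
```

In real code you would pass the iterator returned by `ullm.completion(..., stream=True)` straight into `collect_text`.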

Async Streaming

async for chunk in await ullm.acompletion(
    model="gpt-4o-mini",
    messages=[...],
    stream=True,
):
    # Same guard as the sync case: delta.content may be None on some chunks
    print(chunk.choices[0].delta.content or "", end="", flush=True)
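The async variant collects the stream the same way, just with `async for`. This is a self-contained sketch using a simulated async chunk stream (again assuming the OpenAI-style chunk shape, not the exact objects `ullm` yields):

```python
import asyncio
from types import SimpleNamespace

async def fake_stream():
    # Stand-in async generator mimicking an OpenAI-style chunk stream.
    for c in ["Hel", "lo", None]:
        yield SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])

async def collect_text(stream):
    """Join the delta content of an async chunk stream into one string."""
    parts = []
    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:  # skip empty/None deltas
            parts.append(content)
    return "".join(parts)

print(asyncio.run(collect_text(fake_stream())))  # prints: Hello
```

With a real stream you would pass `await ullm.acompletion(..., stream=True)` into `collect_text` instead of `fake_stream()`.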

(More details coming soon)