replicate/replicate-python-beta
Description: Python SDK for Replicate, generated by Stainless
Language: Python
License: Apache-2.0
Stars: 3
Forks: 2
Open issues: 2
Created: 2025-04-15T17:36:51Z
Pushed: 2026-05-12T19:08:15Z
Default branch: main
Fork: no
Archived: no
README:
Replicate Python API SDK (beta)
This is the repo for Replicate's official v2 Python SDK, which provides access to Replicate's HTTP API from any Python 3.8+ application.
⚠️ The v2 SDK is currently in public beta. Check out the release notes and leave feedback on the GitHub discussion.
🤔 Looking for the legacy v1 Python client? Find it here.
Docs
- v2 beta release notes: https://github.com/replicate/replicate-python-beta/releases/tag/v2.0.0-beta.1
- v2 beta migration guide: https://github.com/replicate/replicate-python-beta/blob/main/UPGRADING.md
- v2 beta SDK reference: https://sdks.replicate.com/python
- v2 beta GitHub discussion: https://github.com/replicate/replicate-python-beta/discussions/89
- HTTP API reference: https://replicate.com/docs/reference/http
Installation
The `replicate` package is available on PyPI. Install it with pip (using the --pre flag to get the latest beta version):
pip install --pre replicate
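To confirm which release pip actually installed, the version can be read back from package metadata. This is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). `installed_version` is a hypothetical helper name, not part of the SDK.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("replicate")
if version is None:
    print("replicate is not installed")
else:
    # A v2 beta is expected to report a version beginning with "2."
    print(f"replicate {version} is installed")
```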
Usage
Start by getting a Replicate API token, then set it as REPLICATE_API_TOKEN in your environment:
export REPLICATE_API_TOKEN="r8_..."
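Before calling any models, it can help to check that the variable is actually visible to the Python process. The sketch below uses only the standard library. `require_token` is a hypothetical helper, not an SDK function, and the demo passes a placeholder mapping instead of reading the real environment.

```python
import os


def require_token(env=None):
    """Return REPLICATE_API_TOKEN from `env` (default: os.environ) or raise."""
    env = os.environ if env is None else env
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("REPLICATE_API_TOKEN is not set; export it first")
    return token


# Demo with a placeholder mapping; call require_token() with no arguments
# to check the real process environment instead.
token = require_token({"REPLICATE_API_TOKEN": "r8_example"})
print(token[:3] + "...")  # avoid printing the full secret
```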
Then in your Python code, import the library and use it:
import replicate
claude = replicate.use("anthropic/claude-4.5-sonnet")
seedream = replicate.use("bytedance/seedream-4")
veo = replicate.use("google/veo-3-fast")
# Enhance a simple prompt
image_prompt = claude(
    prompt="bananas wearing cowboy hats", system_prompt="turn prompts into image prompts"
)
# Generate an image from the enhanced prompt
images = seedream(prompt=image_prompt)
# Generate a video from the image
video = veo(prompt="dancing bananas", image_input=images[0])
open(video)
Initialization and authentication
The library uses the REPLICATE_API_TOKEN environment variable by default to implicitly initialize a client, but you can also initialize a client explicitly and set the bearer_token parameter:
import os
from replicate import Replicate
client = Replicate(bearer_token=os.environ.get("REPLICATE_API_TOKEN"))
Using replicate.use()
The use() method lets you call Replicate models as ordinary Python functions, which is a more concise and Pythonic way to run them:
import replicate
# Create a model function
flux_dev = replicate.use("black-forest-labs/flux-dev")
# Call it like a regular Python function
outputs = flux_dev(
    prompt="a cat wearing a wizard hat, digital art",
    num_outputs=1,
    aspect_ratio="1:1",
    output_format="webp",
)
# outputs is a list of URLPath objects that auto-download when accessed
for output in outputs:
    print(output)  # e.g., Path(/tmp/a1b2c3/output.webp)
Language models with streaming
Many models, particularly language models, support streaming output. Use the streaming=True parameter to get results as they're generated:
import replicate
# Create a streaming language model function
claude = replicate.use("anthropic/claude-4.5-sonnet", streaming=True)
# Stream the output
output = claude(prompt="Write a haiku about Python programming")
for chunk in output:
    print(chunk, end="", flush=True)
Chaining models
You can easily chain models together by passing the output of one model as input to another:
import replicate
# Create two model functions
flux_dev = replicate.use("black-forest-labs/flux-dev")
claude = replicate.use("anthropic/claude-4.5-sonnet")
# Generate an image
images = flux_dev(prompt="a mysterious ancient artifact")
# Describe the image
description = claude(
    prompt="Describe this image in detail",
    image=images[0],  # Pass the first image directly
)
print(description)
Async support
For async/await patterns, use the use_async=True parameter:
import asyncio
import replicate
async def main():
    # Create an async model function
    flux_dev = replicate.use("black-forest-labs/flux-dev", use_async=True)
    # Await the result
    outputs = await flux_dev(prompt="futuristic city at sunset")
    for output in outputs:
        print(output)
asyncio.run(main())
Accessing URLs without downloading
If you need the URL without downloading the file, use the get_path_url() helper:
import replicate
from replicate.lib._predictions_use import get_path_url
flux_dev = replicate.use("black-forest-labs/flux-dev")
outputs = flux_dev(prompt="a serene landscape")
for output in outputs:
    url = get_path_url(output)
    print(f"URL: {url}")  # https://replicate.delivery/...
Creating predictions without waiting
To create a prediction without waiting for it to complete, use the create() method:
import replicate
claude = replicate.use("anthropic/claude-4.5-sonnet")
# Start the prediction
run = claude.create(prompt="Explain quantum computing")
# Check logs while it's running
print(run.logs())
# Get the output when ready
result = run.output()
print(result)
Current limitations
- The use() method must be called at the module level (not inside functions or classes)
- Type hints are limited compared to the standard client interface
Run a model
You can run a model synchronously using replicate.run():
import replicate
output = replicate.run(
    "black-forest-labs/flux-schnell", input={"prompt": "astronaut riding a rocket like a horse"}
)
print(output)
The run() method is a convenience function that creates a prediction, waits for it to complete, and returns the output. If you want more control over the prediction process, you can use the lower-level API methods.
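The create, wait, return sequence described above can be pictured as a simple polling loop. The sketch below is not the SDK's implementation. `FakePrediction`, `run_and_wait`, and `make_fake_backend` are hypothetical names, and the status strings are assumed to follow the prediction lifecycle documented for the HTTP API. A fake in-memory backend stands in for the network so the flow runs without a token.

```python
import time
from dataclasses import dataclass
from typing import Any, Optional

# Assumed terminal states for a prediction (illustrative).
TERMINAL = {"succeeded", "failed", "canceled"}


@dataclass
class FakePrediction:
    """Hypothetical stand-in for a prediction record; not an SDK class."""
    status: str = "starting"
    output: Any = None
    error: Optional[str] = None


def run_and_wait(create, reload, poll_interval=0.5, max_polls=120):
    """Create a prediction, poll until it reaches a terminal state, return its output."""
    prediction = create()                      # step 1: create
    polls = 0
    while prediction.status not in TERMINAL:   # step 2: wait
        if polls >= max_polls:
            raise TimeoutError("prediction did not finish in time")
        time.sleep(poll_interval)
        prediction = reload(prediction)
        polls += 1
    if prediction.status != "succeeded":
        raise RuntimeError(f"prediction {prediction.status}: {prediction.error}")
    return prediction.output                   # step 3: return the output


def make_fake_backend(steps_until_done=2, output="done"):
    """Build create/reload callables that succeed after a fixed number of reloads."""
    state = {"remaining": steps_until_done}

    def create():
        return FakePrediction(status="starting")

    def reload(prediction):
        state["remaining"] -= 1
        if state["remaining"] <= 0:
            return FakePrediction(status="succeeded", output=output)
        return FakePrediction(status="processing")

    return create, reload


create, reload = make_fake_backend(steps_until_done=2, output=["image.webp"])
result = run_and_wait(create, reload, poll_interval=0)
print(result)
```

Passing `create` and `reload` as callables keeps the loop independent of any particular client, so the same shape applies whether the prediction is fetched through the lower-level API methods or a test double.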
Handling errors
replicate.run() raises ModelError if the prediction fails. You can catch this exception to handle errors gracefully:
import replicate
from replicate.exceptions import ModelError
try:
    output = replicate.run(
        "stability-ai/stable-diffusion-3", input={"prompt": "An astronaut riding a rainbow unicorn"}
    )
except ModelError as e:
    print(f"Prediction failed:...
Excerpt shown; open the source for the full document.
Notability: 1.0/10 (low stars, minor beta release)