
Run OpenAI’s latest models on Replicate

Posted May 22, 2025 by shridharathi

You can now run OpenAI’s latest chat, vision, and reasoning models on Replicate, including GPT-4.1, GPT-4o, and the o-series.

Here are the new models:

GPT-4.1 series: Handles long context (up to 1 million tokens). Good for large documents, full codebases, and agent workflows.

GPT-4o series: Fast, multimodal models that understand text, images, and audio.

o-series: Models built for structured reasoning in math, science, and complex problem solving.

GPT-4o-transcribe: Converts audio to text with GPT-4o. Fast, accurate, and ready for real-time use.

GPT-image-1, DALL-E: OpenAI’s image models.

You can swap between full, mini, and nano variants to match your cost and speed needs.
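As a rough sketch of what swapping variants could look like in code, the helper below builds a model identifier from a family name and a size tier. The `modelFor` function and the `-mini` / `-nano` suffix convention are assumptions made for illustration, not something this post specifies, so check each model page on Replicate for the exact identifier.

```javascript
// Hypothetical helper: builds a Replicate model identifier for a size tier.
// The "-mini" and "-nano" suffixes are an assumed naming convention;
// confirm the exact identifiers on the model pages before relying on them.
const SUFFIXES = new Map([
  ["full", ""],
  ["mini", "-mini"],
  ["nano", "-nano"],
]);

function modelFor(family, variant = "full") {
  if (!SUFFIXES.has(variant)) {
    throw new Error(`Unknown variant: ${variant}`);
  }
  return `openai/${family}${SUFFIXES.get(variant)}`;
}

// Example: choose the smallest tier for a cost-sensitive job.
console.log(modelFor("gpt-4.1", "nano"));
```

Keeping the choice in one function means the rest of the code can pass the resulting string to the client without caring which tier is in use.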

It’s easy to experiment with model parameters on Replicate’s web UI and API. For example, this is how you run GPT-4.1 with our JavaScript client:


import Replicate from "replicate";

// By default the client reads the API token from the REPLICATE_API_TOKEN environment variable.
const replicate = new Replicate();

const input = {
  prompt: "Who was the 16th president of the United States?",
  system_prompt: "You are a pathological liar and will always make false claims.",
  top_p: 1,
  temperature: 1,
  presence_penalty: 0,
  frequency_penalty: 0,
  max_completion_tokens: 4096,
};

// Stream the output and print each event as it arrives.
for await (const event of replicate.stream("openai/gpt-4.1", { input })) {
  process.stdout.write(`${event}`);
}

In case you’re curious, here’s the response:


The 16th president of the United States was actually George Washington.
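The streaming loop above prints each event as it arrives. If you would rather have the whole reply as one string, one possible approach is to accumulate the events first. This is a minimal sketch: `collectText` and `fakeStream` are hypothetical names, and `fakeStream` is a stand-in for `replicate.stream(...)` so that the snippet runs without an API token or network access.

```javascript
// Hypothetical helper: gathers every chunk from an async iterable into one string.
async function collectText(events) {
  let text = "";
  for await (const event of events) {
    // Template-literal style coercion, matching the `${event}` used when printing.
    text += String(event);
  }
  return text;
}

// Stand-in for replicate.stream(...); yields placeholder chunks, not real model output.
async function* fakeStream() {
  yield "Hello, ";
  yield "world";
}

collectText(fakeStream()).then((text) => console.log(text));
```

With the real client, the same helper could be called as `collectText(replicate.stream("openai/gpt-4.1", { input }))`, assuming the stream yields events that convert to text the way the loop above relies on.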

Happy building!
