Model · Microsoft · published May 15, 2026

microsoft/Lens-Turbo

task: text-to-image · license: MIT · library: diffusers · downloads: 2.4k · likes: 141

---

Lens is a 3.8B-parameter foundational text-to-image model designed for efficient training and fast high-resolution generation. It combines dense-caption pre-training, mixed-resolution learning, GPT-OSS multi-layer text features, and the FLUX.2 semantic VAE to reach competitive quality with substantially less training compute than larger T2I models.

This repository provides the minimal inference code for generating images from Lens DiT checkpoints.

Highlights

  • Efficient Foundation — Trained on Lens-800M, an 800M image-text corpus with long GPT-4.1 captions, maximizing information density per training batch.
  • Compact & Expressive — A 48-block MMDiT denoiser leverages FLUX.2 latents and concatenated multi-layer GPT-OSS features for stronger prompt following and multilingual generalization.
  • Flexible Resolution — Mixed-resolution training enables inference across aspect ratios from 1:2 to 2:1 and resolutions up to 1440×1440.
  • Post-trained Variants — RL tuning improves visual quality and artifact suppression; the distilled Lens-Turbo supports fast 4-step generation.
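As a rough illustration of the Flexible Resolution point, the sketch below picks an output size for a requested aspect ratio. It is not taken from the Lens inference code. It assumes that "up to 1440×1440" is a pixel budget rather than a per-side limit (the gallery lists 1248x1664 and 1664x1248 samples), and that sides snap to multiples of 32, which is inferred from the gallery sizes and not stated by the card. The function name `pick_resolution` is hypothetical.

```python
import math

# From the card: aspect ratios from 1:2 to 2:1, resolutions up to 1440x1440.
MIN_ASPECT, MAX_ASPECT = 0.5, 2.0
PIXEL_BUDGET = 1440 * 1440

# Assumption (not stated by the card): sides are multiples of 32,
# inferred from the gallery sizes 1440, 1248 and 1664.
SIDE_MULTIPLE = 32


def pick_resolution(aspect_w, aspect_h,
                    pixel_budget=PIXEL_BUDGET, multiple=SIDE_MULTIPLE):
    """Return (width, height) near the pixel budget for the given aspect ratio."""
    aspect = aspect_w / aspect_h
    if not (MIN_ASPECT <= aspect <= MAX_ASPECT):
        raise ValueError(
            f"aspect ratio {aspect_w}:{aspect_h} is outside the 1:2 to 2:1 range"
        )

    # Solve width * height = pixel_budget with width / height = aspect.
    height = math.sqrt(pixel_budget / aspect)
    width = height * aspect

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    for ratio in [(1, 1), (3, 4), (4, 3)]:
        print(ratio, pick_resolution(*ratio))
```

Under these assumptions, `pick_resolution(1, 1)` returns `(1440, 1440)` and `pick_resolution(3, 4)` returns `(1248, 1664)`, which match the sizes listed in the gallery. Whether the model accepts other sizes inside the stated range is something to confirm against the repository's inference code.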

Gallery

Sample 000 · 1440x1440 A generous portion of classic British fish and chips served on a sheet of white paper, golden crispy beer-battered cod fillet alongside thick-cut chips, a wedge of lemon, mushy peas in a small dish, malt vinegar bottle nearby, wooden pub table, overhead shot

Sample 001 · 1440x1440 The iconic Big Ben clock tower and the Houses of Parliament in London at golden hour, the River Thames reflecting warm amber light, Westminster Bridge in the foreground, a classic red double-decker bus crossing, dramatic clouds lit by sunset

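Sample 002 · 1440x1440 La Tour Eiffel au crépuscule vue depuis le Trocadéro, la structure en fer illuminée de milliers de lumières dorées scintillantes, le ciel passant du bleu profond au violet, les fontaines du Trocadéro au premier plan avec des reflets dorés, silhouettes de promeneurs (French prompt; in English: The Eiffel Tower at dusk seen from the Trocadéro, the iron structure lit by thousands of twinkling golden lights, the sky shifting from deep blue to violet, the Trocadéro fountains in the foreground with golden reflections, silhouettes of strollers)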

Sample 003 · 1248x1664 A crystal dragon soaring through an aurora borealis sky, its entire body made of transparent faceted crystal refracting the green and purple aurora light into rainbow spectra, ice particles trailing from its wings, high fantasy digital art

Sample 004 · 1664x1248 Aerial view of Yuanyang rice terraces in Yunnan province at sunrise, thousands of cascading water-filled paddies reflecting golden and pink sky colors, morning mist weaving between terrace layers, lush green hillside with scattered palm trees, drone photography

Sample 005 · 1664x1248 A green iguana basking on a moss-covered fallen log in a tropical rainforest, every scale and spine rendered in sharp detail, dewdrops clinging to its skin, a blurred waterfall and lush tropical foliage in the background, National Geographic wildlife photography style

Sample 006 · 1248x1664 Oil painting portrait of a Renaissance noblewoman in a deep blue velvet dress with pearl drop earrings, soft chiaroscuro lighting revealing delicate skin, craquelure texture on the painted surface, in the style of Vermeer

Sample 007 · 1440x1440 An artisan honey jar with a hand-illustrated vintage botanical label reading "Mountain Wildflower Honey" in brown serif letterpress-style typography with decorative flourishes, detailed ink drawings of wildflowers, clover and honeybees surrounding the text, kraft paper label on clear glass jar

Sample 008 · 1440x1440 Watercolor portrait of a thoughtful young man reading a worn leather book in a Parisian cafe, loose wet-on-wet brushstrokes bleeding into warm amber and burnt sienna washes, visible paper grain texture

Sample 009 · 1664x1248 An explorer's oak desk with an aged world map spread open, a brass sextant, leather-bound navigation journal with handwritten entries, melting candle in a copper holder, scattered compass and quill pen, warm window light, still life photography

Sample 010 · 1664x1248 New York Grand Central Terminal subway station with the classic station name "GRAND CENTRAL" spelled out in elegant white ceramic mosaic tile letters embedded in a dark green tile wall, each letter approximately eight inches tall, ornate tile border frames, the S-curve of train tracks visible

Sample 011 · 1664x1248 A ruby-throated hummingbird hovering in front of a bright red heliconia flower, wings frozen in a figure-eight pattern showing iridescent feather detail, individual water droplets suspended around the bird, high-speed macro photography with dark background

Sample 012 · 1664x1248 An old Remington typewriter with a sheet of cream-colored paper rolled into the carriage, the typed words "Chapter One: The Beginning" visible in slightly uneven Courier typeface with characteristic ink density variations, some letters slightly misaligned, warm desk lamp lighting

Sample 013 · 1664x1248 The Great Wildebeest Migration crossing the Mara River at golden hour, hundreds of animals plunging into churning water sending spray everywhere, dust clouds rising from the riverbank, dramatic backlit scene, National Geographic documentary style

Sample 014 · 1248x1664 A charming flower shop storefront window with hand-painted white script lettering on the glass reading "Fresh Flowers Daily" in flowing connected cursive with decorative swashes, roses and peonies arranged in buckets visible through the lettering, morning sunlight catching the painted letters

Sample 015 · 1248x1664 A steampunk floating sky-city built on massive gear-driven platforms, brass and copper towers connected by chain bridges, steam-powered airships and hot air balloons docking at various levels, sunset clouds below the city, detailed concept art

Sample 016 · 1664x1248 Milford Sound in New Zealand at dawn, a perfect mirror reflection of steep fjord walls on glass-still water, waterfalls streaming down thousand-foot cliffs, morning mist hovering above the water surface, panoramic landscape photography

Sample 017 · 1248x1664 An Indian Bharatanatyam classical dancer in the aramandi pose, bronze ankle bells and elaborate hand mudra gestures, rich silk costume with gold temple jewelry, captured mid-performance with dramatic stage lighting…
