What does this writing signal mean?

Microsoft published mimalloc: A new, high-performance, scalable memory allocator for the modern era. This talking signal gives public context for research themes, product direction, policy, or launch framing. High-signal details: Solid performance allocator release with moderate HN traction. · mimalloc: A new, high-performance, scalable memory allocator for the modern era - Microsoft Research Skip to main content Research Publications Code & data People.... onlylabs links this event to 1 captured evidence page and 6 related writing signals. It also maps to Data demand in the data-business radar.

Microsoft Writing: mimalloc: A new, high-performance, scalable memory allocator for the modern era

Captured source

source ↗

microsoft.com/microsoft.com/en-us/research

mimalloc: A new, high-performance, scalable memory allocator for the modern era

Source ↗

published May 13, 2026seen Jun 5captured Jun 7http 200method plain

mimalloc: A new, high-performance, scalable memory allocator for the modern era - Microsoft Research

Research

Publications Code & data People Microsoft Research blog

Artificial intelligence Audio & acoustics Computer vision Graphics & multimedia Human-computer interaction Human language technologies Search & information retrieval

Data platforms and analytics Hardware & devices Programming languages & software engineering Quantum computing Security, privacy & cryptography Systems & networking

Algorithms Mathematics

Ecology & environment Economics Medical, health & genomics Social sciences Technology for emerging markets

Academic programs Events & academic conferences Microsoft Research Forum

Behind the Tech podcast Microsoft Research blog Microsoft Research Forum Microsoft Research podcast

About Microsoft Research Careers & internships People Emeritus program News & awards Microsoft Research newsletter

Africa AI for Science AI Frontiers Asia-Pacific Cambridge Health Futures India Montreal New England New York City Redmond

Applied Sciences Mixed Reality & AI - Cambridge Mixed Reality & AI - Zurich

Microsoft Security Azure Dynamics 365 Microsoft 365 Microsoft Teams Windows 365

Microsoft AI Azure Space Mixed reality Microsoft HoloLens Microsoft Viva Quantum computing Sustainability

Education Automotive Financial services Government Healthcare Manufacturing Retail

Find a partner Become a partner Partner Network Microsoft Marketplace Software companies

Blog Microsoft Advertising Developer Center Documentation Events Licensing Microsoft Learn Microsoft Research

View Sitemap

Return to Blog Home Microsoft Research Blog

At a glance

Today’s critical services and applications are often highly concurrent, using hundreds of threads. They also operate at large memory scales, frequently hundreds of gigabytes, especially when using large language models.

mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free. It is relatively small (~12K lines), with clear internal data structures, and is easy to build and integrate into other projects. It provides bounded worst-case allocation times (up to OS primitives), bounded space overhead, low internal fragmentation, and minimal contention by relying almost exclusively on atomic operations.

mimalloc is available on GitHub (opens in new tab) and has over 12K stars.

mimalloc

At the RiSE group at Microsoft Research (MSR) , we conduct fundamental research into formal methods, programming languages, and software engineering (including emerging agentic systems), with a particular focus on systems that can be provably correct, secure, and performant. The mimalloc memory allocator was initially designed in 2020 as a fast allocator for the state-of-the-art Lean (opens in new tab) and Koka (opens in new tab) programming languages developed at RiSE, both of which use novel compiler-guided reference counting (see Perceus ).

The scalable design of mimalloc has also proved to work exceedingly well for large services at Microsoft. Through close cooperation with product teams, mimalloc has significantly improved the response times in services such as Bing. Today, mimalloc is widely used in large services and applications, both within and outside Microsoft. It serves as the allocator for NoGIL CPython 3.13+, is integrated into Unreal Engine, and is used in games such as Death Stranding . The project is open source on GitHub, with over 12K stars its Rust wrapper alone sees over 100K downloads per day.

mimalloc is effective across a wide range of scenarios; from small-scale applications like Koka or Lean, to large services with memory footprints exceeding 500 GiB and hundreds of threads.

Despite this range, the codebase is still compact, at around 12K lines of C. Reflecting its research origins, mimalloc emphasizes clear internal data structures with strong invariants, making it easier to understand and reason about than many industry allocators. As Fred Brooks already remarked in his famous book The Mythical Man-Month : “ S how me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t need your flowchart; it’ll be obvious.”

As a result, mimalloc has been ported to many platforms—Windows, macOS, Linux, FreeBSD, NetBSD, DragonFly, and various consoles—, and is easy to build and integrate into other projects. For example, the clear data structures enabled Sam Gross and others to adopt mimalloc as the concurrent allocator for NoGIL CPython. The design also makes it relatively straightforward to implement cyclic garbage collection on top of this.

The Fast Path

As with other scalable allocators (such as tcmalloc and jemalloc), a core design principle of mimalloc is that each thread maintains its own thread-local heap, which we call a “theap”. Each theap owns a set of mimalloc “pages,” which are usually 64 KiB. Each mimalloc page contains blocks of a fixed size, organized into size classes to reduce internal fragmentation. By giving each thread its own theap and set of mimalloc pages, memory allocation and deallocation typically proceed without synchronization. Atomic operations are only required when a thread frees a block allocated by another thread.

Moreover, in practice, most allocations are quite small, often less than 1 KiB. For such small allocations, mimalloc provides a fast path where the main allocation function looks like:

void * mi_malloc( size_t size ) { mi_theap_t * const theap = mi_get_thread_local_theap(); if (size > MI_MAX_SMALL_SIZE ) return mi_malloc_generic(theap,size); // slow generic path

const size_t index = (size + sizeof ( void *))/ sizeof ( void *); // round size mi_page_t * const page = theap->small_pages[index];

mi_block_t * const block = page->free; // head of free list if (block == NULL ) return mi_malloc_generic(theap,size); // slow generic path

page->free = block->next; // pop free list page->used++; return block; }

By using thread-local theaps, we need no atomic operations or thread synchronization. We also try to minimize the number of branches. In particular, the thread-local theap is never NULL , and we initialize it with a special empty theap with all empty pages. This way, we do not need a separate check if the theap is NULL . Similarly, the pointers in the small_pages array are never NULL , and we use again special empty pages (with page-...

Excerpt shown — open the source for the full document.

Notability

notability 6.0/10

Solid performance allocator release with moderate HN traction.