Repo · Microsoft · published May 26, 2026 · seen 5d

microsoft/fabric-spark-benchmarks

Python

Captured source

Description: Location for Fabric Spark benchmarks

Language: Python

License: MIT

Stars: 0

Forks: 0

Open issues: 0

Created: 2026-05-26T21:48:47Z

Pushed: 2026-05-27T19:01:37Z

Default branch: main

Fork: no

Archived: no

README:

Fabric Spark Benchmarks

Reproducible benchmarks for Microsoft Fabric Spark workloads. Each benchmark is a self-contained script with documentation, methodology, and instructions to reproduce published results.

Benchmarks

| Benchmark | Description | Blog Link |
|---|---|---|
| [Incremental Clustering](benchmarks/liquid-clustering/incremental-clustering/README.md) | Measures clustering performance across streaming, ETL, and analytics workloads with Incremental Liquid Clustering in Microsoft Fabric compared to the baseline Liquid Clustering algorithm from OSS delta-spark. | Incremental Liquid Clustering in Microsoft Fabric: Faster, smarter, and truly incremental |

Quick start

1. Clone this repo.
2. Navigate to a benchmark directory (for example, benchmarks/liquid-clustering/incremental-clustering/).
3. Follow the benchmark's README.md for prerequisites and run instructions.

Each benchmark can be run via a Spark Job or imported into a Fabric notebook.

Repository structure

benchmarks/
├── liquid-clustering/ # Benchmark category
│ ├── incremental-clustering/ # One directory per benchmark
│ │ ├── README.md # Methodology, parameters, how to reproduce
│ │ ├── *.py # Runnable benchmark script
│ │ └── notebooks/ # Optional Fabric notebook exports
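
The layout above can be walked programmatically, for example to list which benchmarks are available in a local clone. The following is a minimal sketch, not part of the repository: the function name `find_benchmarks` is hypothetical, and it assumes every benchmark sits at `benchmarks/<category>/<benchmark>/` with a `README.md` and its scripts directly in that directory.

```python
from pathlib import Path


def find_benchmarks(root):
    """Return a list of (category, benchmark, scripts) tuples.

    Hypothetical helper. Assumes the layout <root>/<category>/<benchmark>/
    where each benchmark directory holds a README.md and its *.py scripts.
    """
    found = []
    # A directory counts as a benchmark only if it contains a README.md.
    for readme in sorted(Path(root).glob("*/*/README.md")):
        bench_dir = readme.parent
        # Collect only top-level scripts; notebooks/ is not searched.
        scripts = sorted(p.name for p in bench_dir.glob("*.py"))
        found.append((bench_dir.parent.name, bench_dir.name, scripts))
    return found
```

Calling `find_benchmarks("benchmarks")` from the repository root should return one tuple per benchmark directory, with the category name, the benchmark name, and the script file names found there.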

License

[MIT](LICENSE) — Copyright (c) Microsoft Corporation.

Notability

notability 3.0/10

Routine repo from Microsoft