Snowflake-Labs/snowflake-feature-store-online-benchmark-kit
Description: Reproducible benchmarks for Snowflake Feature Store online serving latency.
Language: Python
License: Apache-2.0
Stars: 1
Forks: 1
Open issues: 0
Created: 2026-05-04T18:57:02Z
Pushed: 2026-05-27T18:19:00Z
Default branch: main
Fork: no
Archived: no
README:
Snowflake Feature Store — Online Serving Benchmark Kit
Reproducible benchmarks for Snowflake Feature Store online serving performance.
Summary Results
> Disclaimer: Actual results may vary based on your workload, configuration, and region; comparisons are for illustration only and do not guarantee performance in any specific environment.
Latency Benchmarks
Postgres-Backed (100K rows, 5 features, HIGHMEM_X64_S pool, single-threaded)
| p50 | p90 | p99 |
|-----|-----|-----|
| 10.9ms | 12.5ms | 16.3ms |
Hybrid Table-Backed (100K rows, 10 features, SL pool, 8 threads)
| p50 | p95 | p99 |
|-----|-----|-----|
| 47.34ms | 72.79ms | 93.04ms |
For detailed analysis, see [RESULTS.md](RESULTS.md).
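As a rough illustration of how percentile figures like those above can be derived from raw per-request latencies, here is a minimal sketch using the nearest-rank method. The function names `percentile` and `summarize` are hypothetical, and the kit's own scripts may compute percentiles differently (for example with interpolation).

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty sequence of latencies.

    `pct` is an integer in 1..100. This is an illustrative helper,
    not code taken from the benchmark kit.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if not (isinstance(pct, int) and 1 <= pct <= 100):
        raise ValueError("pct must be an integer in 1..100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(latencies_ms, pcts=(50, 90, 95, 99)):
    """Return a dict such as {'p50': ..., 'p90': ...} for the given samples."""
    return {f"p{p}": percentile(latencies_ms, p) for p in pcts}
```

Calling `summarize(latencies_ms)` on a list of millisecond latencies returns one value per requested percentile, which makes it easy to compare runs that report p90 with runs that report p95.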
Snowflake vs Databricks Comparison
Under sustained load, Snowflake Feature Store online serving delivers 2.5x lower latency and 7x higher QPS compared to Databricks Feature Serving.

Both were benchmarked on comparable instances (CU_2 Databricks, XS tier Snowflake). The Databricks endpoint drops requests beyond 200 QPS, while Snowflake continues to deliver sub-20ms latencies up to 1500 QPS with zero failures.
Repo Overview
Three benchmark suites cover latency profiling and throughput load testing:
| Suite | Directory | What It Measures | Backend | Status |
|-------|-----------|------------------|---------|--------|
| Hybrid Table Latency | latency_hybrid_table/ | Per-request latency (p50/p95/p99) | Hybrid Table | Generally Available |
| Postgres Latency | latency_postgres/ | Per-request latency (p50/p90/p99) | Postgres Online Service | Public Preview |
| Throughput Load Test | throughput_load_test/ | QPS scaling, batch size, feature width, mixed workloads | Snowflake vs Databricks | Cross-platform |
The latency suites run as headless SPCS ML Jobs via snowflake.ml.jobs.submit_directory() for the lowest and most consistent per-request numbers.
The throughput load test uses Locust to drive sustained concurrent load, measuring how latency degrades as QPS, batch size, or feature width increases. It includes Databricks Feature Serving for comparison.
Prerequisites
> The following prerequisites, quick starts, best practices, and troubleshooting sections apply to the latency benchmarks (latency_hybrid_table/ and latency_postgres/). For the throughput load test, see [throughput_load_test/README.md](throughput_load_test/README.md).
- Python 3.9+
- `"snowflake-ml-python==1.37.0"`
- A Programmatic Access Token (PAT)
- A named Snowflake connection in `~/.snowflake/config.toml`
- Snowflake account with Feature Store and SPCS support
Additional prerequisites for Postgres benchmarks
- Access to a Snowflake account with the Postgres Online Service Private Preview enabled
- Online Service status: `RUNNING` (provisioned via Feature Store API)
- User-level network policy allowing SPCS egress IPs for PAT auth (see [Troubleshooting](#troubleshooting))
- EAI for `*.snowflakecomputing.app` egress (for direct REST endpoint access)
Quick Start: Hybrid Table Benchmarks
1. Install and configure
pip install "snowflake-ml-python==1.37.0"
Add a connection to ~/.snowflake/config.toml:
[connections.mybench]
account = "YOUR_ACCOUNT"
user = "YOUR_USER"
authenticator = "PROGRAMMATIC_ACCESS_TOKEN"
token = "YOUR_PAT"
warehouse = "FS_BENCHMARK_WH"
database = "FS_BENCHMARK_DB"
role = "ACCOUNTADMIN"
2. Set environment variables
export SNOWFLAKE_CONNECTION_NAME=mybench
export SNOWFLAKE_PAT=''
export SNOWFLAKE_USER=''
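Since the exports above ship with empty placeholder values, a quick pre-flight check can catch a forgotten variable before a job is submitted. The sketch below is a hypothetical helper (`missing_env_vars` is not part of the kit) that treats empty or whitespace-only values as missing.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("SNOWFLAKE_CONNECTION_NAME", "SNOWFLAKE_PAT", "SNOWFLAKE_USER")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Set these variables first: " + ", ".join(missing))
    print("Environment looks complete.")
```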
3. Provision infrastructure
python latency_hybrid_table/setup_env.py
This creates (idempotently):
- Compute pool: `FS_BENCH_JOB_POOL` (CPU_X64_SL, 1 node, auto-suspend 30 min)
- EAIs: `FS_BENCH_JOB_EAI` (PyPI egress for runtime pip install) + `FS_BENCH_JOB_SVC_EAI`
- Stages: `FS_BENCH_JOB_PAYLOAD` (job submission) + `FS_BENCH_JOB_RESULTS` (raw JSON)
- Results table: `FS_BENCH_JOB_RESULTS_TBL`
- Verification: Confirms `CUSTOMER_FEATURES/v1` online table is queryable
> Note: This assumes the Feature View `CUSTOMER_FEATURES/v1` with `OnlineConfig(enable=True)` already exists. If not, create it first using the Snowflake ML Feature Store API.
4. Run benchmarks
# Direct SQL benchmark (8 threads, 600s warmup, 300s measurement)
python latency_hybrid_table/submit_job_direct_sql.py --wait --logs
# SDK benchmark (8 threads, 600s warmup, 300s measurement)
python latency_hybrid_table/submit_job_sdk.py --wait --logs
Jobs typically take ~15 minutes (600s warmup + 300s measurement).
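To make the warmup-then-measure structure concrete, here is a minimal sketch of a multi-threaded harness that discards latencies recorded during warmup and keeps only those from the measurement window. It is an assumption about the general shape of such a benchmark, not the kit's actual implementation; `run_benchmark` and `request_fn` are hypothetical names, and `request_fn` stands in for whatever performs a single online lookup.

```python
import threading
import time


def run_benchmark(request_fn, threads=8, warmup_s=600.0, measure_s=300.0):
    """Call request_fn in a loop from `threads` workers.

    Requests that start during the warmup window are executed but not
    recorded; requests that start afterwards contribute a latency in ms.
    """
    start = time.perf_counter()
    warmup_end = start + warmup_s
    stop_at = warmup_end + measure_s
    latencies_ms = []
    lock = threading.Lock()

    def worker():
        local = []  # per-thread buffer keeps lock contention out of the hot loop
        while True:
            t0 = time.perf_counter()
            if t0 >= stop_at:
                break
            request_fn()
            t1 = time.perf_counter()
            if t0 >= warmup_end:
                local.append((t1 - t0) * 1000.0)
        with lock:
            latencies_ms.extend(local)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return latencies_ms
```

With the defaults, the call blocks for roughly 900 seconds, which matches the ~15 minute job duration noted above. Passing much smaller `warmup_s` and `measure_s` values is convenient for a local dry run.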
5. Query results
SELECT * FROM FS_BENCHMARK_DB.FS_BENCHMARK_SCHEMA.FS_BENCH_JOB_RESULTS_TBL ORDER BY TS DESC;
Raw latency arrays are saved as gzipped JSON to @FS_BENCH_JOB_RESULTS.
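Once a result file has been downloaded from the stage (for example with a `GET` command), it can be inspected locally with the standard library. The sketch below assumes a layout that the README does not specify: either a bare JSON array of latencies or a JSON object holding the array under a key. The key name `latencies_ms` and the function `load_latencies` are hypothetical, so adjust them to the actual file contents.

```python
import gzip
import json


def load_latencies(path, key="latencies_ms"):
    """Load a gzipped JSON file and return its latencies as floats.

    Accepts either a top-level JSON array or an object containing the
    array under `key` (an assumed name, not confirmed by the kit).
    """
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        payload = json.load(fh)
    values = payload[key] if isinstance(payload, dict) else payload
    return [float(v) for v in values]
```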
Quick Start: Postgres Online Service Benchmarks
1. Set environment variables
export SNOWFLAKE_PAT=''
export SNOWFLAKE_USER=''
The scripts use the vnextqa6 named connection in ~/.snowflake/connections.toml.
2. Provision infrastructure (one-time)
python latency_postgres/setup_env.py
This creates:
- Compute pool: `FS_LAT_POOL` (HIGHMEM_X64_S, 1 node, auto-suspend 1 hr)
- EAIs: `FS_LAT_PYPI_EAI` (PyPI egress) + `FS_LAT_ONLINE_SVC_EAI` (`*.snowflakecomputing.app` egress for REST)
- Stages: `JOB_PAYLOAD` + `BENCHMARK_RESULTS_STAGE`
- Source table: `BENCHMARK_USER_FEATURES_SOURCE` (100K rows, 5 float columns)
- Results table: `BENCHMARK_RESULTS`
- Feature View: `BENCHMARK_USER_FEATURES/V1` with Postgres online store (`OnlineConfig(enable=True, target_lag="10s", store_type=OnlineStoreType.POSTGRES)`)
- Verification: Waits for offline backfill, fires sanity online lookup
3. Submit benchmark jobs
SDK (`read_feature_view`):
python latency_postgres/submit_job_sdk.py --logs
REST (Direct HTTP/2 via `httpx`, no SDK):
python…
Excerpt shown — open the source for the full document.