ForkSnowflake (Arctic)Snowflake (Arctic)published Jun 22, 2026seen 4d

Snowflake-Labs/snowflake-snowpark-data-sources

forked from sfc-gh-qding/snowflake-snowpark-data-sources

Open original ↗

Captured source

source ↗

Snowflake-Labs/snowflake-snowpark-data-sources

Description: Pip-installable DB-API 2.0 wrappers for Snowpark session.read.dbapi (staging for Snowflake-Labs)

License: Apache-2.0

Stars: 0

Forks: 0

Open issues: 0

Created: 2026-06-22T05:08:45Z

Pushed: 2026-05-27T06:01:09Z

Default branch: main

Fork: yes

Parent repository: sfc-gh-qding/snowflake-snowpark-data-sources

Archived: no

README:

snowflake-snowpark-data-sources

DB-API 2.0 wrappers that let Snowpark's session.read.dbapi(...) reader ingest from non-DB-API data sources in parallel.

> Status: Experimental. Staged under a personal account pending move > to `Snowflake-Labs` after OSS review.

How it works

Snowpark's `session.read.dbapi()` can ingest from any PEP 249 compliant Python DB-API. Many time-series, graph, and document databases ship Python clients that are *not* DB-API compliant. Each subpackage here is a thin wrapper that exposes connect(), Connection, Cursor, and description in the shape Snowpark expects, so you get parallel, partition-aware ingestion into Snowflake without writing a custom reader.

Install

The package is namespaced under snowflake.* and ships each source behind an optional dependency. Install only the source you need:

# from VCS during the staging phase
pip install "snowflake-snowpark-data-sources[influxdb_v1] @ git+https://github.com/sfc-gh-qding/snowflake-snowpark-data-sources"

Sources

| Source | Extra | Import | Status | |---|---|---|---| | InfluxDB v1 | influxdb_v1 | from snowflake.snowpark_data_sources import influxdb | Experimental |

See [examples/influxdb/](examples/influxdb/) for a runnable script.

Quick start (InfluxDB)

from snowflake.snowpark import Session
from snowflake.snowpark_data_sources import influxdb as idb

def create_influx_connection():
return idb.connect(host="...", port=8086, username="...", password="...", database="my_db")

session = Session.builder.configs({...}).create()

schema = idb.get_influxdb_schema(create_influx_connection, "my_db.autogen.cpu_usage")

df = session.read.dbapi(
create_connection=create_influx_connection,
table="my_db.autogen.cpu_usage",
custom_schema=schema,
predicates=[
"time '2025-09-15T22:40:49Z'",
],
max_workers=4,
fetch_size=50000,
)

df.write.save_as_table("influx_cpu_usage", mode="overwrite")

Writing a new wrapper

To add support for another source, see [docs/how-to-write-a-wrapper.md](docs/how-to-write-a-wrapper.md) for the minimal DB-API surface area Snowpark needs and the four common gotchas.

Background

The pattern is documented in the Snowflake Data Engineering session DE226: Connect InfluxDB to Snowflake -- How Capella Space Uses Snowpark Data Source APIs.

License

Apache License 2.0 -- see [LICENSE](LICENSE).

Contributing

Contributions require signing the Snowflake CLA. Open issues and PRs against this repo; the cla-bot will guide you through signing on your first PR.

Excerpt shown — open the source for the full document.

Notability

notability 1.0/10

Routine fork by Snowflake