6 posts tagged with "data-connector"

Data connector tools and integrations

Spice v1.2.1 (May 6, 2025)

· 5 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.2.1! 🔥

Spice v1.2.1 includes several data connector fixes and improves query performance for accelerated views. This release also introduces Databricks Service Principal (M2M OAuth) authentication and expands parameterized queries.

Highlights in v1.2.1

  • Databricks Service Principal Support: Databricks datasets and catalogs now support Machine-to-Machine (M2M) OAuth authentication via Service Principals, enabling secure machine connections to Databricks.

Example spicepod.yaml:

datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: delta_lake
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

For details, see documentation for:

  • Databricks Data Connector

  • Databricks Unity Catalog Connector

  • Iceberg Data Connector: Now supports cross-account table access via the AWS Glue Catalog Connector and fixes an issue when querying data from append mode datasets.

  • Iceberg Catalog API: Full compatibility with the Iceberg HTTP REST Catalog API to consume Spice datasets from Iceberg Catalog clients.

For details, see documentation for:

  • Iceberg Data Connector

  • S3 Data Connector

  • Improved Parameterized Query Support: Expanded type inference for placeholders in:

    • IN list expressions
    • LIKE patterns
    • SIMILAR TO patterns
    • LIMIT clauses
    • Subqueries
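
As a rough illustration of where placeholders can appear, the sketch below runs parameterized queries with Python's built-in sqlite3 module. SQLite is only a stand-in here: it is not Spice, it uses ? placeholders, and it has no SIMILAR TO operator, so that case is left out. The orders table and its rows are made up for the example.

```python
import sqlite3

# Stand-in engine: Python's built-in SQLite, NOT Spice. It is used only to
# show the query positions in which a placeholder can appear.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 10.0), (2, "bob", 25.5), (3, "alina", 7.25), (4, "carol", 99.0)],
)

# Placeholders in an IN list.
in_list = conn.execute(
    "SELECT id FROM orders WHERE id IN (?, ?) ORDER BY id", (1, 3)
).fetchall()

# Placeholder used as a LIKE pattern.
like = conn.execute(
    "SELECT customer FROM orders WHERE customer LIKE ? ORDER BY id", ("al%",)
).fetchall()

# Placeholder in a LIMIT clause.
limited = conn.execute(
    "SELECT id FROM orders ORDER BY total DESC LIMIT ?", (2,)
).fetchall()

# Placeholder inside a subquery.
subquery = conn.execute(
    "SELECT id FROM orders "
    "WHERE total > (SELECT AVG(total) FROM orders WHERE id <> ?) "
    "ORDER BY id",
    (4,),
).fetchall()

print(in_list, like, limited, subquery)
```

In each case the engine has to work out the placeholder's type from its position in the query, which is the kind of inference this release expands.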

New Contributors 🎉

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New recipes for:

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.1 image:

docker pull spiceai/spiceai:1.2.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

  • No major dependency changes.

Changelog

- Fix: Specify metric type as a dimension for testoperator by [@peasee](https://github.com/peasee) in [#5630](https://github.com/spiceai/spiceai/pull/5630)
- Fix: Add option to run dispatch schedule by [@peasee](https://github.com/peasee) in [#5631](https://github.com/spiceai/spiceai/pull/5631)
- Infer placeholder datatype for InList, Like, and SimilarTo by [@kczimm](https://github.com/kczimm) in [#5626](https://github.com/spiceai/spiceai/pull/5626)
- Add QA analytics for 1.2.0 by [@phillipleblanc](https://github.com/phillipleblanc) in [#5640](https://github.com/spiceai/spiceai/pull/5640)
- Fix: Use SPICED_COMMIT for spiced_commit_sha by [@peasee](https://github.com/peasee) in [#5632](https://github.com/spiceai/spiceai/pull/5632)
- New crates/tools by [@Jeadie](https://github.com/Jeadie) in [#5121](https://github.com/spiceai/spiceai/pull/5121)
- Update openapi.json by [@github-actions](https://github.com/github-actions) in [#5643](https://github.com/spiceai/spiceai/pull/5643)
- Enable metrics reporting for models benchmarks (evals) by [@sgrebnov](https://github.com/sgrebnov) in [#5639](https://github.com/spiceai/spiceai/pull/5639)
- Implement CatalogBuilder, add app and runtime references to catalog component, add runtime reference to connector params by [@ewgenius](https://github.com/ewgenius) in [#5641](https://github.com/spiceai/spiceai/pull/5641)
- Fix eventing bug in LLM progress; Add tool and worker progress by [@Jeadie](https://github.com/Jeadie) in [#5619](https://github.com/spiceai/spiceai/pull/5619)
- Handle small precision differences in TPCH answer validation by [@phillipleblanc](https://github.com/phillipleblanc) in [#5642](https://github.com/spiceai/spiceai/pull/5642)
- Add TokenProviderRegistry to the runtime by [@ewgenius](https://github.com/ewgenius) in [#5651](https://github.com/spiceai/spiceai/pull/5651)
- Provide ModelContextLayer for evals by [@Jeadie](https://github.com/Jeadie) in [#5648](https://github.com/spiceai/spiceai/pull/5648)
- Databricks data_components refactor. Databricks Spark connect - add set_token method and writable spark session by [@ewgenius](https://github.com/ewgenius) in [#5654](https://github.com/spiceai/spiceai/pull/5654)
- Extract AWS Glue warehouse for cross-account Iceberg tables by [@phillipleblanc](https://github.com/phillipleblanc) in [#5656](https://github.com/spiceai/spiceai/pull/5656)
- Refactor Dataset component by [@phillipleblanc](https://github.com/phillipleblanc) in [#5660](https://github.com/spiceai/spiceai/pull/5660)
- Fix Iceberg API returning 404 when schema contains a Dictionary by [@phillipleblanc](https://github.com/phillipleblanc) in [#5665](https://github.com/spiceai/spiceai/pull/5665)
- Fix dependencies: downgrade swagger-ui to v8; force zip to 2.3.0 by [@kczimm](https://github.com/kczimm) in [#5664](https://github.com/spiceai/spiceai/pull/5664)
- Add DuckDB indexes spicepod, additional dispatches by [@peasee](https://github.com/peasee) in [#5633](https://github.com/spiceai/spiceai/pull/5633)
- Update readme: update data federation link by [@nuvic](https://github.com/nuvic) in [#5673](https://github.com/spiceai/spiceai/pull/5673)
- Support metadata columns for object-store based data connectors by [@phillipleblanc](https://github.com/phillipleblanc) in [#5661](https://github.com/spiceai/spiceai/pull/5661)
- Add model name to LLM judges, and add model_graded_scoring task by [@Jeadie](https://github.com/Jeadie) in [#5655](https://github.com/spiceai/spiceai/pull/5655)
- Add SF1000 TPCH test spicepods for delta lake by [@Sevenannn](https://github.com/Sevenannn) in [#5606](https://github.com/spiceai/spiceai/pull/5606)
- Validate Github Connector resource existence before building the github connector graphql table by [@Sevenannn](https://github.com/Sevenannn) in [#5674](https://github.com/spiceai/spiceai/pull/5674)
- Remove hard-coded embedding performance tests in CI by [@Sevenannn](https://github.com/Sevenannn) in [#5675](https://github.com/spiceai/spiceai/pull/5675)
- Databricks M2M auth for spark connect data connector by [@ewgenius](https://github.com/ewgenius) in [#5659](https://github.com/spiceai/spiceai/pull/5659)
- Enable federated data refresh support for accelerated views by [@sgrebnov](https://github.com/sgrebnov) in [#5677](https://github.com/spiceai/spiceai/pull/5677)
- Add pods watcher integration test by [@Sevenannn](https://github.com/Sevenannn) in [#5681](https://github.com/spiceai/spiceai/pull/5681)
- Add m2m support for databricks delta connector by [@ewgenius](https://github.com/ewgenius) in [#5680](https://github.com/spiceai/spiceai/pull/5680)
- Update end_game.md by [@sgrebnov](https://github.com/sgrebnov) in [#5684](https://github.com/spiceai/spiceai/pull/5684)
- Update StaticTokenProvider to use SecretString instead of raw str value by [@ewgenius](https://github.com/ewgenius) in [#5686](https://github.com/spiceai/spiceai/pull/5686)
- Add M2M Auth support for Databricks catalog connector by [@ewgenius](https://github.com/ewgenius) in [#5687](https://github.com/spiceai/spiceai/pull/5687)
- Update UX to disable acceleration federation by [@sgrebnov](https://github.com/sgrebnov) in [#5682](https://github.com/spiceai/spiceai/pull/5682)
- Improve placeholder inference (LIMIT & Expr::InSubquery) by [@phillipleblanc](https://github.com/phillipleblanc) in [#5692](https://github.com/spiceai/spiceai/pull/5692)
- Tweak default log to ignore aws_config::imds::region by [@phillipleblanc](https://github.com/phillipleblanc) in [#5693](https://github.com/spiceai/spiceai/pull/5693)
- Make Spice properly Iceberg Catalog API compatible for load table API by [@phillipleblanc](https://github.com/phillipleblanc) in [#5695](https://github.com/spiceai/spiceai/pull/5695)
- Use deterministic queries for Databricks m2m catalog tests by [@ewgenius](https://github.com/ewgenius) in [#5696](https://github.com/spiceai/spiceai/pull/5696)
- Support retrieving the latest Iceberg table on table scan by [@phillipleblanc](https://github.com/phillipleblanc) in [#5704](https://github.com/spiceai/spiceai/pull/5704)

Full Changelog: v1.2.0...v1.2.1

Spice v0.17.1-beta (August 5, 2024)

· 4 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the GetSchema API, and the s3, ftp, sftp, http, https, and databricks data connectors now support a client_timeout parameter.

Highlights in v0.17.1-beta

Flight API GetSchema: The Flight interface now supports the GetSchema API. Retrieve a schema by calling GetSchema with a PATH or CMD FlightDescriptor. A PATH descriptor returns the schema of a dataset. A CMD descriptor returns the schema of an arbitrary SQL query, which is passed as the CMD bytes.

Client Timeout: A client_timeout parameter has been added for Data Connectors: ftp, sftp, http, https, and databricks. When defined, the client timeout configures Spice to stop waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.

datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      # Example client timeout
      client_timeout: 30s
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}
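
To make the effect of a client timeout concrete, here is a minimal, generic sketch using Python's standard socket module. It is not how Spice implements client_timeout; the read_with_timeout helper and the silent local listener are hypothetical and exist only to show a client giving up after a fixed wait.

```python
import socket


def read_with_timeout(host: str, port: int, timeout_seconds: float):
    """Return the first bytes received, or None if the peer stays silent too long."""
    with socket.create_connection((host, port), timeout=timeout_seconds) as conn:
        try:
            return conn.recv(1024)
        except socket.timeout:
            # The data source did not answer within the allowed time.
            return None


# A local listener that accepts connections but never replies,
# standing in for an unresponsive data source.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))
silent_server.listen(1)
port = silent_server.getsockname()[1]

response = read_with_timeout("127.0.0.1", port, timeout_seconds=0.2)
silent_server.close()
print(response)
```

Without a timeout, the same read would block indefinitely. With one, the caller regains control and can report an error or retry.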

Breaking Changes

TLS must now be explicitly enabled. Enable TLS on the command line using --tls-enabled true:

spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or in the spicepod.yml with enabled: true:

runtime:
  tls:
    # TLS explicitly enabled
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem

Contributors

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

What's Changed

Dependencies

  • Rust: Upgraded from v1.79.0 to v1.80.0

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Discord or by email to get involved.

Spice v0.17-beta (July 29, 2024)

· 6 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the first beta release of Spice.ai OSS! 🎉

The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include databricks, spiceai, postgres, s3, odbc, and mysql. From beta to 1.0, the project will focus on improving performance and scaling to larger datasets.

This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new spice install CLI command, and several performance and stability improvements.

Highlights in v0.17-beta

  • Encryption in transit with TLS: The HTTP, gRPC, Metrics, and OpenTelemetry (OTEL) API endpoints can be secured with TLS by specifying a certificate and private key in PEM format.

Enable TLS using the --tls-certificate-file and --tls-key-file command-line flags:

spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or configure in the spicepod.yml:

runtime:
  tls:
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem

Get started with TLS by following the TLS Sample. For more details, see the TLS Documentation.

  • spice install: Running the spice install CLI command will download and install the latest version of the runtime.

spice install

  • Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.

  • Pass through arguments from spice run to runtime: Arguments passed to spice run are now passed through to the runtime.

  • Secrets replacement within connection strings: Secrets are now replaced within connection strings:

datasets:
  - from: mysql:my_table
    name: my_table
    params:
      mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db
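
Conceptually, this kind of replacement is a string substitution over ${secrets:NAME} placeholders. The sketch below shows one way it could work. It is not Spice's implementation: the resolve_secrets helper, the regular expression, and the in-memory secrets dictionary are all assumptions made for illustration.

```python
import re

# Matches placeholders of the form ${secrets:NAME}, as written in the spicepod examples.
SECRET_PATTERN = re.compile(r"\$\{secrets:([A-Za-z0-9_]+)\}")


def resolve_secrets(value: str, secrets: dict) -> str:
    """Replace every ${secrets:NAME} placeholder in `value` with its secret."""

    def lookup(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"secret not found: {name}")
        return secrets[name]

    return SECRET_PATTERN.sub(lookup, value)


# Hypothetical secret store; a real runtime would read from a configured secret source.
connection_string = resolve_secrets(
    "mysql://user:${secrets:mysql_pw}@localhost:3306/db",
    {"mysql_pw": "example-password"},
)
print(connection_string)
```

Because the placeholder sits inside the connection string, the rest of the URL (user, host, port, database) stays in plain configuration while only the password comes from the secret store.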

Breaking Changes

The odbc data connector is now optional and has been removed from the released binaries. To use the odbc data connector, use the official Spice Docker image or build the Spice runtime from source.

To build Spice from source with the odbc feature:

cargo build --release --features odbc

To use the official Spice Docker image from DockerHub:

# Pull the latest official Spice image
docker pull spiceai/spiceai:latest

# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta

Contributors

  • @y-f-u
  • @peasee
  • @digadeesh
  • @phillipleblanc
  • @ewgenius
  • @sgrebnov
  • @Sevenannn
  • @lukekim

What's Changed

Dependencies

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta

Spice v0.15-alpha (July 1, 2024)

· 4 min read
Luke Kim
Founder and CEO of Spice AI

The v0.15-alpha release introduces streaming of database changes into accelerated tables with Change Data Capture (CDC) via a new Debezium connector, configurable retry logic for data refresh, and a new C# SDK for building with Spice in Dotnet.

Highlights in v0.15-alpha

  • Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.

  • Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using refresh_retry_enabled and refresh_retry_max_attempts.

  • C# Client SDK: A new C# Client SDK has been released for developing applications in Dotnet.

Debezium data connector with Change Data Capture (CDC)

Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.

Example Spicepod using Debezium CDC:

datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
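
To show what applying changes means in practice, here is a minimal sketch of replaying Debezium-style change events onto a local table. It assumes the standard Debezium envelope fields op, before, and after. The apply_change helper, the id key column, and the sample rows are hypothetical, and this is not how Spice applies changes internally.

```python
def apply_change(table: dict, event: dict, key: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory table keyed by `key`."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Create, snapshot read, and update all carry the new row state in "after".
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        # Delete carries the removed row in "before".
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported operation: {op}")


# Hypothetical change stream for a customer_addresses table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "city": "Seattle"}},
    {"op": "c", "before": None, "after": {"id": 2, "city": "Austin"}},
    {"op": "u", "before": {"id": 1, "city": "Seattle"}, "after": {"id": 1, "city": "Portland"}},
    {"op": "d", "before": {"id": 2, "city": "Austin"}, "after": None},
]

customer_addresses = {}
for event in events:
    apply_change(customer_addresses, event)

print(customer_addresses)
```

Replaying events in order keeps the local copy in step with the source without reloading the full table.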

Data Refresh Retries

Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:

datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
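
The retry behavior can be pictured as a bounded loop around the refresh call. The sketch below is a conceptual illustration, not Spice's implementation: TransientError, refresh_with_retries, and the linear delay are assumptions, and it treats the maximum as a count of total attempts, which may differ from how refresh_retry_max_attempts is counted in Spice.

```python
import time


class TransientError(Exception):
    """Stand-in for a recoverable failure, such as a dropped connection."""


def refresh_with_retries(refresh, max_attempts, retry_enabled=True, base_delay=0.0):
    """Call `refresh`, retrying on TransientError up to `max_attempts` total attempts."""
    attempts = max_attempts if retry_enabled else 1
    for attempt in range(1, attempts + 1):
        try:
            return refresh()
        except TransientError:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            # Placeholder delay policy; a real system would likely back off more carefully.
            time.sleep(base_delay * attempt)


# Hypothetical refresh that fails twice before succeeding.
calls = {"count": 0}


def flaky_refresh():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("temporary failure")
    return "refreshed"


result = refresh_with_retries(flaky_refresh, max_attempts=10)
print(result, calls["count"])
```

Only errors classed as transient are retried here. Any other exception propagates immediately, which mirrors the idea of retrying on transient errors rather than on every failure.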

Breaking Changes

None.

New Contributors

Contributors

What's Changed

Dependencies

No major dependency updates.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha

Spice v0.14.1-alpha (June 24, 2024)

· 4 min read
Luke Kim
Founder and CEO of Spice AI

The v0.14.1-alpha release is focused on quality, stability, and type support with improvements in PostgreSQL, DuckDB, and GraphQL data connectors.

Highlights

  • PostgreSQL acceleration and data connector: Support for Composite Types and UUID data types.
  • DuckDB acceleration and data connector: Support for LargeUTF8 and DuckDB functions.
  • GraphQL data connector: Improved error handling on invalid query syntax.
  • Refresh SQL: Improved stability when overwriting STRUCT data types.

Breaking Changes

None.

New Contributors

Contributors

  • @lukekim
  • @y-f-u
  • @ewgenius
  • @phillipleblanc
  • @Jeadie
  • @sgrebnov
  • @gloomweaver
  • @phungleson
  • @peasee
  • @digadeesh

What's Changed

Dependencies

No major dependency updates.

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.0-alpha...v0.14.1-alpha
