Spice v2.0-stable (Jun 5, 2026)

· 94 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

After 53 releases since Spice 1.0-stable, Spice.ai OSS has reached the 2.0-stable milestone! 🎉

Spice v2.0.0 is the next major release of Spice and a milestone in the project's development: it advances Spice from a single-node engine into a distributed data and query platform built for enterprise AI agents. These agents need low-latency, governed access to data spread across many production systems, and because they generate their own queries autonomously, that access has to be sandboxed, observable, and able to absorb occasional heavy analytical queries without overwhelming the underlying systems. The release is headlined by multi-node distributed query, now generally available. It is multi-active, highly available, object-store-native, and built on Apache Ballista, and it distributes both query execution and ingestion across executors with data-local routing and per-executor statistics for distributed join planning. Alongside it, the Spice Cayenne data accelerator is generally available, built on the Vortex compressed columnar format, with a high-throughput CDC write path, MERGE INTO, SQL-defined partitioning, inline writes, a dedicated compaction runtime, and write-path statistics for distributed join sizing. The engine also moves to DataFusion v52 with sort pushdown, a rewritten sort-merge join, and dynamic filters, and the Spice CLI is rewritten in Rust as a single self-contained binary.

v2.0 also expands real-time and write-path capabilities across the platform: native CDC from MongoDB Change Streams and PostgreSQL WAL logical replication, durable Kafka CDC offsets, DML write-back for PostgreSQL, Snowflake, DynamoDB, Arrow, and DuckLake, DDL and MERGE INTO for Iceberg catalogs, mutual TLS across server endpoints and outbound connectors, HashiCorp Vault and Azure Key Vault secret stores, user-defined functions, hybrid search with Elasticsearch and DuckDB HNSW vector indexes, provider-aware LLM prompt caching, and the Responses API across all model providers.

Highlights in v2.0.0 include:

  • Spice Cayenne (GA): generally available on the Vortex compressed columnar format, with WAL-staged writes, inline low-latency writes, fast-path CDC deletes, merge-on-read position deletes, composite & SQL-defined partitioning, MERGE INTO, dedicated compaction runtime, and join-sizing statistics maintained on the write path
  • Multi-Active HA Distributed Query (GA): multi-node distributed query built on Apache Ballista, with object-store-native clustering, dynamic cluster sizing, distributed ingestion, data-local query routing, per-executor table statistics for distributed join planning, and async queries via /v1/queries
  • Mutual TLS (mTLS): public mTLS for HTTP and Flight, TLS cert hot-reload, and mTLS client certificates for FlightSQL and Spice.ai connectors
  • Enterprise Authentication & Authorization: OIDC bearer-token verification and Cedar-based authorization policy with per-principal row- and column-level filtering
  • New Secret Stores: HashiCorp Vault and Azure Key Vault
  • CDC Sources: native MongoDB Change Streams and PostgreSQL WAL logical replication (no Debezium or Kafka middleware required), plus durable Kafka CDC offsets
  • DML & DDL: INSERT/UPDATE/DELETE write-back for PostgreSQL, Snowflake, DynamoDB, and Arrow; CREATE TABLE/DROP TABLE and MERGE INTO for Iceberg catalogs
  • User-Defined Functions: SQL UDFs in spicepods, remote UDFs over HTTP, and optional geospatial ST_* UDFs
  • On-Demand Dataset Loading & Unified Query Cancellation: faster startup and end-to-end cancellation across HTTP, Flight, FlightSQL, and MCP
  • Dynamic HTTP Connector: OAuth2 refresh tokens, pagination, dynamic headers, subquery-driven parameters, and rate-control state persisted across restarts
  • Storage-Profile Accelerator Tuning & refresh_mode: snapshot: storage-aware acceleration defaults and point-in-time snapshot acceleration
  • Search & Vectors: Elasticsearch data connector with native hybrid search, DuckDB HNSW vector engine with a statically linked VSS extension, multi-vector MaxSim embeddings, and a rerank() UDTF
  • AI & LLM: provider-aware prompt caching, Responses API across all providers, MCP Streamable HTTP transport, and a searchable LLM tool registry
  • New Data Connectors: Elasticsearch (Alpha), GCS (Alpha), Azure Cosmos DB (Alpha), Git (RC), ADBC, DuckLake (Beta), and catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake
  • Rust CLI: single-binary spice CLI with spice query async REPL, shell completions, and --output=json
  • Dependency upgrades including DataFusion v52.5, DuckDB v1.5.3, Arrow v57.2, iceberg-rust v0.9.1, Turso v0.6.1, and Vortex v0.69

Spice v2.0 includes several breaking changes. Review the breaking changes section before upgrading.

Distribution Changes

AI/ML support, including local LLM/ML model inference and hosted LLM inference, is now included in the default Spice build and image. The separate models build variant has been removed.

With models now included by default, the data-only distribution (without AI/ML support) is only published in nightly builds. Official production-ready data-only distributions are available exclusively through Spice Cloud and the Enterprise release.

A new Network Attached Storage (NAS) distribution with built-in SMB and NFS data connector support is also available in nightly builds and with Spice.ai Enterprise.

| Distribution / Variant | Open Source | Spice Cloud | Enterprise |
|---|---|---|---|
| Default | ✅ | ✅ | ✅ |
| Data | Nightly only | ✅ | ✅ |
| NAS (SMB + NFS) | Nightly only | ❌ | ✅ |
| Metal (macOS) | ✅ | ✅ | ✅ |
| CUDA (Linux) | Nightly only | ✅ | ✅ |
| Allocator variants | Nightly only | ✅ | ✅ |
| ODBC connector | Local build only | ✅ | ✅ |

Native Windows builds are no longer provided; use WSL for local development. For more details, see the Distributions documentation.

What's New in v2.0.0

Spice Cayenne Reaches General Availability

The Spice Cayenne data accelerator is generally available in v2.0, with a major focus across the release candidates on write-path throughput, correctness, and distributed operation.

Write path & ingest:

  • Staged Append Writes: WAL-based staged append writes prevent partial writes and data loss on stream errors; batches commit atomically.
  • Inline Writes: Small writes are serialized as Arrow IPC and committed directly into the Cayenne metastore, bypassing the staged Vortex write path for low-latency ingest. Inline upserts atomically rewrite existing inline rows, inline data stays query-visible via an in-memory union scan, and rows are checkpointed to Vortex when thresholds are reached. Inline writes now also proceed with pending deletions in flight, and inline flush caps scale with available memory and storage class.
  • Fast-Path CDC Deletes: DELETE statements whose filters identify primary keys directly, including composite keys expressed as (k1, k2) IN ((...), (...)), skip the table scan entirely.
  • Merge-On-Read Position Deletes: Primary-key upsert tables use position deletes with memory-pool accounting, avoiding full-table rewrites on update-heavy workloads.
  • Resident Upsert Keysets: CDC upsert primary-key keysets stay resident between batches, avoiding per-batch full-table rebuilds.
  • CDC Sub-Batch Efficiency: Interleaved upsert/delete workloads produce fewer sub-batch splits, with last-write-wins deduplication applied within batches.
  • Dedicated Compaction Runtime: Background compaction runs on a dedicated thread pool with CDC pipelining and protected snapshots, isolating compaction work from query and ingest paths.
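
The last-write-wins deduplication mentioned under CDC Sub-Batch Efficiency can be illustrated with a minimal Python sketch. It is not Cayenne's implementation: the change-record shape (`op`, `id`, `v`) and the helper name `dedupe_last_write_wins` are hypothetical, chosen only to show the idea of keeping the final change per primary key within one batch.

```python
from typing import Any


def dedupe_last_write_wins(batch: list[dict[str, Any]], key: str = "id") -> list[dict[str, Any]]:
    """Keep only the final change per primary key, ordered by last occurrence."""
    latest: dict[Any, dict[str, Any]] = {}
    for change in batch:
        # Remove any earlier change for this key, then re-insert so that
        # dict order reflects when the winning write arrived.
        latest.pop(change[key], None)
        latest[change[key]] = change
    return list(latest.values())


# Hypothetical interleaved upsert/delete batch.
batch = [
    {"op": "upsert", "id": 1, "v": "a"},
    {"op": "upsert", "id": 2, "v": "b"},
    {"op": "delete", "id": 1},
    {"op": "upsert", "id": 2, "v": "c"},
]
print(dedupe_last_write_wins(batch))
```

Collapsing the batch this way leaves one operation per key, which is one plausible reason interleaved workloads need fewer sub-batch splits.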

Query & planning:

  • Join Filter Propagation: Filters propagate across equi-join keys, with range fallback for large join filters and IN-list rewrites.
  • Write-Path Join-Sizing Statistics: Cayenne maintains live row counts and HyperLogLog-based distinct-value estimates on the write path, so distributed JoinSelection can correctly size joins without rescans.
  • Scan-Result Cache: A new scan-result cache accelerates hot reads, with parallel Vortex partition writes and lock-free deletion caches with bloom-prefiltered probes.
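
As a rough illustration of the Join Filter Propagation bullet, the sketch below turns the keys seen on one side of an equi-join into a filter for the other side: an IN list when the key set is small, and a min/max range when it is large. The function name, the `max_in_list` threshold, and the generated SQL text are assumptions for illustration, not the engine's actual rewrite rules.

```python
def build_join_filter(column: str, build_keys: list[int], max_in_list: int = 1000) -> str:
    """Derive a probe-side filter from build-side join keys."""
    if not build_keys:
        # No build-side rows means no probe-side row can match.
        return "FALSE"
    distinct = sorted(set(build_keys))
    if len(distinct) <= max_in_list:
        # Small key set: an exact IN list prunes the most rows.
        values = ", ".join(str(k) for k in distinct)
        return f"{column} IN ({values})"
    # Large key set: fall back to a cheaper (less selective) range filter.
    return f"{column} BETWEEN {distinct[0]} AND {distinct[-1]}"


print(build_join_filter("o.customer_id", [3, 1, 3]))
print(build_join_filter("o.customer_id", [5, 9, 1, 7], max_in_list=2))
```

The range fallback is intentionally conservative: it may admit rows that do not join, but it never drops a row that does.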

SQL & catalog:

  • MERGE INTO: Upsert-style MERGE INTO for Cayenne catalog tables, distributed across executors in cluster mode.
  • PARTITION BY in SQL: Define partitioning directly in CREATE TABLE ... PARTITION BY (...); metadata is persisted in the catalog and survives restarts.
  • Composite Partitioning: partition_by: [col1, col2] with hierarchical path-like keys.
  • File-Based Retention Deletes: Time-based retention uses file-level deletes for both position-based and primary-key tables.

Correctness: Synchronized partition commits, correct NULL-sentinel handling for nullable partition expressions, tombstoned inline-checkpointed rows on upsert (preventing duplicate primary keys), and live reads through expired protected snapshots.

Multi-Active HA Distributed Query (GA)

Spice.ai Enterprise feature. See High Availability.

Distributed Query is generally available. Built on Apache Ballista, it distributes query execution across multiple active executor nodes with no single point of failure, reading directly from object storage rather than relying on a central cluster.

Distributed query supports two execution modes:

  • Synchronous: Queries for accelerated datasets are distributed across executors and results stream back in real time; best for interactive, latency-sensitive queries.
  • Asynchronous: Queries submitted via the HTTP /v1/queries API materialize results to object storage for later retrieval; best for long-running analytical and batch workloads.

Key capabilities:

  • Dynamic Cluster Sizing: The planner adjusts parallelism to the number of active executors as nodes join or leave.
  • Distributed Ingestion: Ingestion for partitioned accelerated tables is distributed across executors, with partition-aware write-through splitting scheduler-side Flight DoPut writes to the responsible executors.
  • Data-Local Query Routing: Cayenne catalog queries route to the executors holding the relevant partitions.
  • Per-Executor Table Statistics: Executors report table statistics (including NDV-aware estimates) so distributed JoinSelection can size joins correctly, fixing out-of-memory conditions on large semi-joins.
  • Readiness & Failure Detection: /v1/ready gates on a configurable executor quorum for safe rolling deployments; scheduler readiness additionally waits for executor partition loads; executor heartbeat timeout reduced from 180s to 30s.
  • Distributed DML & DDL: UPDATE/DELETE forwarding to all executors, executor DDL sync for late joiners, and distributed MERGE INTO.
  • Cluster Observability: New cluster metrics (including scheduler_active_executors_count), distributed runtime.task_history replication, and a Grafana dashboard.
  • Ballista S3 Shuffle: Async queries with runtime.params.shuffle_location: s3://... complete reliably with executor-environment-derived S3 clients.
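
The Readiness & Failure Detection behaviour above can be pictured with a small Python sketch: an executor counts as active while its last heartbeat is within the timeout, and readiness requires at least a quorum of active executors. Only the 30-second timeout comes from the release notes; the function names, the heartbeat map, and the quorum values are hypothetical.

```python
HEARTBEAT_TIMEOUT_S = 30.0  # executor heartbeat timeout stated in the release notes


def active_executors(last_heartbeat: dict[str, float], now: float,
                     timeout: float = HEARTBEAT_TIMEOUT_S) -> list[str]:
    """Return executors whose most recent heartbeat is within the timeout."""
    return sorted(name for name, ts in last_heartbeat.items() if now - ts <= timeout)


def is_ready(last_heartbeat: dict[str, float], now: float, quorum: int) -> bool:
    """Report ready only when at least `quorum` executors are active."""
    return len(active_executors(last_heartbeat, now)) >= quorum


# Hypothetical heartbeat timestamps in seconds.
heartbeats = {"executor-a": 100.0, "executor-b": 75.0, "executor-c": 60.0}
print(active_executors(heartbeats, now=100.0))
print(is_ready(heartbeats, now=100.0, quorum=2))
```

With a shorter timeout, a failed executor drops out of the active set sooner, so both readiness and planner parallelism react faster to node loss.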

Security: Mutual TLS, Secret Stores, and Hardening

Several capabilities in this section are Spice.ai Enterprise features. See Enterprise Security.

Mutual TLS across the platform:

  • Public mTLS for HTTP and Flight: client_auth_mode: request (optional, for migration windows) or required (strict) client-certificate verification.
  • TLS Cert Hot-Reload: The runtime reloads TLS certificates on SIGHUP for zero-downtime rotation.
  • Outbound mTLS Client Certificates: FlightSQL and Spice.ai data connectors present client certificates to upstream services; the spice sql REPL supports mTLS client auth.
runtime:
  tls:
    enabled: true
    certificate_file: /etc/spice/tls/server.crt
    key_file: /etc/spice/tls/server.key
    client_auth_mode: required
    client_auth_ca_file: /etc/spice/tls/client-ca.crt

Authentication & Authorization (Spice.ai Enterprise):

  • OIDC Authentication: Validate OIDC bearer tokens (JWTs) issued by enterprise identity providers (Microsoft Entra ID, Okta, Auth0, AWS Cognito, and Google) for secure access to runtime endpoints, standalone or combined with API keys.
  • Principal-Based Policy Enforcement: Fine-grained, Cedar-based authorization policy configured under runtime.authorization governs allow/deny access across datasets, models, tools, and endpoints. Combined with identity SQL functions (current_principal(), current_principal_email(), current_principal_groups()), policies enforce per-principal row-level filtering and column masking.

New Secret Stores: HashiCorp Vault (KV v1/v2; token, approle, kubernetes, and jwt auth with automatic lease renewal) and Azure Key Vault (service principal, managed identity, workload identity, Azure CLI, or auto-detect; sovereign cloud support).

Hardening:

  • Read-only API Key Enforcement on the Flight DoGet path and async query endpoints.
  • Per-Principal Cache Namespacing: SQL, search, and caching-accelerator caches are namespaced per authenticated principal so cached results never cross identity boundaries.
  • API Key Timing Leak & Remote-UDF SSRF: Closed a timing-based position-disclosure leak in API key comparison and blocked SSRF via remote UDF endpoints.
  • Snowflake Function Deny-List: A function deny-list is enforced in Snowflake federation pushdown, and Snowflake account identifiers and auth configuration are validated at startup.
  • MCP allowed_hosts: MCP servers can be restricted to an explicit allowlist of upstream hosts.
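
To show the class of problem behind the API key timing fix, here is a minimal Python sketch of a constant-time comparison. It is a generic illustration, not the runtime's code: a naive `==` can return as soon as the first differing character is found, which leaks how much of a guess was correct, whereas `hmac.compare_digest` takes time independent of where the inputs differ. The helper name `api_key_matches` is hypothetical.

```python
import hashlib
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare two API keys without leaking the position of the first mismatch."""
    # Hash both sides first so the compared values always have equal length.
    presented_digest = hashlib.sha256(presented.encode("utf-8")).digest()
    expected_digest = hashlib.sha256(expected.encode("utf-8")).digest()
    # compare_digest runs in time independent of where the inputs differ.
    return hmac.compare_digest(presented_digest, expected_digest)


print(api_key_matches("secret-key", "secret-key"))
print(api_key_matches("secret-kez", "secret-key"))
```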

Change Data Capture (CDC) Sources

See Change Data Capture (CDC) for an overview of CDC in Spice.

  • MongoDB Change Streams: MongoDB datasets with refresh_mode: changes stream changes natively into any local accelerator, with no Debezium or Kafka required.
  • PostgreSQL Native Replication (WAL): PostgreSQL datasets stream INSERT/UPDATE/DELETE directly from logical replication using pgoutput decoding, with automatic per-replica slot management, an initial REPEATABLE READ bootstrap snapshot, and durable LSN acknowledgement.
  • Kafka CDC Offset Persistence: Kafka CDC offsets persist in sidecar tables for durable, resumable streams across restarts and failovers.
  • Pipelined CDC Ingestion: Source reads overlap with batch apply, with envelope coalescing and improved nullability propagation.
  • Debezium Schema Evolution: Schema changes in Debezium-sourced datasets no longer break dataset initialization on reload.
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_db: mydb
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes
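
The idea behind durable, resumable CDC offsets (as in the Kafka CDC Offset Persistence bullet above) can be sketched in a few lines of Python. This is a conceptual model only: the `sidecar` dict stands in for a persisted sidecar table, and the event shape `(offset, row)` and the function name are assumptions. The point is that recording the last applied offset lets a restarted consumer skip events it has already applied.

```python
def apply_events(sidecar: dict, table: list, events: list[tuple[int, str]]) -> None:
    """Apply CDC events once each, tracking the last applied offset."""
    committed = sidecar.get("offset", -1)
    for offset, row in events:
        if offset <= committed:
            # Already applied before the restart; skip to avoid duplicates.
            continue
        table.append(row)
        committed = offset
        sidecar["offset"] = committed  # stands in for a durable write


sidecar: dict = {}
table: list = []
apply_events(sidecar, table, [(0, "a"), (1, "b")])
# After a simulated restart, the source replays from an earlier offset.
apply_events(sidecar, table, [(1, "b"), (2, "c")])
print(table, sidecar)
```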

DML, DDL, and Write-Back

Spice v2.0 turns more connectors and catalogs into full read/write tables:

  • PostgreSQL DML: INSERT, UPDATE, and DELETE write-back on PostgreSQL datasets, with foreign-key metadata exposed via the PostgreSQL catalog connector.
  • Snowflake DML: INSERT, UPDATE, and DELETE write-back on Snowflake datasets.
  • DynamoDB DML: INSERT, UPDATE, and DELETE for DynamoDB, complementing read and CDC streaming.
  • Arrow Primary Key Upserts: Native update-or-insert semantics for in-memory Arrow-accelerated tables.
  • DDL for Iceberg: CREATE TABLE and DROP TABLE via FlightSQL and /v1/sql for Iceberg, with catalog.access: read_write_create.
  • DuckLake INSERT: DuckLake catalog tables with read_write access support INSERT.

SQL & User-Defined Functions

See the SQL Reference for the full SQL surface area.

  • User-Defined Functions: Define reusable SQL UDFs as first-class spicepod components, or invoke remote functions over HTTP (Spice.ai Enterprise), plus user-defined table functions.
  • Spatial SQL UDFs: Optional geospatial ST_* UDFs for geometry workloads.
  • JSON UDTFs: flatten_json, json_tree, and flatten_json_properties table-valued functions for JSON transformation and schema decomposition (with options such as expand_maps). See JSON Functions and Operators.
  • PostgreSQL Metadata UDFs: Dataset and column descriptions are exposed via PostgreSQL-compatible UDFs (obj_description, col_description), so BI tools and psql surface Spice metadata.
  • FlightSQL Substrait Plans: CommandStatementSubstraitPlan support for clients submitting Substrait-encoded plans.
  • SQL REPL Expanded View: Toggle \x for a vertical key-value layout on wide result sets.
  • Prepared statement, federation, and unparsing fixes across the engine, including keeping correlated subqueries out of JOIN ON conditions for Spice Cloud federation and correct EXISTS/NOT EXISTS subquery handling in the federation analyzer.

Runtime Features

  • On-Demand Dataset Loading: Datasets can be deferred: registered with a declared schema at startup (columns[].type, columns[].nullable) and fully resolved on first reference, reducing startup time and memory for large spicepods.
  • Unified Query Cancellation: HTTP, Flight, FlightSQL, MCP, and internal execution paths honour a unified cancellation signal: disconnects, REPL Ctrl-C, and cancelled HTTP requests cancel the query end-to-end.
  • Storage-Profile Accelerator Tuning: acceleration.storage_profile (auto, local_ssd, ebs, tmpfs) applies storage-aware defaults across DuckDB, SQLite, Turso, and Cayenne file-mode accelerators; auto detects the backing storage.
  • refresh_mode: snapshot (Spice.ai Enterprise): Point-in-time snapshot acceleration with SQLite/Turso WAL flushing and Cayenne metastore slice integration, now reporting accurate readiness when no snapshot exists yet.
  • Structured Component Errors: /v1/datasets?status=true and /v1/models?status=true return structured error objects (category, type, code) and human-readable error_message fields; the CLI shows an ERROR column.
  • Actionable Config Errors: Parameter typos, missing secret references, and unknown engine names produce specific, actionable errors with suggestions.

Spicepod v2

Spicepods now support version: v2, the default for spice init, while v1 spicepods continue to work with automatic migration of deprecated fields.

| Version | Status |
|---|---|
| v2 | Default. Used by spice init. |
| v1 | Supported. Deprecated fields auto-migrate. |
| v1beta1 | Removed. No longer accepted. |

| v1 (deprecated) | v2 (preferred) | Notes |
|---|---|---|
| runtime.results_cache | runtime.caching.sql_results | All fields migrate automatically. cache_max_size → max_size. |
| runtime.memory_limit | runtime.query.memory_limit | Auto-migrated. query.memory_limit takes priority if both set. |
| runtime.temp_directory | runtime.query.temp_directory | Auto-migrated. query.temp_directory takes priority if both set. |
| dataset.invalid_type_action | dataset.unsupported_type_action | Auto-migrated. v2 adds a new string variant. |
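
The runtime-level migrations listed above can be modelled with a short Python sketch operating on a parsed spicepod `runtime` mapping. It only mirrors the documented renames; it is not the runtime's migration code. The function name and sample values are hypothetical, and letting an existing v2 `caching.sql_results` block win over `results_cache` is an assumption (the notes state the priority rule only for `memory_limit` and `temp_directory`).

```python
import copy


def migrate_v1_runtime(runtime: dict) -> dict:
    """Return a copy of a v1 `runtime` mapping using the v2 field names."""
    out = copy.deepcopy(runtime)
    if "results_cache" in out:
        legacy_cache = out.pop("results_cache")
        # cache_max_size is renamed to max_size; other fields carry over.
        if "cache_max_size" in legacy_cache:
            legacy_cache["max_size"] = legacy_cache.pop("cache_max_size")
        out.setdefault("caching", {}).setdefault("sql_results", legacy_cache)
    for field in ("memory_limit", "temp_directory"):
        if field in out:
            legacy_value = out.pop(field)
            # setdefault keeps an existing query.<field>, so v2 takes priority.
            out.setdefault("query", {}).setdefault(field, legacy_value)
    return out


# Hypothetical v1 configuration with one field set in both styles.
v1_runtime = {
    "results_cache": {"enabled": True, "cache_max_size": "128MiB"},
    "memory_limit": "8GiB",
    "temp_directory": "/tmp/spice",
    "query": {"memory_limit": "4GiB"},
}
print(migrate_v1_runtime(v1_runtime))
```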

New v2 fields include runtime.ready_state, runtime.query.spill_compression, runtime.caching.sql_results.stale_while_revalidate_ttl, runtime.caching.sql_results.encoding, scheduler partition-assignment configuration, and catalog.access: read_write_create.

Data Connectors & Catalogs

New connectors:

  • Elasticsearch (Alpha, Spice.ai Enterprise): Query Elasticsearch indexes as SQL tables with native hybrid search (vector_search() kNN, text_search() BM25, and rrf() fusion), plus Elasticsearch as a backing vector engine, direct FTS engine configuration, and index lifecycle controls.
  • GCS (Alpha): Federated queries against Google Cloud Storage, with Iceberg table support.
  • Azure Cosmos DB (Alpha): Read-only NoSQL / Core SQL API connector with cross-partition scans and schema inference.
  • Git (RC): HTTPS/SSH auth, Git LFS support, and per-repo connection resilience.
  • ADBC: Data connector and catalog with full query federation, BigQuery support, and schema/table discovery.
  • DuckLake (Beta): Lakehouse-style data management with DuckDB as the metadata catalog and object storage for data, providing ACID transactions, time travel, and schema evolution on Parquet.
  • Self-Hosted Spice Connector: Connect Spice to another self-hosted Spice runtime as a federated source.

New catalog connectors for PostgreSQL, MySQL, MSSQL, and Snowflake, using native metadata catalogs for schema and table discovery. Unity Catalog compatibility extends to OSS Unity Catalog deployments, and DDL-defined catalogs can expose and query views.

HTTP connector: OAuth2 refresh-token authentication, query-parameter and no-limit pagination, dynamic request headers parameterised from query predicates, subquery-driven request parameters for fan-out queries, response metadata as queryable columns, map-to-array conversion, shared and persistent rate-control state across restarts and replicas, no caching of transient 429/5xx errors, and a correctly populated fetched_at column.

JSON ingestion: Single-object documents, JSONL, BOM-prefixed input, Socrata SODA responses, format auto-detection, and RFC 6901 json_pointer extraction of nested payloads.

Databricks: Resilience controls, Unity Catalog-aware permission prechecks with structured advisory errors, Classic SQL Warehouse foreign-table compatibility, connect_timeout/client_timeout parameters, a Databricks SQL dialect for federation, and Delta Lake column mapping (Name and Id modes).

Other connector improvements: MongoDB SRV support; MySQL mysql_zero_date_behavior; Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, and TIMESTAMP_LTZ types plus key-pair auth; ClickHouse Date32; S3 s3_url_style for path-style addressing and faster Parquet reads; GraphQL custom auth headers; Oracle and MSSQL sort/limit pushdown; GitHub GraphQL resilience; and improved Kafka reliability.

AI & LLM

  • Provider-Aware Prompt Caching: LLM calls automatically use provider-side prompt caching (e.g., Anthropic, OpenAI) for system prompts and tool descriptions, reducing latency and cost.
  • Responses API Across All Providers: The Responses API works with every configured model provider, including streaming response.output_text.delta events and Authorization: Bearer header support.
  • Multi-Vector Embeddings with MaxSim: List-of-string columns produce one embedding per element with MaxSim/mean/sum scoring for ColBERT-style late-interaction retrieval, plus a _match column identifying the best-matching element.
  • rerank() UDTF: Reorder results from vector_search, text_search, or rrf using any registered chat model as a reranker, with automatic query propagation and pushdown support.
  • Searchable LLM Tool Registry: Agents discover tools via semantic search instead of enumerating every tool in the system prompt.
  • MCP Improvements: Streamable HTTP transport (/v1/mcp) on rmcp v1.5.0, native auth for streamable HTTP tools (mcp_auth_token, mcp_headers), external MCP server tool calls traced in task history, and configurable allowed_hosts.
  • Per-Model Rate-Limited AI UDF Execution for controlling concurrent AI function invocations.
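
The multi-vector scoring described in the MaxSim bullet can be sketched in plain Python. This is an illustration under stated assumptions, not Spice's implementation: it assumes a single query embedding compared by cosine similarity against one embedding per list element, and returns the index of the best-matching element as a stand-in for what the `_match` column identifies. The function names and vectors are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def score_elements(query: list[float], elements: list[list[float]],
                   mode: str = "maxsim") -> tuple[float, int]:
    """Score a multi-vector value; also return the best-matching element index."""
    sims = [cosine(query, element) for element in elements]
    best = max(range(len(sims)), key=sims.__getitem__)
    if mode == "maxsim":
        score = sims[best]
    elif mode == "mean":
        score = sum(sims) / len(sims)
    elif mode == "sum":
        score = sum(sims)
    else:
        raise ValueError(f"unknown scoring mode: {mode}")
    return score, best


# Hypothetical 2-dimensional embeddings for a three-element list column.
elements = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
print(score_elements([1.0, 0.0], elements, mode="maxsim"))
```

MaxSim rewards a row when any single element matches strongly, while mean and sum reward rows whose elements match broadly.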

Search & Vectors

  • DuckDB Vector Engine: vector_engine: duckdb uses DuckDB's HNSW index for fast approximate nearest-neighbor search without an external vector store. In v2.0.0, the DuckDB VSS extension is statically linked into the bundled DuckDB, so HNSW vector search works out-of-the-box on clean machines with no extension download. HNSW indexes are preserved across data refresh, and cosine_distance pushes down via array_cosine_distance.
  • Hybrid Search: Combine kNN vector search and BM25 full-text search with reciprocal rank fusion (rrf()), backed by Tantivy, Elasticsearch, or DuckDB.
  • Full-Text Search Performance: Significantly faster Tantivy ingestion with rollback-on-error, and search metadata is correctly preserved on indexing and in Vortex physical schema calculation.
  • Embedding Validation: row_id columns are validated during dataset initialization.
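
Reciprocal rank fusion, which the Hybrid Search bullet uses to combine vector and full-text results, is simple enough to sketch directly. The Python below implements the standard formula (each document scores the sum of 1 / (k + rank) over the rankings it appears in); the constant k = 60 is the commonly used default and is an assumption here, as are the function name and the sample document IDs. It does not describe how `rrf()` is implemented inside Spice.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one scored list."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute the most.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["a", "b", "c"]  # hypothetical kNN result order
text_hits = ["b", "d", "a"]    # hypothetical BM25 result order
fused = reciprocal_rank_fusion([vector_hits, text_hits])
print([doc_id for doc_id, _ in fused])
```

Because only ranks are used, the two searches do not need comparable score scales, which is why the method suits mixing kNN distances with BM25 scores.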

Caching

Improvements across Caching:

  • Stale-While-Revalidate: runtime.caching.sql_results.stale_while_revalidate_ttl serves stale results while revalidating in the background.
  • Cache Encoding: Optional compression (e.g., zstd) for SQL results cache entries.
  • Retention Policies for cached query results, and improved CDC-driven cache invalidation (including view plan invalidation on updates).
  • Idle Cache Maintenance: Periodic maintenance drains invalidation predicates on idle caches, fixing unbounded memory growth in rarely-read caches.
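
The stale-while-revalidate behaviour above can be summarised as a three-way decision on the age of a cache entry. The Python sketch below assumes the stale window begins when the normal TTL ends (the convention used by HTTP caching); whether Spice measures `stale_while_revalidate_ttl` in exactly this way is an assumption, and the function name is hypothetical.

```python
def classify_entry(age_s: float, ttl_s: float, swr_ttl_s: float) -> str:
    """Classify a cache entry as 'fresh', 'stale', or 'expired'."""
    if age_s < 0:
        raise ValueError("age must be non-negative")
    if age_s <= ttl_s:
        return "fresh"    # serve from cache, no refresh needed
    if age_s <= ttl_s + swr_ttl_s:
        return "stale"    # serve from cache now, refresh in the background
    return "expired"      # too old to serve; the query must be re-executed


for age in (5, 12, 40):
    print(age, classify_entry(age, ttl_s=10, swr_ttl_s=20))
```

The stale state is what hides refresh latency from callers: they receive the slightly old result immediately while the background revalidation repopulates the entry.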

Performance & Query Engine

Apache DataFusion is upgraded to v52.5 over the course of the release cycle, bringing:

  • Sort Pushdown to Scans: ~30x faster top-K queries on pre-sorted data; Parquet scans reverse row-group order for DESC on ASC-sorted files.
  • Rewritten Sort-Merge Join: Up to three orders of magnitude faster in pathological cases (e.g., TPC-H Q21: minutes → milliseconds).
  • Dynamic Filters: MIN/MAX aggregates and hash-join build sides prune files, row groups, and rows during execution.
  • Faster CASE Expressions, statistics caching, and prefix-aware list-files caching for faster planning.
  • TableProvider DELETE/UPDATE hooks and the RelationPlanner API for extensible SQL planning.
  • Strict Overflow Handling: try_cast_to errors on overflow instead of silently producing NULLs.
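
To give a feel for how a dynamic filter prunes work during a MIN aggregate, the following Python sketch keeps a running minimum and skips any row group whose minimum statistic cannot beat it. It is a simplified model, not DataFusion code: the `RowGroup` type, its `min_stat` field, and the function name are hypothetical, and each group is assumed to be non-empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowGroup:
    min_stat: int       # minimum value recorded in the group's metadata
    values: list[int]   # the actual rows (assumed non-empty)


def pruned_min(groups: list[RowGroup]) -> tuple[Optional[int], int]:
    """Return (minimum value, number of row groups actually scanned)."""
    current: Optional[int] = None
    scanned = 0
    for group in groups:
        # Dynamic filter: only values below the running minimum matter, so a
        # group whose min statistic is not below it can be skipped unread.
        if current is not None and group.min_stat >= current:
            continue
        scanned += 1
        group_min = min(group.values)
        if current is None or group_min < current:
            current = group_min
    return current, scanned


groups = [
    RowGroup(5, [5, 9]),
    RowGroup(7, [7, 8]),
    RowGroup(2, [2, 6]),
    RowGroup(2, [2, 3]),
]
print(pruned_min(groups))
```

The filter tightens as execution proceeds, which is the sense in which it is dynamic: it cannot be computed at planning time.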

Additional engine work: default query memory limit raised from 70% to 90% with GreedyMemoryPool, partial aggregation optimization for FlightSQLExec, improved partitioned query planning, and metastore transaction support to prevent concurrent conflicts.

Rust CLI

The Spice CLI is completely rewritten from Go to Rust: a single spice binary built from the same codebase as spiced, with full feature parity across 27+ commands.

  • spice query: Interactive REPL for async queries with multi-line SQL, progress indication, and cancellation.
  • spice dataset configure: Non-interactive flag-based configuration (--from, --description, --param KEY=VALUE, --set) alongside interactive prompts.
  • spice completions: Shell completion script generation.
  • --output=json: Machine-readable output for scripting; spice login --output adds env, json, and keychain modes.
  • spice init writes a yaml-language-server schema directive for IDE completions.

Observability

  • OpenTelemetry: Exporter fixes, authenticated metrics export, configurable metric name prefix (runtime.telemetry.metric_prefix), delta temporality by default, and OTLP resource attributes via runtime.telemetry.properties.
  • Query Metrics: The query_executions metric gains a datasets dimension for per-dataset query attribution.
  • Ingestion Metrics: rows_written, bytes_written, and dataset_acceleration_size_bytes for acceleration refresh and Flight DoPut/ADBC ingestion, and EXPLAIN ANALYZE metrics in FlightSQLExec.
  • Task History: Distributed task history in cluster mode and tracing for external MCP server tool calls.

Notable Bug Fixes

  • localpod synchronization: localpod child datasets correctly track parent refreshes when the parent uses the in-memory Arrow accelerator.
  • Spice Cloud federation: Correlated subqueries are kept out of JOIN ON conditions, fixing rejected federated queries.
  • refresh_mode: snapshot: No longer reports Ready with empty data when no snapshot exists.
  • Search metadata: Field and schema metadata preserved on search indexing and in Vortex physical schema calculation.
  • HTTP connector: fetched_at column is correctly populated.
  • Connector correctness: DynamoDB Streams transient-error retries and typed-NULL DML handling; ScyllaDB physical filter pushdown disabled to fix incorrect results; MSSQL TOP N pushdown; DuckDB DELETE/UPDATE on full and caching refresh modes; Turso checked arithmetic for timestamp conversions; ODBC queries no longer silently return 0 rows on failure; Flight GetFlightInfo/DoGet schema parity.

Dependency Updates

| Dependency / Component | Version |
|---|---|
| DataFusion | v52.5 |
| Ballista | v52 |
| Arrow (arrow-rs) | v57.2 |
| DuckDB | v1.5.3 (with statically linked VSS) |
| iceberg-rust | v0.9.1 |
| Turso (libsql) | v0.6.1 |
| Vortex | v0.69.0 |
| delta_kernel | v0.18.2 |
| rmcp (MCP) | v1.5.0 |
| mistral.rs | v0.8.x (candle v0.10.1) |
| ADBC Core | v0.23 |
| Rust toolchain | v1.94.1 |

Contributors

Breaking Changes

  • Models included by default: The separate models build variant has been removed. Local LLM inference is always included in the default build and image.

  • Windows native builds removed: Use WSL for local development.

  • Spicepod version defaults to v2: spice init creates version: v2 spicepods. v1 remains supported with auto-migration; v1beta1 is no longer accepted.

  • Flattened runtime.scheduler configuration: The nested runtime.scheduler.partition_management block is flattened and renamed:

    # Before
    runtime:
      scheduler:
        partition_management:
          interval: 30s
          max_assignments_per_cycle: 16
          discovery_timeout: 10s

    # After
    runtime:
      scheduler:
        partition_assignment_interval: 30s
        max_assignments_per_interval: 16
        partition_discovery_timeout: 10s
  • S3 metadata columns renamed: location, last_modified, size → _location, _last_modified, _size.

  • Default query memory limit changed: Increased from 70% to 90%.

  • Metric renames: accelerated_refresh metrics renamed to acceleration_refresh; last_refresh_time gauge renamed to include the milliseconds unit.

  • DuckDB parameter rename: partitioned_write_flush_threshold → partitioned_write_flush_threshold_rows.

  • /v1/search API: Always returns an array in matches, even for single results.

  • /v1/evals API removed.

  • Perplexity model provider removed.

  • x.ai model endpoint: x.ai models exclusively use the /v1/responses endpoint.

Upgrade Guide from v1.x

Most v1 spicepods continue to work on v2.0 (v1 remains supported and deprecated fields auto-migrate at load time), so many deployments can upgrade by updating the binary or image alone. The steps below cover the breaking changes that may require manual action. Review each before upgrading a production deployment.

1. Build, image, and platform changes

  • Models are now included by default. The separate models build variant (and the corresponding -models image tags) has been removed; local LLM inference is always included in the default build and image. If your deployment pinned a models build or -models-tagged image, switch to the default build/image.
  • Native Windows builds are removed. Use WSL for local Windows development.

2. Update the spicepod version

spice init now creates version: v2 spicepods. v1 spicepods remain supported with automatic migration, but v1beta1 is no longer accepted. To move to v2, set version: v2 and update the following fields. Each auto-migrates from v1, but updating now clears the deprecation:

| v1 (deprecated) | v2 (preferred) |
|---|---|
| runtime.results_cache | runtime.caching.sql_results (cache_max_size → max_size) |
| runtime.memory_limit | runtime.query.memory_limit |
| runtime.temp_directory | runtime.query.temp_directory |
| dataset.invalid_type_action | dataset.unsupported_type_action |

3. Update changed configuration

  • DuckDB parameter rename: partitioned_write_flush_threshold → partitioned_write_flush_threshold_rows.
  • Default query memory limit raised from 70% to 90%. If you relied on the previous default to leave headroom for other processes on the host, set it explicitly via runtime.query.memory_limit.

4. Update queries and API clients

  • S3 metadata columns renamed: location, last_modified, size → _location, _last_modified, _size. Update any queries that reference these columns.
  • /v1/search always returns an array in matches, even for a single result. Update clients that assumed a scalar value.
  • /v1/evals API removed. Remove integrations that depend on it.

5. Update model providers

  • Perplexity model provider removed. Re-point affected models to another provider.
  • x.ai models use the /v1/responses endpoint exclusively. Ensure x.ai integrations target the Responses API.

6. Update observability

  • Metric renames: accelerated_refresh → acceleration_refresh, and the last_refresh_time gauge is renamed to include the milliseconds unit. Update dashboards and alerts that reference these metric names.

After updating, restart the runtime and verify datasets and models report ready via /v1/datasets?status=true and /v1/models?status=true (the CLI shows a Ready/ERROR column).

Cookbook Updates

New Spice Cookbook recipes added during the v2.0 release cycle:

The Spice Cookbook includes more than 100 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0 image:

docker pull spiceai/spiceai:2.0.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • Add TPC-DS integration tests with S3 source and PostgreSQL acceleration by @phillipleblanc in #9006
  • fix(tests): fix flaky/slow/failing unit tests by @phillipleblanc in #9009
  • fix: Update benchmark snapshots for DF51 upgrade by @app/github-actions in #9008
  • fix: add feature gate to rrf TEST_EMBEDDING_MODEL by @phillipleblanc in #9017
  • fix: features check by @phillipleblanc in #9014
  • fix: Enable Cayenne acceleration snapshots by @lukekim in #9020
  • URL table support by @lukekim in #9018
  • ScyllaDB key filter by @lukekim in #8997
  • fix: Schema mismatch when using column projection with HTTP caching by @phillipleblanc in #9021
  • Add more tests for HTTP caching with columns selection by @sgrebnov in #9025
  • HTTP cache snapshots: default to time_interval and fix snapshots_creation_policy: on_change by @sgrebnov in #9026
  • Fix duplicate snapshot creation on startup by @sgrebnov in #9029
  • Add ScyllaDB and SMB to the README table by @krinart in #9034
  • Remove waiting for runtime to be ready before creating snapshot by @krinart in #9033
  • Fix snapshot on_change policy to skip when no writes occurred by @sgrebnov in #9028
  • Release notes for release release/1.11.0-rc.2 by @krinart in #9016
  • ci: use arduino/setup-protoc for official protobuf compiler by @phillipleblanc in #9036
  • ci: install unzip on aarch64 runner for arduino/setup-protoc by @phillipleblanc in #9038
  • fix: don't fail release if upload to minio fails by @phillipleblanc in #9039
  • Add missing protoc step to setup-cc action by @krinart in #9041
  • fix: Update Search integration test snapshots by @app/github-actions in #9013
  • Fix formula_1 and codebase_community in bird-bench by @Jeadie in #9000
  • Cayenne S3 Express One Zone improvements by @lukekim in #9015
  • Add zlib1g-dev to CI by @lukekim in #9052
  • Improve validation and logging for hash indexes by @lukekim in #9047
  • Upgrade Vortex with CASE-WHEN by @lukekim in #9051
  • x.ai models now exclusively use /v1/responses endpoint by @lukekim in #9400
  • Improvements for snapshot schema comparison by @krinart in #9401
  • v2.0 breaking changes by @lukekim in #9233
  • Create PartitionManagementTask for scheduler to update accelerated table partition assignments by @Jeadie in #9378
  • refactor(Cayenne): route all write orchestration through CayenneDataSink by @sgrebnov in #9402
  • Refactor benchmark to use QueryExecutor trait by @Jeadie in #9418
  • feat: Add spidapter build and release workflow by @peasee in #9427
  • Testoperator: add support for api-key when connecting to external spice instance by @sgrebnov in #9421
  • Initial implementation of Ducklake catalog & data connectors by @lukekim in #9083
  • Require aws_lc_rs since jsonwebtoken upgrade by @Jeadie in #9426
  • feat: Add spidapter tool by @peasee in #9425
  • Add release notes for 1.11.2 patch release by @sgrebnov in #9430
  • feat(spidapter): integrate system-adapter-protocol with SCP provisioning by @phillipleblanc in #9434
  • Add DuckLake TPCH E2E workflow and federated Spicepod configuration by @lukekim in #9431
  • fix(spidapter): use Flight handshake auth instead of x-api-key header by @phillipleblanc in #9435
  • [spidapter] Keep only what sparks joy by @Jeadie in #9439
  • Refactor binary operator balancing by @Jeadie in #9424
  • feat: Add Iceberg DDL support (CREATE TABLE / DROP TABLE) for default catalog override by @phillipleblanc in #9440
  • Fix Flight SQL schema consistency: expand view types and verify field names by @sgrebnov in #9438
  • Update spidapter for new system-adapter-protocol by @sgrebnov in #9442
  • docs: fix typos and syntax errors in style guide and error handling docs by @cluster2600 in #9445
  • Add acceleration refresh ingestion metrics (rows_written, bytes_written) by @phillipleblanc in #9461
  • Refactor(Cayenne): Replace CatalogError and string based errors with Snafu errors by @sgrebnov in #9403
  • Replace deprecated claude-3-5-haiku-latest with claude-haiku-4-5 by @Jeadie in #9492
  • Fix #9481: Preserve schema in results cache for empty query results by @phillipleblanc in #9485
  • Fix partition by serializing by @Jeadie in #9474
  • query: reconcile execution stream nullability with logical plan schema by @phillipleblanc in #9486
  • initial spice-cloud-client crate and spice cloud metrics --app <app-name>. by @Jeadie in #9480
  • feat: Return dataset error message in datasets API by @peasee in #9487
  • Spicebench by @lukekim in #9447
  • build(deps): consolidate dependabot dependency updates by @phillipleblanc in #9504
  • fix(cluster): route non-partitioned accelerated tables in distributed mode by @phillipleblanc in #9508
  • Enable core scalar UDFs in refresh SQL by @sgrebnov in #9502
  • Fix metrics in Spidapter again by @Jeadie in #9497
  • fix(cluster): tolerate Completed->status propagation race in distributed query handle by @phillipleblanc in #9510
  • feat: Support distributed ingestion in cayenne catalog by @peasee in #9506
  • Fix Cayenne duplicate primary keys after DELETE + UPSERT CDC sequences by @krinart in #9494
  • fix(cluster): rewrite table scans inside subqueries for distributed execution by @phillipleblanc in #9518
  • fix: Set catalog mode to readwritecreate in spidapter by @peasee in #9519
  • Upgrade AWS SDK crates & set APN user-agent in AWS SDK credential bridge by @lukekim in #8328
  • feat(runtime): add runtime ready_state on_registration semantics by @lukekim in #9522
  • fix: Add spidapter post-setup retries by @peasee in #9526
  • Make partition discovery more robust and make initialization non-blocking by @sgrebnov in #9499
  • Make lint-rust-fix support targeted packages and features by @Jeadie in #9511
  • Handle new Cloud SCP API by @Jeadie in #9532
  • Refactor and simplify streaming benchmarks by @krinart in #9405
  • fix: ensure spidapter only increments attempts on failures by @peasee in #9534
  • feat: Support specifying app resources in spidapter by @peasee in #9536
  • test(runtime): Spice Cayenne DDL integration test by @lukekim in #9535
  • fix: Handle schema evolution mismatch errors during data refresh by @lukekim in #9527
  • fix: resolve clippy lint warnings by @phillipleblanc in #9547
  • pr-builds --tag <TAG> for build_and_release.yml by @Jeadie in #9507
  • Add --output flag to spice login with env/json/keychain modes by @Jeadie in #9541
  • Don't use 'PartitionedTableScanRewrite' in async distributed query by @Jeadie in #9548
  • feat(spidapter): add local backend mode with single executor by @phillipleblanc in #9531
  • support chat template in HF by @Jeadie in #9543
  • fix(cayenne): stream PK retention deletes and run OOM regression in CI by @phillipleblanc in #9533
  • cayenne: Staged append writes to prevent partial writes and data loss on stream error by @sgrebnov in #9491
  • AcceleratedTable::scan use FederatedTable::scan when ClusterRole::Scheduler by @Jeadie in #9550
  • Upgrade to delta-kernel-rs v0.18.2 by @lukekim in #9528
  • Run cayenne tests as part of PR CI by @sgrebnov in #9554
  • Upgrade to DataFusion v52.2.0 by @lukekim in #9419
  • Remove Snapshot Compaction + Add snapshot existence check by @krinart in #9523
  • Update dependencies by @lukekim in #9566
  • fix: Update benchmark snapshots by @app/github-actions in #9565
  • fix: Compare Cayenne table configuration on startup by @peasee in #9529
  • Make Refresh::refresh_sql more robust to alterations over time. by @Jeadie in #9549
  • fix: Update datafusion-table-providers dependency to latest revision by @lukekim in #9574
  • Unset AWS_ENDPOINT_URL when empty by @krinart in #9575
  • fix: allow BytesProcessedExec repartitioning for unordered input by @lukekim in #9540
  • Sanitize DataFusion errors by @lukekim in #9530
  • Add conditional logging for partition assignments by @Jeadie in #9577
  • use 'properly early exit on SIGTERM' by @Jeadie in #9573
  • Update datafusion to 52.2.0 by @phillipleblanc in #9582
  • Ensure we query one and only one partition per request by @Jeadie in #9416
  • feat: Add support for Spicepod version v2 by @lukekim in #9583
  • [SpiceDQ] Improve error messages; Avoid race condition on allocate_initial_partitions. by @Jeadie in #9579
  • Update ballista dependencies to latest 52.0.0 revision by @lukekim in #9581
  • Fix Databricks spark_connect mode always disabled by @phillipleblanc in #9586
  • Support partitioning in Arrow accelerator by @Jeadie in #9571
  • Fix spice query CLI response deserialization by @phillipleblanc in #9588
  • fix: Update benchmark snapshots by @app/github-actions in #9584
  • fix: Share RuntimeEnv across Cayenne read/write/delete paths for targeted list_files_cache invalidation by @sgrebnov in #9589
  • feat: Add file:// state_location support for async queries scheduler by @phillipleblanc in #9590
  • Update endgame links by @krinart in #9598
  • ci: fix E2E CLI upgrade test to use latest release for spiced download by @phillipleblanc in #9613
  • fix(DF): Lazily initialize BatchCoalescer in RepartitionExec to avoid schema type mismatch by @sgrebnov in #9623
  • feat: Implement catalog connectors for various databases by @lukekim in #9509
  • Refactor and clean up code across multiple crates by @lukekim in #9620
  • fix: Improve error handling for distributed mode and state_location configuration by @lukekim in #9611
  • Properly install postgres in install-postgres action by @krinart in #9629
  • fix: Use Python venv for schema validation in CI by @phillipleblanc in #9637
  • Update spicepod.schema.json by @app/github-actions in #9640
  • Update testoperator dispatch to use release/2.0 branch by @phillipleblanc in #9641
  • fix: Align CUDA asset names in Dockerfile and install tests with build output by @phillipleblanc in #9639
  • Fix expect test scripts in E2E Installation AI test by @sgrebnov in #9643
  • testoperator for partitioned arrow accelerator by @Jeadie in #9635
  • Remove default 1s refresh_check_interval from spidapter for hive datasets by @phillipleblanc in #9645
  • Fix scheduler panic and cancel race condition by @phillipleblanc in #9644
  • Align Spice.ai connector parameter names across catalog/data connectors by @lukekim in #9632
  • docs: update distribution details and add NAS support in release notes by @lukekim in #9650
  • Enable postgres-accel in CI builds for benchmarks by @sgrebnov in #9649
  • perf: Cache Turso metastore connection across operations by @penberg in #9646
  • Add 'scheduler_state_location' to spidapter by @Jeadie in #9655
  • Implement Cayenne S3 Express multi-zone live test with data validation by @lukekim in #9631
  • chore(spidapter): bump default memory limit from 8Gi to 32Gi by @phillipleblanc in #9661
  • perf: Use prepare_cached() in Turso and SQLite metastore backends by @penberg in #9662
  • Improve CDC cache invalidation by @krinart in #9651
  • Refactor Cayenne IDs to use UUIDv7 strings by @lukekim in #9667
  • fix: add liveness check for dead executors in partition routing by @Jeadie in #9657
  • fix(s3): Fix metadata column schema mismatches in projected queries by @sgrebnov in #9664
  • s3_metadata_columns tests: include test for location outside table prefix by @sgrebnov in #9676
  • docs: Update DuckDB, GCS, Git connector and Cayenne documentation by @lukekim in #9671
  • Add s3_url_style support for S3 connector URL addressing by @phillipleblanc in #9642
  • Consolidate E2E workflows and require WSL for Windows runtime by @lukekim in #9660
  • Upgrade to Rust v1.93.1 by @lukekim in #9669
  • Security fixes and improvements by @lukekim in #9666
  • feat(flight): add DoPut rows/bytes written metrics for DoPut ETL ingestion tracking by @phillipleblanc in #9663
  • Skip caching http error response + add response_headers by @krinart in #9670
  • refactor: Remove v1/evals functionality by @Jeadie in #9420
  • Make a test harness for Distributed Spice integration tests by @Jeadie in #9615
  • Enable on_zero_results: use_source for views by @krinart in #9699
  • fix(spidapter): Lower memory limit, passthrough AWS secrets, override flight URL by @peasee in #9704
  • Show an error on a shared acceleration file with snapshots enabled by @krinart in #9698
  • Fixes for anthropic by @Jeadie in #9707
  • Use max_partitions_per_executor in allocate_initial_partitions by @Jeadie in #9659
  • [SpiceDQ] Accelerations must have partition key by @Jeadie in #9711
  • Upgrade to Turso v0.5 by @lukekim in #9628
  • feat: Rename metadata columns to _location, _last_modified, _size by @phillipleblanc in #9712
  • fix: bump datafusion-ballista to fix BatchCoalescer schema mismatch panic by @phillipleblanc in #9716
  • fix: Ensure Cayenne respects target file size by @peasee in #9730
  • refactor: Make DDL preprocessing generic from Iceberg DDL processing by @peasee in #9731
  • [SpiceDQ] Distribute query of Cayenne Catalog to executors with data by @Jeadie in #9727
  • Properly set primary_keys/on_conflict for Cayenne tables by @krinart in #9739
  • Add executor resource and replica support to cloud app config by @ewgenius in #9734
  • feat: Support PARTITION BY in Cayenne Catalog table creation by @peasee in #9741
  • Update datafusion and related packages to version 52.3.0 by @lukekim in #9708
  • Route FlightSQL statement updates through QueryBuilder by @phillipleblanc in #9754
  • JSON file format improvements by @lukekim in #9743
  • [SpiceDQ] Partition Cayenne catalogs writes through to executors by @Jeadie in #9737
  • Update to DF v52.3.0 versions of datafusion & datafusion-tableproviders by @lukekim in #9756
  • Make S3 metadata column handling more robust by @sgrebnov in #9762
  • Fetch API keys from dedicated endpoint instead of apps response by @phillipleblanc in #9767
  • Update arrow-rs, datafusion-federation, and datafusion-table-providers dependencies by @phillipleblanc in #9769
  • Chunk metastore batch inserts to respect SQLite parameter limits by @phillipleblanc in #9770
  • Improve JSON SODA support by @lukekim in #9795
  • Add ADBC Data Connector by @lukekim in #9723
  • docs: Release Cayenne as RC by @peasee in #9766
  • cli[feat]: cloud mode to use region-specific endpoints by @lukekim in #9803
  • Include updated JSON formats in HTTPS connector by @lukekim in #9800
  • Flight DoPut: Partition-aware write-through forwarding by @Jeadie in #9759
  • Pass through authentication to ADBC connector by @lukekim in #9801
  • Move scheduler_state_location from adapter metadata to env var by @phillipleblanc in #9802
  • Fix Cayenne DoPut upsert returning stale data after 3+ writes by @phillipleblanc in #9806
  • Fix JSON column projection producing schema mismatch by @sgrebnov in #9811
  • Fix http connector by @krinart in #9818
  • Fix ADBC Connector build and test by @lukekim in #9813
  • Support update & delete DML for distributed cayenne catalog by @Jeadie in #9805
  • Set allow_http param when S3 endpoint uses http scheme by @phillipleblanc in #9834
  • fix: Cayenne Catalog DDL requires a connected executor in distributed mode by @Jeadie in #9838
  • fix: Add conditional put support for file:// scheduler state location by @Jeadie in #9842
  • fix: Require the DDL primary key contain the partition key by @Jeadie in #9844
  • fix: Databricks SQL Warehouse schema retrieval with INLINE disposition and async retry by @lukekim in #9846
  • Filter pushdown improvements for SqlTable by @lukekim in #9852
  • feat: add iam_role_source parameter for AWS credential configuration by @lukekim in #9854
  • Fix ODBC queries silently returning 0 rows on query failure by @lukekim in #9864
  • feat(adbc): Add ADBC catalog connector with schema/table discovery by @lukekim in #9865
  • Make Turso SQL unparsing more robust and fix date comparisons by @lukekim in #9871
  • Fix Flight/FlightSQL filter precedence and mutable query consistency by @lukekim in #9876
  • Partial Aggregation optimisation for FlightSQLExec by @lukekim in #9882
  • fix: v1/responses API preserves client instructions when system_prompt is set by @Jeadie in #9884
  • feat: emit scheduler_active_executors_count and use it in spidapter by @Jeadie in #9885
  • feat: Add custom auth header support for GraphQL connector by @krinart in #9899
  • Add --endpoint flag to spice run with scheme-based routing by @lukekim in #9903
  • When executor connects, send DDL for existing tables by @Jeadie in #9904
  • fix: Improve ADBC driver shutdown handling and error classification by @lukekim in #9905
  • fix: require all executors to succeed for distributed DML (DELETE/UPDATE) forwarding by @Jeadie in #9908
  • fix(cayenne catalog): fix catalog refresh race condition causing duplicate primary keys by @Jeadie in #9909
  • Remove Perplexity support by @Jeadie in #9910
  • Fix refresh_sql support for debezium constraints by @krinart in #9912
  • Implement DML for DynamoDBTableProvider by @lukekim in #9915
  • chore: Update iceberg-rust fork to v0.9 by @lukekim in #9917
  • Run physical optimizer on FallbackOnZeroResultsScanExec fallback plan by @sgrebnov in #9927
  • Improve Databricks error message when dataset has no columns by @sgrebnov in #9928
  • Delta Lake: fix data skipping for >= timestamp predicates by @sgrebnov in #9932
  • fix: Ensure distributed Cayenne DML inserts are forwarded to executors by @Jeadie in #9948
  • Add full query federation support for ADBC data connector by @lukekim in #9953
  • Make time_format deserialization case-insensitive by @claudespice in #9955
  • Hash ADBC join-pushdown context to prevent credential leaks in EXPLAIN plans by @lukekim in #9956
  • fix: Normalize Arrow Dictionary types for DuckDB and SQLite acceleration by @sgrebnov in #9959
  • ADBC BigQuery: Improve BigQuery dialect date/time and interval SQL generation by @lukekim in #9967
  • Make BigQueryDialect more robust and add BigQuery TPC-H benchmark support by @lukekim in #9969
  • fix: Show proper unauthorized error instead of misleading runtime unavailable by @lukekim in #9972
  • fix: Enforce target_chunk_size as hard maximum in chunking by @lukekim in #9973
  • Add caching retention by @krinart in #9984
  • fix: improve Databricks schema error detection and messages by @lukekim in #9987
  • fix: Set default S3 region for opendal operator and fix cayenne nextest by @phillipleblanc in #9995
  • fix(PostgreSQL): fix schema discovery for PostgreSQL partitioned tables by @sgrebnov in #9997
  • fix: Defer cache size check until after encoding for compressed results by @krinart in #10001
  • fix: Rewrite numeric BETWEEN to CAST(AS REAL) for Turso by @lukekim in #10003
  • fix: Handle integer time columns in append refresh for all accelerators by @sgrebnov in #10004
  • fix: preserve s3a:// scheme when building OpenDalStorageFactory with custom endpoint by @phillipleblanc in #10006
  • Fix ISO8601 time_format with Vortex/Cayenne append refresh by @sgrebnov in #10009
  • fix: Address data correctness bugs found in audit by @sgrebnov in #10015
  • fix(federation): fix SQL unparsing for Inexact filter pushdown with alias by @lukekim in #10017
  • Improve GitHub connector ref handling and resilience by @lukekim in #10023
  • feat: Add spice completions command for shell completion generation by @lukekim in #10024
  • fix: Fix data correctness bugs in DynamoDB decimal conversion and GraphQL pagination by @sgrebnov in #10054
  • Implement RefreshDataset for distributed control stream by @Jeadie in #10055
  • perf: Improve S3 parquet read performance by @sgrebnov in #10064
  • fix: Prevent write-through stalls and preserve PartitionTableProvider during catalog refresh by @Jeadie in #10066
  • feat: spice completions auto-detects shell directory and writes file by @lukekim in #10068
  • fix: Bug in DynamoDB, GraphQL, and ISO8601 refresh data handling by @sgrebnov in #10063
  • fix partial aggregation deduplication on string checking by @lukekim in #10078
  • fix: add MetastoreTransaction support to prevent concurrent transaction conflicts by @phillipleblanc in #10080
  • fix: Use GreedyMemoryPool, add spidapter query memory limit arg by @phillipleblanc in #10082
  • feat: Add metrics for EXPLAIN ANALYZE in FlightSQLExec by @lukekim in #10084
  • Use strict cast in try_cast_to to error on overflow instead of silent NULL by @sgrebnov in #10104
  • feat: Implement MERGE INTO for Cayenne catalog tables by @peasee in #10105
  • feat: Add distributed MERGE INTO support for Cayenne catalog tables by @peasee in #10106
  • Improve JSON format auto-detection for single multi-line objects by @lukekim in #10107
  • Add mode: file_update acceleration mode by @krinart in #10108
  • Coerce unsupported Arrow types to Iceberg v2 equivalents in REST catalog API by @peasee in #10109
  • fix: Update default query memory limit to 90% from 70% by @phillipleblanc in #10112
  • feat: Add mTLS client auth support to spice sql REPL by @lukekim in #10113
  • fix(datafusion-federation): report error on overflow instead of silent NULL by @sgrebnov in #10124
  • fix: Prevent data loss in MERGE when source has duplicate keys by @peasee in #10126
  • feat: Add ClickHouse Date32 type support by @sgrebnov in #10132
  • Add Delta Lake column mapping support (Name/Id modes) by @sgrebnov in #10134
  • fix: Restore Turso numeric BETWEEN rewrite lost in DML revert by @lukekim in #10139
  • fix: Enable arm64 Linux builds with fp16 and lld workarounds by @lukekim in #10142
  • fix: remove double trailing slash in Unity Catalog storage locations by @sgrebnov in #10147
  • fix: Improve GitHub GraphQL client resilience and performance by @lukekim in #10151
  • Enable reqwest compression and optimize HTTP client settings by @lukekim in #10154
  • fix: executor startup failures by @Jeadie in #10155
  • feat: Distributed runtime.task_history support by @Jeadie in #10156
  • fix: Preserve timestamp timezone in DDL forwarding to executors by @peasee in #10159
  • feat: Per-model rate-limited concurrent AI UDF execution by @Jeadie in #10160
  • fix(Turso): Reject subquery/outer-ref filter pushdown in Turso provider by @lukekim in #10174
  • Fix linux/macos spice upgrade by @phillipleblanc in #10194
  • Improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation by @peasee in #10203
  • fix: chunk MERGE delete filters and update Vortex for stack-safe IN-lists by @peasee in #10207
  • Propagate runtime.params.parquet_page_index to Delta Lake connector by @sgrebnov in #10209
  • Properly mark dataset as Ready on Scheduler by @Jeadie in #10215
  • fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters by @lukekim in #10217
  • fix(databricks): Fix schema introspection and timestamp overflow by @lukekim in #10226
  • fix(databricks): Fix schema introspection failures for non-Unity-Catalog environments by @lukekim in #10227
  • feat: Add pagination support to HTTP data connector by @lukekim in #10228
  • feat(databricks): DESCRIBE TABLE fallback and source-native type parsing for Lakehouse Federation by @lukekim in #10229
  • fix(databricks): harden HTTP retries, compression, and token refresh by @lukekim in #10232
  • feat[helm chart]: Add support for ServiceAccount annotations and AWS IRSA example by @peasee in #9833
  • fix: Log warning and fall back gracefully on Cayenne config change by @krinart in #9092
  • fix: Handle engine mismatch gracefully in snapshot fallback loop by @krinart in #9187
  • fix: Full Text Search schema mismatch with ADBC connector by @lukekim in #10235
  • docs: Update v2.0.0-rc.2 release notes with latest changes by @lukekim in #10238
  • Fix append refresh dedup failure when refresh_sql selects column subset by @sgrebnov in #10225
  • Revert "Properly mark dataset as Ready on Scheduler (#10215)" by @sgrebnov in #10242
  • Fix failing merge conflicts for benchmarks by @krinart in #10247
  • fix(github): fetch commits for dynamic and slash refs by @lukekim in #10233
  • Upgrade DataFusion to v52.5.0-rc1 by @lukekim in #10249
  • Merge develop to trunk (2026-04-09) by @claudespice in #10248
  • fix: Validate embedding row_id columns during dataset init (fixes #8226) by @claudespice in #10208
  • fix: Update tpch benchmark snapshots for federated/glue[csv].yaml by @app/github-actions in #10244
  • feat(databricks): add resilience controls, UC awareness, and task history instrumentation by @lukekim in #10246
  • fix: Make PartitionManager resilient to bare vs fully qualified table references by @sgrebnov in #10257
  • fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-cayenne[file].yaml by @app/github-actions in #10256
  • Merge develop to trunk (2026-04-10) by @claudespice in #10251
  • Improve Snowflake/ADBC dataset registration performance and observability by @lukekim in #10266
  • Fixes for kafka connector by @krinart in #10263
  • fix(runtime): gate otel code tags, suppress aws sdk noise, and unblock connector init by @lukekim in #10260
  • fix(runtime): avoid regionless AWS SDK loads by @lukekim in #10271
  • Add versioned release install workflow coverage by @lukekim in #10276
  • fix(runtime): handle HTTP JSON unions and spicepod reloads by @lukekim in #10277
  • Databricks UC permission prechecks: explicit denial as permanent error, ambiguous cases advisory by @lukekim in #10274
  • Revert component status changes re-introduced by develop merge (#10248) by @sgrebnov in #10293
  • Fix broken CI workflows by @ewgenius in #10294
  • Group dependabot updates by ecosystem by @lukekim in #10296
  • fix(tests): Replace flaky S3 Vectors snapshot tests with structural validation by @lukekim in #10301
  • Update test_github_workflows snapshot by @lukekim in #10304
  • fix(ci): fix Bedrock runner mismatch and snapshot auto-merge failure by @ewgenius in #10306
  • feat(http): Add map-to-array conversion and query-parameter pagination by @lukekim in #10295
  • New crate: datafusion-ddl by @Jeadie in #10205
  • Make Databricks UC permission checks advisory with structured error reporting by @lukekim in #10283
  • build(deps): bump the github-actions-dependencies group with 4 updates by @app/dependabot in #10298
  • fix: Clear cached plans on view updates by @peasee in #10312
  • build(deps): bump the aws-sdk group with 7 updates by @app/dependabot in #10299
  • Code out of runtime. by @Jeadie in #10178
  • fix: Respect function registry denies for accelerated table filter pushdown by @peasee in #10311
  • fix: Don't block heartbeat when all slots acquired by @peasee in #10322
  • fix: strip only outer parens in get_table_partition_expr_from_ctx by @Jeadie in #10323
  • Upgrade datafusion-table-providers with MongoDB SRV support by @lukekim in #10317
  • fix: Avoid pushing down bucketing partition expressions into executors by @peasee in #10324
  • Upgrade datafusion-table-providers to d1b911a5 and bump adbc to 0.23 by @lukekim in #10329
  • fix: Update Search integration test snapshots by @app/github-actions in #10308
  • Handle foreign table + Classic sql warehouse combination gracefully by @krinart in #10318
  • New crate datafusion-flightsql by @Jeadie in #10201
  • Set tantivy=warn unless very verbose logging by @Jeadie in #10338
  • Remove image registry and image name options from spidapter by @ewgenius in #10241
  • build(deps): bump sysinfo from 0.37.2 to 0.38.4 by @app/dependabot in #10291
  • build(deps): bump futures from 0.3.31 to 0.3.32 by @app/dependabot in #10289
  • New crate 'datafusion-dml' by @Jeadie in #10334
  • Jeadie/26 04 16/spice sql by @Jeadie in #10343
  • Add Teraswitch/Pittsburgh apt mirrors + retry config for CI runners by @lukekim in #10349
  • Implement sort pushdown and fix pushdown gaps across providers by @lukekim in #10337
  • Merge develop to trunk (2026-04-16) by @claudespice in #10345
  • Update candle and mistral.rs lock-step pins by @lukekim in #10278
  • docs: fix status badges in README by @lukekim in #10350
  • Migrate secrets to vars by @krinart in #10354
  • Add limit pushdown and improve sort pushdown for Oracle and MSSQL by @sgrebnov in #10351
  • Fix ubuntu mirror configuration by @ewgenius in #10359
  • fix: Increase throughput test default ready_wait from 30s to 300s (fixes #8207) by @claudespice in #10344
  • Add auth headers support to OTEL metrics exporter by @lukekim in #10347
  • fix(github): shrink GraphQL page size on gateway errors; lower comment defaults by @lukekim in #10355
  • Relax apt mirror substitution failure to warning in CI action by @ewgenius in #10361
  • feat(http): Add OAuth2 refresh-token auth to HTTP connector by @lukekim in #10348
  • Upgrade Rust toolchain to 1.94.1 by @lukekim in #10353
  • Handle order by and sort in PartitionedTableScanRewrite by @Jeadie in #9656
  • Fix OTEL Exporter by @krinart in #10363
  • Pin spiceai candle / TEI forks to merged revs; drop local [patch] overrides by @lukekim in #10362
  • Integrate spiceio and makefile_targets into pr.yml by @lukekim in #10357
  • ci: skip artifact compression for test binaries/archives by @lukekim in #10381
  • chore(deps): bump spiceai/candle, spiceai/mistral.rs, aws-lc-rs, tantivy, rand by @lukekim in #10379
  • Bump datafusion-table-providers (#10375) by @lukekim in #10384
  • fix: Update Search integration test snapshots by @app/github-actions in #10376
  • v2.0.0-rc.3 preparation by @ewgenius in #10382
  • fix(spicepod): JSON schema accepts string or {name: expr} for partition_by by @lukekim in #10352
  • fix: Use ROUND for Turso decimal BETWEEN comparisons (fixes #9872) by @claudespice in #10360
  • Revert "v2.0.0-rc.3 preparation" from trunk by @ewgenius in #10386
  • Add on_schema_resolved dataset ready state by @lukekim in #10368
  • feat: Add Elasticsearch data connector with hybrid search support by @lukekim in #10258
  • ci: bump test archive upload compression-level to 1 by @lukekim in #10388
  • feat(git-connector): promote Git connector to RC status by @lukekim in #10385
  • feat(postgres): stream WAL directly to Spice accelerators by @lukekim in #10364
  • Add schema decomposition to the HTTP connector by @lukekim in #10393
  • fix(cayenne): Skip catalog refresh state reload for existing providers by @sgrebnov in #10396
  • Make cayenne-flightsql tool by @Jeadie in #10356
  • build(deps): bump the github-actions-dependencies group with 2 updates by @app/dependabot in #10398
  • Update openapi.json by @app/github-actions in #10272
  • Merge develop to trunk - 2026-04-19 by @claudespice in #10407
  • feat(otel): default OTLP push exporter to delta temporality by @phillipleblanc in #10412
  • fix: Restore analyzer rule ordering to run federation before type coercion by @sgrebnov in #10415
  • fix: Map Utf8/LargeUtf8 to STRING in Databricks/Spark SQL dialects by @sgrebnov in #10420
  • feat(otel): add metric name prefix at runtime.telemetry.metric_prefix by @phillipleblanc in #10418
  • fix: Map LargeUtf8 to VARCHAR in Athena ODBC dialect by @sgrebnov in #10419
  • feat(cluster): connector-driven object store registration on executors by @phillipleblanc in #10414
  • build(deps): bump ubuntu from 22.04 to 24.04 in the docker-dependencies group by @app/dependabot in #10397
  • fix: Update benchmark snapshots Apr 20 by @app/github-actions in #10417
  • feat(otel): apply runtime.telemetry.properties as resource attributes on exported metrics by @phillipleblanc in #10416
  • Publish RC releases to DockerHub; upgrade runners to ubuntu-24.04 by @lukekim in #10428
  • feat: Add Azure Cosmos DB (NoSQL) data connector (RC) by @lukekim in #10392
  • feat(datafusion): flatten_json_properties + json_tree UDTFs by @lukekim in #10406
  • Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL by @lukekim in #10365
  • feat(embeddings): multi-vector embeddings with MaxSim + late-interaction by @lukekim in #10408
  • Update GH runners for CUDA builds by @ewgenius in #10432
  • fix(delta_lake): register object stores on cluster executors by @phillipleblanc in #10436
  • DF-native DML by @krinart in #10327
  • ci: run Build and Test on spiceai-macos; split install jobs by profile by @lukekim in #10434
  • Improve search UDTFs: text_search, vector_search, rrf by @lukekim in #10387
  • fix(model2vec): Improve robustness of model loading for sentence-transformers layouts by @sgrebnov in #10444
  • Merge develop to trunk - 2026-04-21 by @claudespice in #10448
  • Enable filter pushdown for vector_search UDTF by @sgrebnov in #10447
  • Support Snowflake OBJECT, MAP, GEOGRAPHY, GEOMETRY, VECTOR, TIMESTAMP_LTZ types by @lukekim in #10451
  • Fix Databricks tests by @krinart in #10449
  • fix(cluster): forward register_object_stores through connector wrappers by @phillipleblanc in #10460
  • Fixes for vector-search by @krinart in #10455
  • Add expand_maps option and flatten_json UDTF by @lukekim in #10452
  • fix: Update Search integration test snapshots by @app/github-actions in #10458
  • Fix physical codec decode ambiguity for empty protobuf messages by @sgrebnov in #10466
  • chore(logging): demote s3_single_file_cached skip refresh log to debug by @phillipleblanc in #10467
  • Enable filter pushdown for rrf UDTF by @sgrebnov in #10465
  • feat(cluster): consolidate distributed state into cluster.json by @phillipleblanc in #10463
  • feat(cayenne): Add column statistics and data inlining by @lukekim in #10314
  • docs(copilot): flag missing wrapper delegation when adding default trait methods by @phillipleblanc in #10461
  • Wire Elasticsearch vector engine write path through acceleration by @lukekim in #10453
  • Add helm lint CI by @ewgenius in #10468
  • Fix Azure and GCS acceleration snapshot object store credential handling by @phillipleblanc in #10486
  • Update spicepod.schema.json by @app/github-actions in #10485
  • fix(secrets): harden AWS Secrets Manager secret store by @lukekim in #10478
  • Update datafusion-ballista crate by @sgrebnov in #10488
  • feat(secrets): add ParameterSpec and more params for AWS secrets manager by @phillipleblanc in #10487
  • Add rerank UDTF for hybrid search with query auto-propagation by @lukekim in #10469
  • Fix flatten_json_properties by @krinart in #10475
  • fix: preserve field and schema metadata in expand_views_schema by @claudespice in #10494
  • Upgrade rmcp to upstream 1.5.0; switch MCP server to Streamable HTTP by @lukekim in #10491
  • fix: handle Snowflake TIMESTAMP_LTZ wire format and prevent nanosecond overflow by @claudespice in #10493
  • Lint parity in Makefile by @krinart in #10492
  • Add connect_timeout/client_timeout params to Databricks sql_warehouse mode by @lukekim in #10495
  • fix(tracing): suppress opentelemetry INFO logs at all verbosity levels by @lukekim in #10497
  • DynamoDB DML by @krinart in #10470
  • feat(cayenne): native vector search via SIMD similarity UDFs by @lukekim in #10456
  • fix(cli): suppress banner for all JSON-producing cloud subcommands (fixes #10498) by @claudespice in #10510
  • fix(deps): bump openssl to 0.10.78 by @phillipleblanc in #10509
  • fix(s3): quiet AWS SDK credential probe when no region is configured by @phillipleblanc in #10506
  • fix(cdc): emit ready signal on caught-up Kafka/Debezium streams (#5201) by @phillipleblanc in #10504
  • runtime-cluster crate + Run partition discovery before forwarding refresh to executors by @krinart in #10490
  • Update lint-rust target to use --keep-going by @Jeadie in #10508
  • Add TPC-H SF100 s3[parquet]-duckdb[file] benchmark spicepod by @lukekim in #10524
  • Remove dev-profile install steps from pr.yml by @Jeadie in #10507
  • fix: add missing NULL check on Timestamp path in append refresh by @claudespice in #10518
  • fix: return error on Decimal128/256 overflow instead of silently dropping scale by @claudespice in #10519
  • fix: delegate update and delete_from in IndexedTableProvider and EmbeddingTable by @claudespice in #10520
  • feat(devx): make config errors, CLI, and REPL lead users to success by @lukekim in #10489
  • fix(rerank): defer execution to RerankExec, enable filters and projection pushdown by @sgrebnov in #10514
  • fix(llms): support Gemma models with missing attention_bias config field by @lukekim in #10523
  • Fix vector_search silently ignoring named limit/column/include_score args by @sgrebnov in #10527
  • fix: split unsupported filters locally in scan() for UseSource mode by @ewgenius in #10528
  • feat(secrets): add Azure Key Vault secret store by @lukekim in #10496
  • Bump mistralrs by @krinart in #10532
  • Fix benchmark configurations and CI build issues by @sgrebnov in #10535
  • Fix catalog query overrides for MySQL and MSSQL benchmarks by @sgrebnov in #10543
  • For Cayenne, preserve matched columns for MERGE ... ON <cols> by @Jeadie in #10340
  • build(deps): bump the aws-sdk group across 1 directory with 5 updates by @app/dependabot in #10538
  • docs: update AI agent instructions (git workflow + Rust 1.94) by @lukekim in #10544
  • fix: Update tpch benchmark snapshots by @app/github-actions in #10529
  • fix: Update tpch benchmark snapshots for accelerated/s3[parquet]-duckdb[file].yaml by @app/github-actions in #10525
  • Extract runtime-datafusion from runtime by @krinart in #10545
  • Use generic DML extension planner for Cayenne by @Jeadie in #10437
  • fix: Update Search integration test snapshots by @app/github-actions in #10552
  • Fix security and correctness audit issues by @lukekim in #10526
  • fix(MySQL): revert MySQL result column reorder to fix federated query failures by @sgrebnov in #10557
  • Fix protoc installation by @krinart in #10566
  • fix: Disable Ballista dynamic filters on HashJoinExec by @peasee in #10548
  • Support views on DDL catalogs by @Jeadie in #10554
  • Update datafusion by @Jeadie in #10422
  • Improve full-text search indexing performance by @sgrebnov in #10464
  • feat(mysql): add mysql_zero_date_behavior parameter (null|error) by @phillipleblanc in #10573
  • fix(snowflake): declare private_key in connector PARAMETERS (fixes #10517) by @claudespice in #10559
  • Honour CARGO_TARGET_DIR in Makefiles by @Jeadie in #10569
  • Enable cosine_distance pushdown to DuckDB accelerator via array_cosine_distance by @sgrebnov in #10564
  • fix: Update test snapshots by @app/github-actions in #10570
  • fix: Update tpch benchmark snapshots by @app/github-actions in #10560
  • feat(snapshots): make snapshots an optional feature by @phillipleblanc in #10574
  • Enforce read-only API key restrictions on Flight DoGet and async query paths by @Jeadie in #10551
  • Improved security posture on Github workflows by @Jeadie in #10556
  • fix: Update datafusion-table-providers to improve SqlTable filter pushdown by @sgrebnov in #10595
  • feat(secrets): add HashiCorp Vault secret store by @phillipleblanc in #10561
  • fix: delegate update() in UpsertDedupTableProvider to inner provider by @claudespice in #10593
  • Add DuckDB vector engine support by @lukekim in #10562
  • Sharepoint - add object-store listing connector with expanded auth and write support by @lukekim in #10473
  • fix: Install protoc from source by @peasee in #10597
  • Enable DML support for PostgreSQL data connector by @phillipleblanc in #10446
  • feat(postgres): support inline PEM sslrootcert by @claudespice in #10578
  • Add foreign key metadata discovery to PostgreSQL Catalog by @sgrebnov in #10849
  • Add Snowflake DML support by @lukekim in #10747
  • Add MongoDB Change Streams support by @lukekim in #10813
  • Add user-defined functions by @lukekim in #10571
  • Add table user functions and gate HTTP servers by @lukekim in #10675
  • feat: add on-demand dataset loading by @phillipleblanc in #10629
  • feat(runtime): declared-schema deferred datasets by @phillipleblanc in #10669
  • feat(spicepod, runtime): add columns[].type / nullable + lenient type parser by @phillipleblanc in #10661
  • Replace external smb crate with internal SMB 3.1.1 client by @phillipleblanc in #10516
  • Add unified query cancellation across all paths by @lukekim in #10390
  • Add dynamic HTTP request headers by @lukekim in #10604
  • feat(http): Support dynamic HTTP connector request params from subqueries by @lukekim in #10636
  • feat(http): pass through HTTP metadata columns with JSON schema decomposition by @lukekim in #10679
  • Add nolimit HTTP pagination max pages by @lukekim in #10673
  • Add shared HTTP rate control for connectors by @lukekim in #10648
  • Use origin label instead of name for HTTP rate control metrics by @lukekim in #10689
  • fix(http): reject OR across different HTTP filter columns by @lukekim in #10625
  • Add provider-aware LLM prompt caching by @lukekim in #10645
  • Add searchable registry mode for LLM tools by @lukekim in #10647
  • feat: refresh_mode: snapshot + SQLite/Turso WAL flush + Cayenne metastore slice by @phillipleblanc in #10651
  • feat: per-principal cache namespacing for SQL/search/caching-accelerator by @lukekim in #10702
  • Add self-hosted Spice connector support by @phillipleblanc in #10546
  • Add Delta Lake Azure tenant parameter by @phillipleblanc in #10671
  • Support OAuth2 client credentials in 'spice cloud login' by @ewgenius in #10586
  • Add configurable allowed_hosts for MCP by @lukekim in #10638
  • fix: make Helm chart probes configurable by @peasee in #10696
  • Strip high-cardinality datasets dim from anonymous telemetry by @lukekim in #10711
  • feat(elasticsearch): direct FTS engine config + index lifecycle and ingestion controls by @lukekim in #10672
  • Add DuckDB HNSW vector index support for accelerated views by @sgrebnov in #10695
  • Rewrite DuckDB vector search SQL to activate HNSW_INDEX_SCAN by @sgrebnov in #10674
  • Fix DuckDB HNSW vector indexes lost after data refresh by @sgrebnov in #10668
  • Fix DuckDB DELETE/UPDATE on full and caching refresh mode datasets by @phillipleblanc in #10632
  • Fix DuckLake connector: downcast, module registration, schema discovery, and S3 credentials by @sgrebnov in #10650
  • Fix federation pushing denied functions inside subqueries to remote engines by @phillipleblanc in #10692
  • fix(caching): honour refresh_on_startup: always in caching mode by @phillipleblanc in #10594
  • fix(iceberg): rebuild storage factory when Hadoop catalog scheme is inferred by @sgrebnov in #10601
  • Pipeline CDC ingestion: overlap source reads with batch apply by @lukekim in #10676
  • fix: add NULL check to CDC primary key extraction by @lukekim in #10684
  • Properly handle nullability during CDC processing by @krinart in #10803
  • Flatten scheduler config and rename partition management → partition assignment by @lukekim in #10450
  • Improve NSQL UX and harden internal LLM tools by @lukekim in #10715
  • Support Responses API across model providers by @lukekim in #10724
  • Update xAI default model and handle Grok model retirements by @Jeadie in #10723
  • Improve cli table layout by @krinart in #10725
  • TLS cert hot-reload (mTLS plan M1) by @phillipleblanc in #10727
  • Fix DuckLake catalog include filter being ignored by @phillipleblanc in #10738
  • Promote DuckLake Catalog and Data Connector to Beta quality by @sgrebnov in #10743
  • feat(ducklake): Support INSERT on catalog tables with read_write access by @sgrebnov in #10744
  • perf(cdc): coalesce envelopes and overlap commits in apply pipeline by @lukekim in #10745
  • feat: Allow full version tags in spicepod version by @peasee in #10748
  • Add Arrow primary key upserts by @lukekim in #10749
  • fix(snapshot): keep refresh_mode snapshot read-only by @phillipleblanc in #10752
  • feat(tls): public mTLS for HTTP and Flight (channel + identity modes) by @phillipleblanc in #10753
  • perf(cayenne): lock-free deletion caches with bloom-prefiltered probe by @lukekim in #10756
  • fix(security): close API key timing-position leak and remote-UDF SSRF by @lukekim in #10757
  • Fix 'wait_until_dependent_tables_are_ready' for catalogs by @phillipleblanc in #10758
  • Fixes for views and resolved tables on 'spice refresh' CLI by @phillipleblanc in #10759
  • Implement FlightSQL CommandStatementSubstraitPlan support by @lukekim in #10761
  • feat(connectors): mTLS client cert support for flightsql and spiceai connectors by @phillipleblanc in #10764
  • Allow arbitrary filenames when specifying spicepod path + kind validation by @krinart in #10777
  • fix: ignore field metadata in schema compatibility check in index_table_scan by @Jeadie in #10778
  • Display pushed-down limits in EXPLAIN TREE output by @lukekim in #10779
  • fix: enable streaming append for Kafka with Cayenne accelerator by @lukekim in #10780
  • fix: bound chunked-index intermediate batch size to prevent OOM by @phillipleblanc in #10783
  • fix: label all columns in spice cloud metrics table output by @claudespice in #10784
  • fix: use checked arithmetic for Turso integer-millis timestamp read path by @claudespice in #10786
  • fix: use checked arithmetic in timestamp-to-nanosecond conversions by @claudespice in #10666
  • Upgrade to DuckDB v1.5.2 by @sgrebnov in #10788
  • Improve CDC ingestion performance by @lukekim in #10789
  • Fix tool_search/tool_invoke spans by @lukekim in #10791
  • Add Cayenne inline mutations and benchmark coverage by @lukekim in #10792
  • Ensure we always resolve table names in distributed mode/metadata by @Jeadie in #10793
  • Remove permanent errors from DynamoDB Streams by @krinart in #10794
  • Add expanded view mode for wide table display in SQL REPL by @lukekim in #10797
  • Fix Cayenne CDC schema mismatch error by @sgrebnov in #10800
  • Executors should create catalog tables on join by @Jeadie in #10807
  • Add compressed file support for listing connectors by @lukekim in #10809
  • Improve Cayenne mutation, scan, and inline memtable scaling by @lukekim in #10811
  • Add range fallback for large join filters by @lukekim in #10816
  • Improve Cayenne join filter pushdown by @lukekim in #10818
  • Synchronize Cayenne partition commits across partitions by @phillipleblanc in #10819
  • fix: Deny nondistributed cayenne catalog by @peasee in #10821
  • Enable parallel Cayenne Vortex writes by @lukekim in #10822
  • Expand Arrow type handling in formatting and Elasticsearch by @lukekim in #10825
  • Add response.output_text.delta to responses API by @krinart in #10828
  • feat(cayenne): add join filter propagation and no-spill Q21 planning by @lukekim in #10840
  • Upgrade Turso to v0.6.0 by @sgrebnov in #10843
  • feat(cli): add spice feedback command to open community Slack by @lukekim in #10856
  • Upgrade iceberg to v0.9.1 by @sgrebnov in #10859
  • feat(cluster): per-request executor readiness gate on /v1/ready by @phillipleblanc in #10860
  • fix: Require dim-side statistics for CayennePropagateFilterAcrossEquiJoinKeys by @sgrebnov in #10863
  • fix: Debezium schema evolution breaks dataset init on reload by @claudespice in #10144
  • fix(mssql): Push topK limit to SQL Server for non-nullable sort columns by @Jeadie in #10621
  • fix(ScyllaDB): disable physical filter pushdown by @sgrebnov in #10772
  • fix: handle typed NULLs and prevent overflow in DynamoDB DML type conversions by @krinart in #10511
  • fix: use InsertOp::Overwrite in DynamoDB bootstrap scan_and_overwrite_accelerator by @krinart in #10639
  • Improve DynamoDB Bootstrap performance by @krinart in #10616
  • fix: preserve field and schema metadata in Vortex type transformation by @lukekim in #10628
  • fix: GH connector - explicitly use AWS LC RS crypto provider for jwt by @phillipleblanc in #10619
  • fix: add snapshot mode guards to delete_from/update and delegate DML in SwappableTableProvider by @phillipleblanc in #10685
  • Persist HTTP rate-control state in object storage by @lukekim in #10697
  • Rate limit metrics HTTP endpoint by @lukekim in #10162
  • feat(geo): add optional spatial SQL UDF support by @lukekim in #10833
  • feat(cayenne): CDC throughput, compaction, scan caching, and benchmarks by @lukekim in #10852
  • fix(cayenne): fix Vortex panic on highly compressible data by @sgrebnov in #10855
  • fix(cayenne): Read live protected snapshots after cleanup grace period by @sgrebnov in #10901
  • fix: Disable Cayenne HashJoin rewriter optimizer by @sgrebnov in #10882
  • Fix GetFlightInfo vs DoGet Flight Schema by @krinart in #10864
  • fix(search): preserve column casing in /v1/search primary key plumbing by @claudespice in #10909
  • fix(object-store): dedupe s3 url style auto-detection log by @phillipleblanc in #10898
  • Improve Spice CLI manifest editing and direct command modes by @lukekim in #10815
  • Persist Kafka CDC offsets in sidecar tables by @lukekim in #10823
  • feat(task-history): record Ballista stages for distributed queries by @phillipleblanc in #10831
  • Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls by @Jeadie in #10795
  • Optimize Cayenne catalog maintenance paths by @lukekim in #10904
  • Centralize DuckDB settings for accelerator by @ewgenius in #10895
  • deps(ballista): bump to 47e2b494 to fix S3 shuffle reads under cluster mode by @phillipleblanc in #10910
  • Authorization header + Bump async-openai + responses_adapter fix by @krinart in #10911
  • Tune accelerators by storage profile by @lukekim in #10913
  • feat: add dataset-level on_schema_change config by @lukekim in #10908
  • Handle NULL sentinel for nullable partition expressions by @Jeadie in #10880
  • fix: Remove Cayenne Catalog from catalog registration by @peasee in #10914
  • Add catalog name to foreign key metadata in postgres catalog by @Jeadie in #10917
  • Cayenne perf: eliminate redundant clones, PK point-lookup fanout fix, IN-list rewrite + microbench coverage by @lukekim in #10916
  • fix(turso-shared): retry on Turso BEGIN CONCURRENT "Write-write conflict" by @lukekim in #10946
  • Vendor Vortex DataFusion for Cayenne by @lukekim in #10933
  • perf(cayenne): background retention + enable CDC pipelining for retention-configured tables by @lukekim in #10936
  • feat(cayenne): scale metastore pool to 32 + vs_duckdb_scaling benches (1→128 concurrency, sqlite + turso lanes) by @lukekim in #10943
  • feat(mcp): support auth for streamable HTTP tools by @phillipleblanc in #10927
  • Explicit error if v1/search requests a table without search index by @Jeadie in #10968
  • Fix spicepod loading failure when directory name contains dots by @sgrebnov in #10958
  • Extend append tests with arrow engine configurations by @sgrebnov in #10959
  • Remove dataset on_schema_change Policy from rc.5 release notes by @sgrebnov in #10964
  • Skip tpcds_q78 for Cayenne engine at SF100 by @sgrebnov in #10966
  • fix: Update benchmark snapshots May-20 by @app/github-actions in #10952
  • Fix #10951: UdtfExec invariant Vec lengths must match children count by @phillipleblanc in #10953
  • docs(release): update v2.0.0-rc.5 notes with latest trunk PRs by @lukekim in #10949
  • Remove eval related things for v2.0.0 by @Jeadie in #10945
  • build(deps): bump ubuntu from 24.04 to 26.04 in the docker-dependencies group by @app/dependabot in #10883
  • fix: Add publish = false to chbench-driver by @sgrebnov in #10939
  • [Bug] Timing between reconnect and AllocateInitialPartitions leaves connection without flight_sql_client by @Jeadie in #10805
  • Fix: refresh_mode: snapshot reports Ready with empty data when no snapshot exists by @sgrebnov in #10979
  • fix(cluster): gate scheduler readiness on executor partition loads by @phillipleblanc in #10992
  • fix: handle EXISTS/NOT EXISTS subqueries in federation analyzer by @sgrebnov in #10996
  • Refactor spice dataset configuration command by @Jeadie in #10999
  • fix: preserve field and schema metadata in Vortex physical schema calculation by @claudespice in #11013
  • fix: validate Snowflake account identifiers and auth config by @Jeadie in #11024
  • Fix Unity Catalog connector deserialization failure with OSS Unity Catalog by @ewgenius in #11026
  • feat(cayenne): allow inline writes with pending deletions (deletes/upserts) by @sgrebnov in #11031
  • Expose metadata descriptions via PostgreSQL UDFs by @lukekim in #11032
  • Remove default runtime features - enable explicitly in spiced by @phillipleblanc in #11037
  • feat(cayenne): fast-path CDC deletes by extracting PK values from filters by @sgrebnov in #11049
  • Cayenne optimizer rules: auto relevance test for q21-shape (all-Cayenne CH-Bench) and runtime rule selection by @lukekim in #11050
  • refactor(cdc): reduce CDC sub-batch splits for interleaved upsert/delete workloads by @sgrebnov in #11051
  • fix(snowflake): enforce function deny-list in federation pushdown by @claudespice in #11057
  • fix(mcp): trace external server tool calls in task history by @ewgenius in #11058
  • perf(cdc): Last-write-wins dedup in group_into_sub_batches to reduce sub-batch splits by @sgrebnov in #11059
  • PM edits to v2.0.0-rc5 by @lukekim in #11067
  • fix(snowflake): wire deny-list in extracted connector crate (#10703) by @claudespice in #11071
  • perf(cayenne): keep CDC upsert PK keysets resident to avoid per-batch full-table rebuilds by @lukekim in #11074
  • Fix metadata on search indexing by @Jeadie in #11080
  • feat(cayenne): merge-on-read position deletes for PK upsert tables + memory-pool accounting by @lukekim in #11085
  • perf(cayenne): scale CDC inline flush caps with memory + storage class by @lukekim in #11087
  • feat(cluster): report per-executor table statistics so distributed JoinSelection can size joins by @phillipleblanc in #11089
  • Improve Cayenne CDC write and compaction path tracing by @sgrebnov in #11091
  • Support tuple-IN composite PK extraction in Cayenne delete fast-path by @sgrebnov in #11093
  • feat(cluster): NDV-aware executor stats so CDC q18 join swap fires by @phillipleblanc in #11098
  • feat(cayenne): maintain join-sizing stats on the write path by @phillipleblanc in #11104
  • fix(cache): run periodic moka maintenance for idle caches by @phillipleblanc in #11106
  • Upgrade to DuckDB 1.5.3 + statically link the VSS (HNSW) extension by @sgrebnov in #11107
  • Fix fetched_at for HTTP connector by @Jeadie in #11116
  • fix(cayenne): tombstone inline-checkpointed rows on upsert to prevent duplicate PKs by @sgrebnov in #11129
  • feat: dedicated compaction runtime for Cayenne + CDC pipelining, protected snapshots, and test coverage by @lukekim in #11130
  • Add datasets dimension to the query_executions metric by @phillipleblanc in #11138
  • Fix #11137: localpod child not tracking parent refreshes with in-memory (arrow) parent accelerator by @phillipleblanc in #11139
  • Fix Windows build: vendor the VSS extension (drop nested submodule) by @phillipleblanc in #11140
  • fix(spiceai): keep correlated subqueries out of JOIN ON for Spice Cloud federation by @phillipleblanc in #11143
  • feat(cayenne): sharded parallel Vortex encode with key/time clustering by @lukekim in #11144
  • fix(cluster): prevent DoPut write pipeline self-deadlock under ingest backpressure by @phillipleblanc in #11160
  • fix(cayenne): only warn on genuine protected-snapshot amplification by @lukekim in #11158

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.11.6...v2.0.0

Spice v2.0-rc.5 (May 27, 2026)

· 30 min read
Jack Eadie
Token Plumber at Spice AI

Spice v2.0-rc.5 is now available! 🔥

v2.0.0-rc.5 is the fifth release candidate for advanced testing of v2.0, building on v2.0.0-rc.4.

This release completes the mTLS implementation across server endpoints and outbound connectors, adds MongoDB Change Streams as a native CDC source and durable offset persistence for Kafka CDC, expands DML write-back to PostgreSQL, Snowflake, and Arrow, and promotes DuckLake to Beta. It also introduces user-defined functions, on-demand dataset loading, unified query cancellation, dynamic HTTP request headers and subquery-driven request parameters, provider-aware LLM prompt caching, and a long list of Cayenne performance improvements.

Highlights in this release candidate include:

  • Spice Cayenne: CDC throughput, compaction and scan caching, synchronized partition commits, join filter propagation, parallel Vortex writes, lock-free deletion caches
  • Mutual TLS (mTLS): TLS cert hot-reload, public mTLS for HTTP and Flight (channel + identity modes), mTLS client certs for FlightSQL and Spice.ai connectors
  • MongoDB Change Streams: native real-time CDC for MongoDB, no Debezium or Kafka required
  • Kafka CDC offsets: offsets persisted in sidecar tables for durable, resumable Kafka CDC
  • PostgreSQL DML: INSERT, UPDATE, DELETE write-back on PostgreSQL datasets
  • Snowflake DML: INSERT, UPDATE, DELETE write-back on Snowflake datasets
  • Arrow Primary Key Upserts: native upsert path using primary key matching
  • DuckLake promoted to Beta: with INSERT support on catalog tables
  • User-Defined Functions: define SQL UDFs in spicepods, plus remote UDFs over HTTP (Spice.ai Enterprise)
  • Spatial SQL UDFs: optional geospatial UDFs (ST_*) for geometry workloads
  • On-Demand Dataset Loading: datasets can be deferred and loaded on first reference
  • Unified Query Cancellation: Ctrl-C and HTTP request cancellation propagate across all execution paths
  • Dynamic HTTP Connector: pass-through request headers, subquery-driven params, and JSON schema decomposition
  • HTTP Rate-Control persistence: rate-limit state persisted in object storage across restarts
  • Snapshot refresh mode (refresh_mode: snapshot): point-in-time snapshot acceleration with SQLite/Turso WAL flushing
  • Storage-profile accelerator tuning: accelerators auto-tune defaults based on local SSD, EBS-class disk, or tmpfs
  • Provider-Aware LLM Prompt Caching: automatic prompt caching for OpenAI-compatible providers that support it
  • Responses API: support across all model providers with streaming response.output_text.delta, plus Authorization: Bearer header support

What's New in v2.0.0-rc.5

Cayenne Improvements

Significant performance work across Spice Cayenne-backed catalogs and accelerators.

  • Ingest throughput: End-to-end improvements to CDC ingest, background compaction, and a new scan-result cache for hot reads; parallel Vortex partition writes; lock-free deletion caches with bloom-prefiltered probes; background retention with CDC pipelining; SQLite metastore pool scaled to 32 for high-concurrency mutation workloads.
  • Data inlining: Small writes are serialized as Arrow IPC and committed directly into the Cayenne metastore (cayenne_inlined_data), bypassing the staged Vortex write path for low-latency ingest. Inline upserts atomically rewrite existing inline rows instead of emitting side delete markers, and inline data remains query-visible via an in-memory union scan with a generation-keyed decode cache. Inline rows are checkpointed to Vortex when row, segment, or byte thresholds are reached. Defaults are refresh-mode aware: inline writes are enabled by default for high-frequency caching, changes, and fast append workloads and disabled for full, snapshot, and slower append.
  • Query planning: Join filter propagation across equi-join keys (gated behind runtime.params.cayenne_filter_propagation), range fallback for large join filters, hot-path clone elimination, and IN-list rewrites for large filter lists.
  • Correctness: Synchronized partition commits across partitions, correct NULL-sentinel handling for nullable partition expressions (e.g. bucket(N, col)), Vortex panic fix on highly compressible data, and live reads through expired protected snapshots.
  • Catalog and platform: Refresh-mode-aware compaction defaults, rejection of non-distributed Cayenne catalog configurations, and a vendored Vortex DataFusion integration for faster iteration on the Cayenne planner.
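
The data-inlining behavior described above can be pictured with a small model. This is a minimal sketch and not Cayenne's implementation: the class name, the row threshold, and the two in-memory dictionaries are hypothetical stand-ins for the inlined metastore rows and the checkpointed Vortex files.

```python
class InlineTable:
    """Toy model: small writes stay inline until a row threshold is hit."""

    def __init__(self, max_inline_rows=4):
        self.max_inline_rows = max_inline_rows  # hypothetical threshold
        self.inline = {}        # pk -> row, stands in for inlined metastore rows
        self.checkpointed = {}  # pk -> row, stands in for columnar files

    def upsert(self, pk, row):
        # An inline upsert rewrites the existing inline row in place.
        self.inline[pk] = row
        if len(self.inline) >= self.max_inline_rows:
            self.checkpoint()

    def checkpoint(self):
        # Move inline rows to the checkpointed store; newer values win.
        self.checkpointed.update(self.inline)
        self.inline.clear()

    def scan(self):
        # Union scan: inline rows shadow checkpointed rows with the same key.
        merged = {**self.checkpointed, **self.inline}
        return [merged[k] for k in sorted(merged)]
```

The point of the sketch is that readers always see the union of both stores, so a row is query-visible whether or not it has been checkpointed yet.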

Mutual TLS (mTLS)

Spice.ai Enterprise feature. See Enterprise Security.

Spice now supports full mutual TLS for both HTTP and Arrow Flight endpoints.

TLS cert hot-reload (#10727): The Spice runtime watches for SIGHUP and reloads TLS certificates without restarting, enabling cert rotation with zero downtime.

Public mTLS for HTTP and Flight (#10753): Two client_auth_mode values control how the server handles client certificates:

  • request (optional mTLS): the server requests a client cert but accepts connections without one (useful for migration windows).
  • required (strict mTLS): the server requires a valid client cert signed by the configured CA.
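
For readers more familiar with general-purpose TLS libraries, the two modes correspond roughly to the "optional" and "required" client verification settings found in most stacks. The sketch below uses Python's standard ssl module purely as an analogy; the mapping and the function name are assumptions for illustration and say nothing about how the Spice runtime is implemented.

```python
import ssl

# Assumed mapping from the two modes described above to Python's ssl module.
_VERIFY_MODES = {
    "request": ssl.CERT_OPTIONAL,   # ask for a client cert, allow none
    "required": ssl.CERT_REQUIRED,  # reject clients without a valid cert
}

def server_context(client_auth_mode: str) -> ssl.SSLContext:
    """Build a server-side TLS context for the given client auth mode."""
    if client_auth_mode not in _VERIFY_MODES:
        raise ValueError(f"unknown client_auth_mode: {client_auth_mode!r}")
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = _VERIFY_MODES[client_auth_mode]
    # A real server would also call ctx.load_cert_chain(...) and
    # ctx.load_verify_locations(cafile=...) with its own file paths.
    return ctx
```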

mTLS client certs for FlightSQL and Spice.ai connectors (#10764): Outbound connections from the FlightSQL and Spice.ai data connectors can now present client certificates for mutual authentication with upstream services.

Example configuration:

runtime:
  tls:
    enabled: true
    certificate_file: /etc/spice/tls/server.crt
    key_file: /etc/spice/tls/server.key
    client_auth_mode: required
    client_auth_ca_file: /etc/spice/tls/client-ca.crt

MongoDB Change Streams

MongoDB datasets configured with refresh_mode: changes now stream changes from MongoDB Change Streams into any local accelerator (#10813), providing real-time CDC without Debezium or Kafka.

Example configuration:

datasets:
  - from: mongodb:my_collection
    name: my_collection
    params:
      host: my-cluster.mongodb.net
      db: mydb
    acceleration:
      enabled: true
      engine: duckdb
      refresh_mode: changes

CDC Improvements

See Change Data Capture (CDC) for an overview of CDC in Spice.

  • Kafka CDC offset persistence (#10823): Kafka CDC offsets are persisted in sidecar tables for durable, resumable streams. On restart or failover, Spice resumes from the last committed offset.
  • Pipelined CDC ingestion (#10676): Source reads overlap with batch apply, with additional batching, envelope coalescing, and nullability propagation improvements across the apply pipeline.
  • Debezium schema evolution fix (#10144): Schema changes in Debezium-sourced datasets no longer break dataset initialization on reload (fixes #9782).
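
The general idea behind persisting offsets in a sidecar table is to commit the applied rows and the consumed offset in the same transaction, so a restart resumes exactly where the last commit left off. The following is a minimal sketch of that pattern using SQLite; the table names, column names, and functions are hypothetical and are not Spice's schema.

```python
import sqlite3

def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    # Sidecar table: one committed offset per (topic, partition).
    conn.execute(
        "CREATE TABLE cdc_offsets (topic TEXT, part INTEGER, next_offset INTEGER,"
        " PRIMARY KEY (topic, part))"
    )
    return conn

def apply_batch(conn, topic, part, records):
    """Apply (offset, id, payload) records and advance the offset atomically."""
    if not records:
        return
    with conn:  # one transaction: data and offset commit together
        for _, row_id, payload in records:
            conn.execute(
                "INSERT OR REPLACE INTO events (id, payload) VALUES (?, ?)",
                (row_id, payload),
            )
        conn.execute(
            "INSERT OR REPLACE INTO cdc_offsets VALUES (?, ?, ?)",
            (topic, part, records[-1][0] + 1),
        )

def resume_offset(conn, topic, part):
    """Return the offset to resume from, or 0 if nothing was committed."""
    row = conn.execute(
        "SELECT next_offset FROM cdc_offsets WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else 0
```

Because the offset only advances when the batch commits, a crash between reading and applying a batch replays that batch on restart and does not skip it.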

PostgreSQL DML Support

The PostgreSQL data connector now supports write-back via INSERT, UPDATE, and DELETE operations (#10446). Combined with the existing read-side federation, PostgreSQL-backed datasets can serve as full read/write tables. The PostgreSQL Catalog connector additionally exposes foreign-key metadata for NSQL and query planning (#10849).

Snowflake DML Support

The Snowflake data connector now supports write-back via INSERT, UPDATE, and DELETE operations (#10747), complementing its existing read capabilities.

Arrow Primary Key Upserts

Arrow-accelerated tables now support native upsert operations using primary key matching (#10749), providing efficient update-or-insert semantics for in-memory datasets.
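
As a reminder of what update-or-insert semantics mean in practice, here is a minimal sketch over plain Python dictionaries. It illustrates primary key matching only; the function name and the "id" key column are assumptions, and it does not reflect how the Arrow accelerator stores data.

```python
def upsert(rows, incoming, key="id"):
    """Return rows with incoming applied: update on key match, else insert."""
    by_key = {row[key]: row for row in rows}
    for row in incoming:
        by_key[row[key]] = row  # last write for a key wins
    return list(by_key.values())
```

Rows whose key already exists are replaced in place, and rows with new keys are appended.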

DuckLake Promoted to Beta

The DuckLake Catalog and Data Connector are promoted to Beta quality (#10743).

DuckLake catalog tables with read_write access now support INSERT operations (#10744), enabling full read/write workflows against DuckLake-backed catalogs. The DuckLake connector also gains a series of correctness fixes for downcast, module registration, schema discovery, and S3 credentials (#10650).

User-Defined Functions

Spice now supports user-defined functions (UDFs) as a first-class spicepod component (#10571), letting you define reusable SQL functions in the spicepod or invoke remote functions over HTTP. The runtime also gains table user functions with HTTP server gating (#10675).

A security fix closes a remote-UDF SSRF vector (#10757).

Spatial SQL UDFs

Spice now ships an optional set of geospatial SQL UDFs (ST_*) for geometry workloads (#10833). The functions are gated behind a build feature and can be invoked from any SQL surface.

On-Demand Dataset Loading

Datasets can now be marked for on-demand loading (#10629). Deferred datasets are registered with a declared schema at startup (#10669) and only fully resolve when first referenced, reducing startup time and memory footprint for spicepods with many seldom-used datasets.

Spicepods also gain columns[].type and columns[].nullable (#10661) with a lenient type parser for declaring schemas inline.
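The deferred-loading behaviour described above (schema known at startup, data resolved on first reference) can be sketched in a few lines of Python. This is a conceptual illustration under assumed names: `DeferredDataset` and `load_orders` are hypothetical and do not correspond to runtime APIs.

```python
from typing import Callable


class DeferredDataset:
    """Registers a declared schema up front and loads rows on first access."""

    def __init__(self, name: str, schema: dict[str, str], loader: Callable[[], list[dict]]):
        self.name = name
        self.schema = schema      # known at startup, no I/O required
        self._loader = loader     # hypothetical callable that fetches the data
        self._rows = None         # nothing is loaded yet

    @property
    def resolved(self) -> bool:
        return self._rows is not None

    def rows(self) -> list[dict]:
        if self._rows is None:    # first reference triggers the load
            self._rows = self._loader()
        return self._rows


load_calls = []


def load_orders() -> list[dict]:
    load_calls.append("orders")
    return [{"order_id": 1, "amount": 99.99}]


orders = DeferredDataset("orders", {"order_id": "int64", "amount": "float64"}, load_orders)
registered_without_loading = not orders.resolved  # schema is usable, data is not loaded
first = orders.rows()
second = orders.rows()                             # served from the already-resolved data
```

The loader runs exactly once, on the first call to `rows()`, which is where the startup-time and memory savings for seldom-used datasets come from.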

Unified Query Cancellation

All query execution paths (HTTP, Flight, FlightSQL, MCP, and internal) now honour a unified cancellation signal (#10390). When a client disconnects, presses Ctrl-C in the REPL, or cancels an in-flight HTTP request, the corresponding query is cancelled end-to-end, freeing resources promptly.
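End-to-end cancellation means the signal reaches the running work, which then releases its resources before the caller observes the cancellation. A minimal `asyncio` sketch of that flow follows; `run_query` is a hypothetical stand-in for a long-running query, not a Spice API.

```python
import asyncio


async def run_query(progress: list[str]) -> str:
    """Stand-in for a long-running query that cleans up when cancelled."""
    try:
        progress.append("started")
        await asyncio.sleep(60)          # simulated work
        return "done"
    except asyncio.CancelledError:
        progress.append("resources released")
        raise                            # propagate so callers see the cancellation


async def main() -> list[str]:
    progress: list[str] = []
    task = asyncio.create_task(run_query(progress))
    await asyncio.sleep(0)               # let the query start
    task.cancel()                        # e.g. the client disconnected
    try:
        await task
    except asyncio.CancelledError:
        progress.append("cancelled")
    return progress


events = asyncio.run(main())
```

The recorded events show the order that matters: the query starts, cleans up when the signal arrives, and only then does the caller see the cancellation.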

Dynamic HTTP Connector

The HTTP data connector gains dynamic request headers parameterised from query predicates (#10604), subquery-driven request parameters for fan-out queries (#10636), HTTP response metadata as queryable columns via JSON schema decomposition (#10679), no-limit pagination (#10673), and shared rate-control across HTTP-based connectors using the same backend host (#10648).
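Sharing rate control across connectors that target the same backend host can be pictured as one limiter per origin. The sketch below uses a token bucket, which is an assumption made for illustration (the release notes do not state which algorithm the runtime uses); `TokenBucket`, `bucket_for`, and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


class TokenBucket:
    """Token bucket with an injectable clock so behaviour is deterministic."""

    def __init__(self, rate_per_sec: float, capacity: int, now: float = 0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def try_acquire(self, now: float) -> bool:
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


_buckets: dict[str, TokenBucket] = {}


def bucket_for(url: str) -> TokenBucket:
    """Return the bucket shared by every connector that targets the same origin."""
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.netloc}"
    return _buckets.setdefault(origin, TokenBucket(rate_per_sec=1.0, capacity=2))


a = bucket_for("https://api.example.com/v1/users")
b = bucket_for("https://api.example.com/v1/orders")
c = bucket_for("https://other.example.com/v1/users")

# Two requests drain the shared bucket, the third is rejected,
# and one token is available again a second later.
results = [a.try_acquire(0.0), b.try_acquire(0.0), a.try_acquire(0.0), a.try_acquire(1.0)]
```

Keying the limiter by origin rather than by connector name is what lets two differently named connectors draw from the same budget.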

HTTP Rate-Control Persistence

The HTTP rate-control state (per-endpoint throttle counters) is now persisted in object storage (#10697), ensuring rate limits survive restarts and are consistent across replicas. Rate-control metrics now use an origin label rather than the connector name for cleaner aggregation (#10689).

The metrics HTTP endpoint (/metrics) is also independently rate-limited (#10162) to prevent scraping from impacting query serving.

refresh_mode: snapshot

Spice.ai Enterprise feature. See Acceleration Snapshots.

A new refresh_mode: snapshot provides point-in-time snapshot acceleration (#10651), with SQLite and Turso WAL flushing and a Cayenne metastore slice integration so accelerated readers see a consistent snapshot while writes continue.

Storage-Profile Accelerator Tuning

Acceleration configs gain a new storage_profile field (#10913) with values auto (default), local_ssd, ebs, and tmpfs. Under auto, the runtime detects whether the acceleration store is backed by local SSD, EBS-class network disk, or tmpfs, and applies storage-aware defaults across DuckDB, partitioned DuckDB, SQLite, Turso, and Cayenne file-mode accelerators. Explicit per-accelerator parameters always override the profile defaults.
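The precedence rule (profile defaults first, explicit parameters last) is an ordinary layered merge. The sketch below shows only that rule; the parameter names `io_threads` and `prefetch` and their values are invented for illustration and are not the defaults the runtime applies.

```python
# Hypothetical profile defaults; the real values are chosen by the runtime.
PROFILE_DEFAULTS = {
    "local_ssd": {"io_threads": 8, "prefetch": True},
    "ebs": {"io_threads": 4, "prefetch": True},
    "tmpfs": {"io_threads": 2, "prefetch": False},
}


def effective_params(profile: str, explicit: dict) -> dict:
    """Start from the profile defaults, then let explicit parameters win."""
    return {**PROFILE_DEFAULTS[profile], **explicit}


params = effective_params("ebs", {"io_threads": 16})
```

Here the explicit `io_threads` replaces the profile value, while `prefetch` is inherited from the profile.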

Provider-Aware LLM Prompt Caching

LLM calls automatically use provider-aware prompt caching (#10645) when the configured model provider supports it (e.g., Anthropic, OpenAI). System prompts and tool descriptions are marked for caching so repeated invocations within the cache window reuse the provider-side cached prefix, reducing latency and cost.

A new searchable registry mode for LLM tools (#10647) lets agents discover tools by semantic search rather than enumerating all tools in the system prompt, which scales to large tool inventories.

Responses API Improvements

The Responses API is now supported across all configured model providers (#10724). Streaming delta events via response.output_text.delta are also supported (#10828). The runtime now also accepts Authorization: Bearer headers in addition to x-api-key, bumps async-openai, and stops populating FunctionToolCall.id so OpenAI-compatible servers can assign the ID themselves (#10911).
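Accepting either header form amounts to a small credential lookup. The sketch below is one plausible reading, not the runtime's code: the `extract_api_key` helper is hypothetical, and preferring the `Authorization` header when both are present is an assumption.

```python
from typing import Mapping, Optional


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the credential from `Authorization: Bearer <key>` or `x-api-key`."""
    normalized = {name.lower(): value for name, value in headers.items()}
    authorization = normalized.get("authorization", "")
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    # Fall back to the header form that was already supported.
    return normalized.get("x-api-key")
```

For example, `extract_api_key({"Authorization": "Bearer abc123"})` and `extract_api_key({"x-api-key": "abc123"})` both yield `"abc123"`, so OpenAI-style clients and existing Spice clients can share one endpoint.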

Distributed Cluster Improvements

Spice.ai Enterprise feature. See High Availability.

  • Per-request executor readiness gate (#10860): /v1/ready on schedulers waits for a configurable quorum of executors before returning healthy, enabling proper rolling deployments.
  • Ballista S3 shuffle reads under cluster mode (#10910): The shuffle reader builds its S3 client from the executor pod's environment, matching the writer. Async queries with runtime.params.shuffle_location: s3://... now complete instead of failing with AccessDenied on shuffle fetches.
  • Flattened scheduler config (#10450): runtime.scheduler.partition_management.* fields are flattened directly onto runtime.scheduler and renamed under the canonical "partition assignment" terminology. See Breaking Changes.

Improvements across Caching and Search:

  • Per-principal cache namespacing (#10702): SQL, search, and caching-accelerator caches are now namespaced per authenticated principal, so cached results never cross identity boundaries.
  • DuckDB HNSW vector indexes (#10695, #10674, #10668): DuckDB-accelerated views support HNSW vector indexes for vector search, vector search SQL is rewritten to activate HNSW_INDEX_SCAN, and HNSW indexes are preserved across data refresh.
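Per-principal namespacing, as described in the first item above, comes down to making the caller's identity part of the cache key. A minimal sketch of that idea follows; the `cache_key` helper, the SHA-256 key derivation, and the principal names are assumptions for illustration.

```python
import hashlib


def cache_key(principal: str, sql: str) -> str:
    """Derive a cache key that is unique per (principal, query) pair."""
    digest = hashlib.sha256()
    digest.update(principal.encode("utf-8"))
    digest.update(b"\x00")  # separator so the two fields cannot run together
    digest.update(sql.encode("utf-8"))
    return digest.hexdigest()


cache: dict[str, list] = {}
query = "SELECT * FROM orders"
cache[cache_key("alice", query)] = [{"order_id": 1}]

alice_hit = cache.get(cache_key("alice", query))
bob_hit = cache.get(cache_key("bob", query))  # same SQL, different principal
```

The same SQL text issued by a different principal misses the cache, so a result computed under one identity is never served to another.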

Security Improvements

See Authentication and TLS for configuring Spice security.

  • API key timing-position leak and remote-UDF SSRF (#10757): Closed a timing-based position-disclosure leak in API key comparison and blocked SSRF via remote UDF endpoint parameters.
  • Configurable allowed_hosts for MCP (#10638): MCP servers can be restricted to an explicit allowlist of upstream hosts.
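A timing-position leak of the kind closed in #10757 typically arises when a comparison stops at the first mismatching byte, so response time reveals how much of a guess was correct. The usual remedy is a constant-time comparison. The Python sketch below shows the general technique using the standard library; it is not the fix applied in the runtime, and `api_key_matches` is a hypothetical helper.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare two keys without short-circuiting on the first mismatched byte."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

`hmac.compare_digest` is designed so that its running time does not depend on where the inputs differ, unlike a plain `==` on strings.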

SQL, Query, and Developer Experience

See the SQL Reference for the full SQL surface area.

  • SQL REPL expanded view (#10797): Toggle \x in the REPL for a vertical key-value layout on wide result sets.
  • FlightSQL Substrait plan support (#10761): The Spice runtime now implements CommandStatementSubstraitPlan, enabling clients that submit plans as Substrait-encoded protobuf.
  • MCP auth for streamable HTTP tools (#10927): Streamable HTTP MCP tools support native authentication via mcp_auth_token and mcp_headers, both with full Spice secret expansion.
  • Elasticsearch FTS engine config and index lifecycle (#10672): Direct FTS engine configuration plus index lifecycle and ingestion controls for the Elasticsearch connector.
  • Self-hosted Spice connector (#10546): Connect Spice to another self-hosted Spice runtime as a federated source.

Connector Bug Fixes

Notable correctness fixes across the Data Connectors: DynamoDB Streams retry on transient errors (#10794) and typed-NULL handling in DML (#10511); ScyllaDB physical filter pushdown disabled to fix incorrect results (#10772); MSSQL TOP N pushdown for non-nullable sort columns (#10621); DuckLake include filter applied (#10738); DuckDB DELETE/UPDATE on full and caching refresh modes (#10632); checked arithmetic for Turso integer-millis and timestamp-to-nanosecond conversions (#10786, #10666); and Flight GetFlightInfo/DoGet schema parity (#10864). See the Changelog for the full list.

Dependency Updates

| Dependency / Component | Version |
| --- | --- |
| DuckDB | v1.5.2 |
| Iceberg | v0.9.1 |
| Turso | v0.6.0 |
| Vortex | v0.69.0 |

Contributors

Breaking Changes

Flattened runtime.scheduler configuration (#10450): The nested runtime.scheduler.partition_management block has been flattened and renamed to use the canonical "partition assignment" terminology. Migrate as follows:

# Before
runtime:
  scheduler:
    partition_management:
      interval: 30s
      max_assignments_per_cycle: 16
      discovery_timeout: 10s

# After
runtime:
  scheduler:
    partition_assignment_interval: 30s
    max_assignments_per_interval: 16
    partition_discovery_timeout: 10s
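For spicepods that are generated or managed programmatically, the rename can be applied mechanically. The following is a minimal sketch that operates on an already-parsed configuration (a plain Python dict). The key mapping is taken from the before/after example above; the `migrate_scheduler_config` helper is hypothetical, and passing unrecognised legacy keys through unchanged is an assumption.

```python
import copy

# Old nested keys mapped to the new flattened keys.
RENAMES = {
    "interval": "partition_assignment_interval",
    "max_assignments_per_cycle": "max_assignments_per_interval",
    "discovery_timeout": "partition_discovery_timeout",
}


def migrate_scheduler_config(config: dict) -> dict:
    """Return a copy of `config` with partition_management flattened and renamed."""
    migrated = copy.deepcopy(config)
    scheduler = migrated.get("runtime", {}).get("scheduler", {})
    legacy = scheduler.pop("partition_management", None)
    if legacy:
        for old_key, value in legacy.items():
            # Keys without a known rename are kept under their old name (assumption).
            scheduler[RENAMES.get(old_key, old_key)] = value
    return migrated


before = {
    "runtime": {
        "scheduler": {
            "partition_management": {
                "interval": "30s",
                "max_assignments_per_cycle": 16,
                "discovery_timeout": "10s",
            }
        }
    }
}
after = migrate_scheduler_config(before)
```

The input is left untouched, so the original configuration can be kept alongside the migrated one for review.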

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v2.0.0-rc.5, use one of the following methods:

CLI:

spice upgrade v2.0.0-rc.5

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:2.0.0-rc.5 image:

docker pull spiceai/spiceai:2.0.0-rc.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 2.0.0-rc.5

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • Enable DML support for PostgreSQL data connector by @phillipleblanc in #10446
  • feat(postgres): support inline PEM sslrootcert by @claudespice in #10578
  • Add foreign key metadata discovery to PostgreSQL Catalog by @sgrebnov in #10849
  • Add Snowflake DML support by @lukekim in #10747
  • Add MongoDB Change Streams support by @lukekim in #10813
  • Add user-defined functions by @lukekim in #10571
  • Add table user functions and gate HTTP servers by @lukekim in #10675
  • feat: add on-demand dataset loading by @phillipleblanc in #10629
  • feat(runtime): declared-schema deferred datasets by @phillipleblanc in #10669
  • feat(spicepod, runtime): add columns[].type / nullable + lenient type parser by @phillipleblanc in #10661
  • Replace external smb crate with internal SMB 3.1.1 client by @phillipleblanc in #10516
  • Add unified query cancellation across all paths by @lukekim in #10390
  • Add dynamic HTTP request headers by @lukekim in #10604
  • feat(http): Support dynamic HTTP connector request params from subqueries by @lukekim in #10636
  • feat(http): pass through HTTP metadata columns with JSON schema decomposition by @lukekim in #10679
  • Add nolimit HTTP pagination max pages by @lukekim in #10673
  • Add shared HTTP rate control for connectors by @lukekim in #10648
  • Use origin label instead of name for HTTP rate control metrics by @lukekim in #10689
  • fix(http): reject OR across different HTTP filter columns by @lukekim in #10625
  • Add provider-aware LLM prompt caching by @lukekim in #10645
  • Add searchable registry mode for LLM tools by @lukekim in #10647
  • feat: refresh_mode: snapshot + SQLite/Turso WAL flush + Cayenne metastore slice by @phillipleblanc in #10651
  • feat: per-principal cache namespacing for SQL/search/caching-accelerator by @lukekim in #10702
  • Add self-hosted Spice connector support by @phillipleblanc in #10546
  • Add Delta Lake Azure tenant parameter by @phillipleblanc in #10671
  • Support OAuth2 client credentials in 'spice cloud login' by @ewgenius in #10586
  • Add configurable allowed_hosts for MCP by @lukekim in #10638
  • fix: make Helm chart probes configurable by @peasee in #10696
  • Strip high-cardinality datasets dim from anonymous telemetry by @lukekim in #10711
  • feat(elasticsearch): direct FTS engine config + index lifecycle and ingestion controls by @lukekim in #10672
  • Add DuckDB HNSW vector index support for accelerated views by @sgrebnov in #10695
  • Rewrite DuckDB vector search SQL to activate HNSW_INDEX_SCAN by @sgrebnov in #10674
  • Fix DuckDB HNSW vector indexes lost after data refresh by @sgrebnov in #10668
  • Fix DuckDB DELETE/UPDATE on full and caching refresh mode datasets by @phillipleblanc in #10632
  • Fix DuckLake connector: downcast, module registration, schema discovery, and S3 credentials by @sgrebnov in #10650
  • Fix federation pushing denied functions inside subqueries to remote engines by @phillipleblanc in #10692
  • fix(caching): honour refresh_on_startup: always in caching mode by @phillipleblanc in #10594
  • fix(iceberg): rebuild storage factory when Hadoop catalog scheme is inferred by @sgrebnov in #10601
  • Pipeline CDC ingestion: overlap source reads with batch apply by @lukekim in #10676
  • fix: add NULL check to CDC primary key extraction by @lukekim in #10684
  • Properly handle nullability during CDC processing by @krinart in #10803
  • Flatten scheduler config and rename partition management → partition assignment by @lukekim in #10450
  • Improve NSQL UX and harden internal LLM tools by @lukekim in #10715
  • Support Responses API across model providers by @lukekim in #10724
  • Update xAI default model and handle Grok model retirements by @Jeadie in #10723
  • Improve cli table layout by @krinart in #10725
  • TLS cert hot-reload (mTLS plan M1) by @phillipleblanc in #10727
  • Fix DuckLake catalog include filter being ignored by @phillipleblanc in #10738
  • Promote DuckLake Catalog and Data Connector to Beta quality by @sgrebnov in #10743
  • feat(ducklake): Support INSERT on catalog tables with read_write access by @sgrebnov in #10744
  • perf(cdc): coalesce envelopes and overlap commits in apply pipeline by @lukekim in #10745
  • feat: Allow full version tags in spicepod version by @peasee in #10748
  • Add Arrow primary key upserts by @lukekim in #10749
  • fix(snapshot): keep refresh_mode snapshot read-only by @phillipleblanc in #10752
  • feat(tls): public mTLS for HTTP and Flight (channel + identity modes) by @phillipleblanc in #10753
  • perf(cayenne): lock-free deletion caches with bloom-prefiltered probe by @lukekim in #10756
  • fix(security): close API key timing-position leak and remote-UDF SSRF by @lukekim in #10757
  • Fix 'wait_until_dependent_tables_are_ready' for catalogs by @phillipleblanc in #10758
  • Fixes for views and resolved tables on 'spice refresh' CLI by @phillipleblanc in #10759
  • Implement FlightSQL CommandStatementSubstraitPlan support by @lukekim in #10761
  • feat(connectors): mTLS client cert support for flightsql and spiceai connectors by @phillipleblanc in #10764
  • Allow arbitrary filenames when specifying spicepod path + kind validation by @krinart in #10777
  • fix: ignore field metadata in schema compatibility check in index_table_scan by @Jeadie in #10778
  • Display pushed-down limits in EXPLAIN TREE output by @lukekim in #10779
  • fix: enable streaming append for Kafka with Cayenne accelerator by @lukekim in #10780
  • fix: bound chunked-index intermediate batch size to prevent OOM by @phillipleblanc in #10783
  • fix: label all columns in spice cloud metrics table output by @claudespice in #10784
  • fix: use checked arithmetic for Turso integer-millis timestamp read path by @claudespice in #10786
  • fix: use checked arithmetic in timestamp-to-nanosecond conversions by @claudespice in #10666
  • Upgrade to DuckDB v1.5.2 by @sgrebnov in #10788
  • Improve CDC ingestion performance by @lukekim in #10789
  • Fix tool_search/tool_invoke spans by @lukekim in #10791
  • Add Cayenne inline mutations and benchmark coverage by @lukekim in #10792
  • Ensure we always resolve table names in distributed mode/metadata by @Jeadie in #10793
  • Remove permanent errors from DynamoDB Streams by @krinart in #10794
  • Add expanded view mode for wide table display in SQL REPL by @lukekim in #10797
  • Fix Cayenne CDC schema mismatch error by @sgrebnov in #10800
  • Executors should create catalog tables on join by @Jeadie in #10807
  • Add compressed file support for listing connectors by @lukekim in #10809
  • Improve Cayenne mutation, scan, and inline memtable scaling by @lukekim in #10811
  • Add range fallback for large join filters by @lukekim in #10816
  • Improve Cayenne join filter pushdown by @lukekim in #10818
  • Synchronize Cayenne partition commits across partitions by @phillipleblanc in #10819
  • fix: Deny nondistributed cayenne catalog by @peasee in #10821
  • Enable parallel Cayenne Vortex writes by @lukekim in #10822
  • Expand Arrow type handling in formatting and Elasticsearch by @lukekim in #10825
  • Add response.output_text.delta to responses API by @krinart in #10828
  • feat(cayenne): add join filter propagation and no-spill Q21 planning by @lukekim in #10840
  • Upgrade Turso to v0.6.0 by @sgrebnov in #10843
  • feat(cli): add spice feedback command to open community Slack by @lukekim in #10856
  • Upgrade iceberg to v0.9.1 by @sgrebnov in #10859
  • feat(cluster): per-request executor readiness gate on /v1/ready by @phillipleblanc in #10860
  • fix: Require dim-side statistics for CayennePropagateFilterAcrossEquiJoinKeys by @sgrebnov in #10863
  • fix: Debezium schema evolution breaks dataset init on reload by @claudespice in #10144
  • fix(mssql): Push topK limit to SQL Server for non-nullable sort columns by @Jeadie in #10621
  • fix(ScyllaDB): disable physical filter pushdown by @sgrebnov in #10772
  • fix: handle typed NULLs and prevent overflow in DynamoDB DML type conversions by @krinart in #10511
  • fix: use InsertOp::Overwrite in DynamoDB bootstrap scan_and_overwrite_accelerator by @krinart in #10639
  • Improve DynamoDB Bootstrap performance by @krinart in #10616
  • fix: preserve field and schema metadata in Vortex type transformation by @lukekim in #10628
  • fix: GH connector - explicitly use AWS LC RS crypto provider for jwt by @phillipleblanc in #10619
  • fix: add snapshot mode guards to delete_from/update and delegate DML in SwappableTableProvider by @phillipleblanc in #10685
  • Persist HTTP rate-control state in object storage by @lukekim in #10697
  • Rate limit metrics HTTP endpoint by @lukekim in #10162
  • feat(geo): add optional spatial SQL UDF support by @lukekim in #10833
  • feat(cayenne): CDC throughput, compaction, scan caching, and benchmarks by @lukekim in #10852
  • fix(cayenne): fix Vortex panic on highly compressible data by @sgrebnov in #10855
  • fix(cayenne): Read live protected snapshots after cleanup grace period by @sgrebnov in #10901
  • fix: Disable Cayenne HashJoin rewriter optimizer by @sgrebnov in #10882
  • Fix GetFlightInfo vs DoGet Flight Schema by @krinart in #10864
  • fix(search): preserve column casing in /v1/search primary key plumbing by @claudespice in #10909
  • fix(object-store): dedupe s3 url style auto-detection log by @phillipleblanc in #10898
  • Improve Spice CLI manifest editing and direct command modes by @lukekim in #10815
  • Persist Kafka CDC offsets in sidecar tables by @lukekim in #10823
  • feat(task-history): record Ballista stages for distributed queries by @phillipleblanc in #10831
  • Add '#[deny(clippy::missing_trait_methods)]' to wrapper/delegation trait impls by @Jeadie in #10795
  • Optimize Cayenne catalog maintenance paths by @lukekim in #10904
  • Centralize DuckDB settings for accelerator by @ewgenius in #10895
  • deps(ballista): bump to 47e2b494 to fix S3 shuffle reads under cluster mode by @phillipleblanc in #10910
  • Authorization header + Bump async-openai + responses_adapter fix by @krinart in #10911
  • Tune accelerators by storage profile by @lukekim in #10913
  • feat: add dataset-level on_schema_change config by @lukekim in #10908
  • Handle NULL sentinel for nullable partition expressions by @Jeadie in #10880
  • fix: Remove Cayenne Catalog from catalog registration by @peasee in #10914
  • Add catalog name to foreign key metadata in postgres catalog by @Jeadie in #10917
  • Cayenne perf: eliminate redundant clones, PK point-lookup fanout fix, IN-list rewrite + microbench coverage by @lukekim in #10916
  • fix(turso-shared): retry on Turso BEGIN CONCURRENT "Write-write conflict" by @lukekim in #10946
  • Vendor Vortex DataFusion for Cayenne by @lukekim in #10933
  • perf(cayenne): background retention + enable CDC pipelining for retention-configured tables by @lukekim in #10936
  • feat(cayenne): scale metastore pool to 32 + vs_duckdb_scaling benches (1→128 concurrency, sqlite + turso lanes) by @lukekim in #10943
  • feat(mcp): support auth for streamable HTTP tools by @phillipleblanc in #10927
  • Explicit error if v1/search requests a table without search index by @Jeadie in #10968
  • Fix spicepod loading failure when directory name contains dots by @sgrebnov in #10958
  • Extend append tests with arrow engine configurations by @sgrebnov in #10959
  • Remove dataset on_schema_change Policy from rc.5 release notes by @sgrebnov in #10964
  • Skip tpcds_q78 for Cayenne engine at SF100 by @sgrebnov in #10966
  • fix: Update benchmark snapshots May-20 by @app/github-actions in #10952
  • Fix #10951: UdtfExec invariant Vec lengths must match children count by @phillipleblanc in #10953
  • docs(release): update v2.0.0-rc.5 notes with latest trunk PRs by @lukekim in #10949
  • Remove eval related things for v2.0.0 by @Jeadie in #10945
  • build(deps): bump ubuntu from 24.04 to 26.04 in the docker-dependencies group by @app/dependabot in #10883
  • fix: Add publish = false to chbench-driver by @sgrebnov in #10939

Full Changelog: https://github.com/spiceai/spiceai/compare/v2.0.0-rc.4...v2.0.0-rc.5

Spice v1.11.0 (Jan 28, 2026)

· 58 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-stable! ⚡

In Spice v1.11.0, Spice Cayenne reaches Beta status with acceleration snapshots, key-based deletion vectors, and Amazon S3 Express One Zone support. DataFusion has been upgraded to v51, along with Arrow v57.2 and iceberg-rust v0.8.0. v1.11 adds several DynamoDB and DynamoDB Streams improvements, such as JSON nesting, and significantly improves Distributed Query with active-active schedulers and mTLS for enterprise-grade high availability and secure cluster communication.

This release also adds new SMB, NFS, and ScyllaDB Data Connectors (Alpha), Prepared Statements with full SDK support (gospice, spice-rs, spice-dotnet, spice-java, spice.js, and spicepy), Google LLM Support for expanded AI inference capabilities, and significant improvements to caching, observability, and Hash Indexing for Arrow Acceleration.

What's New in v1.11.0

Spice Cayenne Accelerator Reaches Beta

Spice Cayenne has been promoted to Beta status with acceleration snapshots support and numerous performance and stability improvements.

Key Enhancements:

  • Key-based Deletion Vectors: Improved deletion vector support using key-based lookups for more efficient data management and faster delete operations. Key-based deletion vectors are more memory-efficient than positional vectors for sparse deletions.
  • S3 Express One Zone Support: Store Cayenne data files in S3 Express One Zone for single-digit millisecond latency, ideal for latency-sensitive query workloads that require persistence.

Improved Reliability:

  • Resolved FuturesUnordered reentrant drop crashes
  • Fixed memory growth issues related to Vortex metrics allocation
  • Metadata catalog now properly respects cayenne_file_path location
  • Added warnings for unparseable configuration values

For more details, refer to the Cayenne Documentation.

DataFusion v51 Upgrade

Apache DataFusion has been upgraded to v51, bringing significant performance improvements, new SQL features, and enhanced observability.

Figure: DataFusion v51 ClickBench performance.

Performance Improvements:

  • Faster CASE Expression Evaluation: Expressions now short-circuit earlier, reuse partial results, and avoid unnecessary scattering, speeding up common ETL patterns
  • Better Defaults for Remote Parquet Reads: DataFusion now fetches the last 512KB of Parquet files by default, typically avoiding 2 I/O requests per file
  • Faster Parquet Metadata Parsing: Leverages Arrow 57's new thrift metadata parser for up to 4x faster metadata parsing

New SQL Features:

  • SQL Pipe Operators: Support for |> syntax for inline transforms
  • DESCRIBE <query>: Returns the schema of any query without executing it
  • Named Arguments in SQL Functions: PostgreSQL-style param => value syntax for scalar, aggregate, and window functions
  • Decimal32/Decimal64 Support: New Arrow types supported including aggregations like SUM, AVG, and MIN/MAX

Example pipe operator:

SELECT * FROM t
|> WHERE a > 10
|> ORDER BY b
|> LIMIT 5;

Improved Observability:

  • Improved EXPLAIN ANALYZE Metrics: New metrics including output_bytes, selectivity for filters, reduction_factor for aggregates, and detailed timing breakdowns

Arrow 57.2 Upgrade

Apache Arrow has been upgraded to v57.2, bringing major performance improvements and new capabilities.

Figure: Arrow 57 Parquet metadata parsing performance.

Key Features:

  • 4x Faster Parquet Metadata Parsing: A rewritten thrift metadata parser delivers up to 4x faster metadata parsing, especially beneficial for low-latency use cases and files with large amounts of metadata
  • Parquet Variant Support: Experimental support for reading and writing the new Parquet Variant type for semi-structured data, including shredded variant values
  • Parquet Geometry Support: Read and write support for Parquet Geometry types (GEOMETRY and GEOGRAPHY) with GeospatialStatistics
  • New arrow-avro Crate: Efficient conversion between Apache Avro and Arrow RecordBatches with projection pushdown and vectorized execution support

DynamoDB Connector Enhancements

  • Added JSON nesting for DynamoDB Streams
  • Improved batch deletion handling

Distributed Query Improvements

High Availability Clusters: Spice now supports running multiple active schedulers in an active/active configuration for production deployments. This eliminates the scheduler as a single point of failure and enables graceful handling of node failures.

  • Multiple schedulers run simultaneously, each capable of accepting queries
  • Schedulers coordinate via a shared S3-compatible object store
  • Executors discover all schedulers automatically
  • A load balancer distributes client queries across schedulers

Example HA configuration:

runtime:
  scheduler:
    state_location: s3://my-bucket/spice-cluster
    params:
      region: us-east-1

mTLS Verification: Cluster communication between scheduler and executors now supports mutual TLS verification for enhanced security.

Credential Propagation: S3, ABFS, and GCS credentials are now automatically propagated to executors in cluster mode, enabling access to cloud storage across the distributed query cluster.

Improved Resilience:

  • Exponential backoff for scheduler disconnection recovery
  • Increased gRPC message size limit from 16MB to 100MB for large query plans
  • HTTP health endpoint for cluster executors
  • Automatic executor role inference when --scheduler-address is provided
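Exponential backoff, as used for scheduler disconnection recovery in the list above, spaces out reconnect attempts so a recovering scheduler is not flooded. A minimal sketch of the delay schedule follows; the base, factor, and cap values are illustrative assumptions rather than the runtime's settings, and production implementations usually add random jitter, which is omitted here to keep the output deterministic.

```python
def backoff_delays(base: float, factor: float, cap: float, attempts: int) -> list[float]:
    """Delay before each reconnect attempt: base * factor**n, capped at `cap`."""
    return [min(cap, base * factor**attempt) for attempt in range(attempts)]


delays = backoff_delays(base=0.5, factor=2.0, cap=30.0, attempts=8)
```

With these inputs the delays double from 0.5 seconds up to 16 seconds and then stay at the 30-second cap.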

For more details, refer to the Distributed Query Documentation.

iceberg-rust v0.8.0 Upgrade

Spice has been upgraded to iceberg-rust v0.8.0, bringing improved Iceberg table support.

Key Features:

  • V3 Metadata Support: Full support for Iceberg V3 table metadata format
  • INSERT INTO Partitioned Tables: DataFusion integration now supports inserting data into partitioned Iceberg tables
  • Improved Delete File Handling: Better support for position and equality delete files, including shared delete file loading and caching
  • SQL Catalog Updates: Implement update_table and register_table for SQL catalog
  • S3 Tables Catalog: Implement update_table for S3 Tables catalog
  • Enhanced Arrow Integration: Convert Arrow schema to Iceberg schema with auto-assigned field IDs, _file column support, and Date32 type support

Acceleration Snapshots

Acceleration snapshots enable point-in-time recovery and data versioning for accelerated datasets. Snapshots capture the state of accelerated data at specific points, allowing for fast bootstrap recovery and rollback capabilities.

Key Features:

  • Flexible Triggers: Configure when snapshots are created based on time intervals or stream batch counts
  • Automatic Compaction: Reduce storage overhead by compacting older snapshots (DuckDB only)
  • Bootstrap Integration: Snapshots can reset cache expiry on load for seamless recovery (DuckDB with Caching refresh mode)
  • Smart Creation Policies: Only create snapshots when data has actually changed

Example configuration:

datasets:
  - from: s3://my-bucket/data.parquet
    name: my_dataset
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: enabled
      snapshots_trigger: time_interval
      snapshots_trigger_threshold: 1h
      snapshots_creation_policy: on_changed

Snapshots API and CLI: New API endpoints and CLI commands for managing snapshots programmatically.

CLI Commands:

# List all snapshots for a dataset
spice acceleration snapshots taxi_trips

# Get details of a specific snapshot
spice acceleration snapshot taxi_trips 3

# Set the current snapshot for rollback (requires runtime restart)
spice acceleration set-snapshot taxi_trips 2

HTTP API Endpoints:

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v1/datasets/{dataset}/acceleration/snapshots | List all snapshots for a dataset |
| GET | /v1/datasets/{dataset}/acceleration/snapshots/{id} | Get details of a specific snapshot |
| POST | /v1/datasets/{dataset}/acceleration/snapshots/current | Set the current snapshot for rollback |
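The two read-only endpoints can be called from any HTTP client. The Python sketch below only builds the requests, using the standard library. The base URL `http://localhost:8090` is an assumption about where the runtime's HTTP endpoint listens, and the helper names are hypothetical. The POST endpoint is left out because its request body is not described here.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8090"  # assumed address of the runtime's HTTP endpoint


def list_snapshots_request(dataset: str) -> Request:
    """Build a GET request that lists all snapshots for a dataset."""
    path = f"/v1/datasets/{quote(dataset, safe='')}/acceleration/snapshots"
    return Request(BASE_URL + path, method="GET")


def get_snapshot_request(dataset: str, snapshot_id: int) -> Request:
    """Build a GET request for the details of one snapshot."""
    path = f"/v1/datasets/{quote(dataset, safe='')}/acceleration/snapshots/{snapshot_id}"
    return Request(BASE_URL + path, method="GET")


listing = list_snapshots_request("taxi_trips")
detail = get_snapshot_request("taxi_trips", 3)
```

Passing either request to `urllib.request.urlopen` sends it to a running runtime; any authentication headers the deployment requires would need to be added first.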

For more details, refer to the Acceleration Snapshots Documentation.

Caching Acceleration Mode Improvements

The Caching Acceleration Mode introduced in v1.10.0 has received significant performance optimizations and reliability fixes in this release.

Performance Optimizations:

  • Non-blocking Cache Writes: Cache misses no longer block query responses. Data is written to the cache asynchronously after the query returns, reducing query latency for cache miss scenarios.
  • Batch Cache Writes: Multiple cache entries are now written in batches rather than individually, significantly improving write throughput for high-volume cache operations.

Reliability Fixes:

  • Correct SWR Refresh Behavior: The stale-while-revalidate (SWR) pattern now correctly refreshes only the specific entries that were accessed instead of refreshing all stale rows in the dataset. This prevents unnecessary source queries and reduces load on upstream data sources.
  • Deduplicated Refresh Requests: Fixed an issue where JSON array responses could trigger multiple redundant refresh operations. Refresh requests are now properly deduplicated.
  • Fixed Cache Hit Detection: Resolved an issue where queries that didn't include fetched_at in their projection would always result in cache misses, even when cached data was available.
  • Unfiltered Query Optimization: SELECT * queries without filters now return cached data directly without unnecessary filtering overhead.
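Two of the fixes above (refreshing only the accessed entries, and deduplicating refresh requests) can be illustrated with a small stale-while-revalidate cache. This is a conceptual sketch, not the runtime's implementation; `SwrCache`, its explicit `now` arguments, and the sample keys are hypothetical.

```python
class SwrCache:
    """Serve stale entries immediately and queue only the accessed key for refresh."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.entries: dict[str, tuple[object, float]] = {}  # key -> (value, fetched_at)
        self.refresh_queue: list[str] = []

    def put(self, key: str, value: object, now: float) -> None:
        self.entries[key] = (value, now)

    def get(self, key: str, now: float):
        value, fetched_at = self.entries[key]
        if now - fetched_at > self.ttl and key not in self.refresh_queue:
            self.refresh_queue.append(key)  # refresh this entry only, and only once
        return value                        # the stale value is still returned right away


cache = SwrCache(ttl=60)
cache.put("user:1", {"name": "Ada"}, now=0)
cache.put("user:2", {"name": "Lin"}, now=0)

value = cache.get("user:1", now=120)  # stale: returned, then queued for refresh
cache.get("user:1", now=121)          # already queued: no duplicate request
```

Although both entries are stale at `now=120`, only `user:1` is queued, because it is the only one that was read.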

For more details, refer to the Caching Acceleration Mode Documentation.

Prepared Statements

Improved Query Performance and Security: Spice now supports prepared statements, enabling parameterized queries that improve both performance through query plan caching and security by preventing SQL injection attacks.

Key Features:

  • Query Plan Caching: Prepared statements cache query plans, reducing planning overhead for repeated queries
  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Arrow Flight SQL Support: Full prepared statement support via Arrow Flight SQL protocol
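Why bound parameters prevent injection is easiest to see side by side. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, so it runs without a Spice runtime. The table, the rows, and the `?` placeholder syntax belong to this illustration and are not part of the Spice SDKs, which use `$1`-style placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("phone", "electronics", 499.0), ("cable", "electronics", 5.0), ("mug", "kitchen", 12.0)],
)

# A hostile value that would change the query if it were spliced into the SQL text.
category = "electronics' OR '1'='1"

# Bound as a parameter, the value is compared literally and matches no rows.
safe_rows = conn.execute(
    "SELECT name FROM products WHERE price > ? AND category = ?", (10.0, category)
).fetchall()

# Built by string formatting, the same value rewrites the WHERE clause.
unsafe_sql = f"SELECT name FROM products WHERE price > 10.0 AND category = '{category}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()
```

The parameterized query returns no rows, while the string-built query returns every row in the table, including the ones the price filter was meant to exclude.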

SDK Support:

| SDK | Support | Min Version | Method |
| --- | --- | --- | --- |
| gospice (Go) | ✅ Full | v8.0.0+ | SqlWithParams() with typed constructors (Int32Param, StringParam, TimestampParam, etc.) |
| spice-rs (Rust) | ✅ Full | v3.0.0+ | query_with_params() with RecordBatch parameters |
| spice-dotnet (.NET) | ✅ Full | v0.3.0+ | QueryWithParams() with typed parameter builders |
| spice-java (Java) | ✅ Full | v0.5.0+ | queryWithParams() with typed Param constructors (Param.int64(), Param.string(), etc.) |
| spice.js (JavaScript) | ✅ Full | v3.1.0+ | query() with parameterized query support |
| spicepy (Python) | ✅ Full | v3.1.0+ | query() with parameterized query support |

Example (Go):

import "github.com/spiceai/gospice/v8"

// Error handling is omitted for brevity; check the returned errors in real code.
client, _ := spice.NewClient()
defer client.Close()

// Parameterized query with typed parameters.
// ctx is a context.Context, for example context.Background().
results, _ := client.SqlWithParams(ctx,
    "SELECT * FROM products WHERE price > $1 AND category = $2",
    spice.Float64Param(10.0),
    spice.StringParam("electronics"),
)

Example (Java):

import ai.spice.SpiceClient;
import ai.spice.Param;
import org.apache.arrow.vector.ipc.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader inferred = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        10.0, "electronics");

    // With explicit typed parameters
    ArrowReader typed = client.queryWithParams(
        "SELECT * FROM products WHERE price > $1 AND category = $2",
        Param.float64(10.0),
        Param.string("electronics"));
}

For more details, refer to the Parameterized Queries Documentation.

Spice Java SDK v0.5.0

Parameterized Query Support for Java: The Spice Java SDK v0.5.0 introduces parameterized queries using ADBC (Arrow Database Connectivity), providing a safer and more efficient way to execute queries with dynamic parameters.

Key Features:

  • SQL Injection Prevention: Parameters are safely bound, preventing SQL injection vulnerabilities
  • Automatic Type Inference: Java types are automatically mapped to Arrow types (e.g., double → Float64, String → Utf8)
  • Explicit Type Control: Use the new Param class with typed factory methods (Param.int64(), Param.string(), Param.decimal128(), etc.) for precise control over Arrow types
  • Updated Dependencies: Apache Arrow Flight SQL upgraded to 18.3.0, plus new ADBC driver support
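
To illustrate why bound parameters prevent SQL injection, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in database, not the Spice SDK; the products table and the find_products helper are hypothetical. The SQL text stays fixed and the values travel separately, so a hostile value is compared as a plain string and is never parsed as SQL.

```python
import sqlite3


def find_products(conn, min_price, category):
    """Return product names matching the filter, using bound parameters."""
    # Placeholders keep the SQL text fixed; the values are sent separately
    # and are never interpreted as SQL.
    cur = conn.execute(
        "SELECT name FROM products WHERE price > ? AND category = ? ORDER BY name",
        (min_price, category),
    )
    return [row[0] for row in cur.fetchall()]


def demo():
    """Run one normal lookup and one lookup with a hostile category value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL, category TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [
            ("cable", 5.0, "electronics"),
            ("phone", 300.0, "electronics"),
            ("mug", 12.0, "kitchen"),
        ],
    )
    safe = find_products(conn, 10.0, "electronics")
    # The injection attempt is treated as a literal category name.
    hostile = find_products(conn, 10.0, "electronics' OR '1'='1")
    conn.close()
    return safe, hostile
```

Calling demo() runs both lookups against the in-memory table; the second lookup matches no rows because no product has that literal category.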

Example:

import ai.spice.SpiceClient;
import ai.spice.Param;
import java.math.BigDecimal;
import org.apache.arrow.vector.ipc.ArrowReader;

try (SpiceClient client = new SpiceClient()) {
    // With automatic type inference
    ArrowReader inferred = client.queryWithParams(
        "SELECT * FROM taxi_trips WHERE trip_distance > $1 LIMIT 10",
        5.0);

    // With explicit typed parameters for precise control
    ArrowReader typed = client.queryWithParams(
        "SELECT * FROM orders WHERE order_id = $1 AND amount >= $2",
        Param.int64(12345),
        Param.decimal128(new BigDecimal("99.99"), 10, 2));
}

Maven:

<dependency>
    <groupId>ai.spice</groupId>
    <artifactId>spiceai</artifactId>
    <version>0.5.0</version>
</dependency>

For more details, refer to the Spice Java SDK Repository.

Google LLM Support

Expanded AI Provider Support: Spice now supports Google embedding and chat models via the Google AI provider, expanding the available LLM options for AI inference workloads alongside existing providers like OpenAI, Anthropic, and AWS Bedrock.

Key Features:

  • Google Chat Models: Access Google's Gemini models for chat completions
  • Google Embeddings: Generate embeddings using Google's text embedding models
  • Unified API: Use the same OpenAI-compatible API endpoints for all LLM providers

Example spicepod.yaml configuration:

models:
  - from: google:gemini-2.0-flash
    name: gemini
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

embeddings:
  - from: google:text-embedding-004
    name: google_embeddings
    params:
      google_api_key: ${secrets:GOOGLE_API_KEY}

For more details, refer to the Google LLM Documentation (see docs PR #1286).

URL Tables

Query data sources directly via URL in SQL without prior dataset registration. Supports S3, Azure Blob Storage, and HTTP/HTTPS URLs with automatic format detection and partition inference.

Supported Patterns:

  • Single files: SELECT * FROM 's3://bucket/data.parquet'
  • Directories/prefixes: SELECT * FROM 's3://bucket/data/'
  • Glob patterns: SELECT * FROM 's3://bucket/year=*/month=*/data.parquet'

Key Features:

  • Automatic file format detection (Parquet, CSV, JSON, etc.)
  • Hive-style partition inference with filter pushdown
  • Schema inference from files
  • Works with both SQL and DataFrame APIs

Example with hive partitioning:

-- Partitions are automatically inferred from paths
SELECT * FROM 's3://bucket/data/' WHERE year = '2024' AND month = '01'
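
As a rough illustration of Hive-style partition inference and filter pushdown, the Python sketch below extracts key=value path segments and skips files whose partition values cannot match an equality filter. It is not the Spice implementation; infer_partitions, prune, and the example paths are hypothetical names chosen for this sketch.

```python
def infer_partitions(path):
    """Extract Hive-style key=value segments from an object path."""
    partitions = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        # Only segments shaped like "key=value" contribute a partition column.
        if sep and key and value:
            partitions[key] = value
    return partitions


def prune(paths, **filters):
    """Keep only paths whose partition values satisfy every equality filter."""
    kept = []
    for path in paths:
        parts = infer_partitions(path)
        if all(parts.get(col) == val for col, val in filters.items()):
            kept.append(path)
    return kept


PATHS = [
    "s3://bucket/data/year=2023/month=12/data.parquet",
    "s3://bucket/data/year=2024/month=01/data.parquet",
    "s3://bucket/data/year=2024/month=02/data.parquet",
]
```

With these paths, prune(PATHS, year="2024", month="01") keeps only the second file, which is the effect a pushed-down partition filter has on the set of files that must be read.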

Enable via spicepod.yml:

runtime:
  params:
    url_tables: enabled

Cluster Mode Async Query APIs (experimental)

New asynchronous query APIs for long-running queries in cluster mode:

  • /v1/queries endpoint: Submit queries and retrieve results asynchronously

OpenTelemetry Improvements

Unified Telemetry Endpoint: OTel metrics ingestion has been consolidated to the Flight port (50051), simplifying deployment by removing the separate OTel port (50052). The push-based metrics exporter continues to support integration with OpenTelemetry collectors.

Note: This is a breaking change. Update your configurations if you were using the dedicated OTel port 50052. Internal cluster communication now uses port 50052 exclusively.

Observability Improvements

Enhanced Dashboards: Updated Grafana and Datadog example dashboards with:

  • Snapshot monitoring widgets
  • Improved accelerated datasets section
  • Renamed ingestion lag charts for clarity

Additional Histogram Buckets: Added more buckets to histogram metrics for better latency distribution visibility.

For more details, refer to the Monitoring Documentation.

Hash Indexing for Arrow Acceleration (experimental)

Arrow-based accelerations now support hash indexing for faster point lookups on equality predicates. Hash indexes provide O(1) average-case lookup performance for columns with high cardinality.

Features:

  • Primary key hash index support
  • Secondary index support for non-primary key columns
  • Composite key support with proper null value handling

Example configuration:

datasets:
  - from: postgres:users
    name: users
    acceleration:
      enabled: true
      engine: arrow
      primary_key: user_id
      indexes:
        '(tenant_id, user_id)': unique # Composite hash index

For more details, refer to the Hash Index Documentation.
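
The idea behind a composite hash index can be sketched in a few lines of Python. This is a conceptual illustration only, not how the Arrow accelerator is implemented: the HashIndex class is hypothetical, rows are plain dictionaries, and the null policy (keys containing a null are simply not indexed) is an assumption made for this sketch.

```python
from collections import defaultdict


class HashIndex:
    """Maps a tuple of key-column values to the row positions that hold it."""

    def __init__(self, rows, key_columns, unique=False):
        self.key_columns = tuple(key_columns)
        self.unique = unique
        self._buckets = defaultdict(list)
        for position, row in enumerate(rows):
            key = tuple(row.get(col) for col in self.key_columns)
            # Assumption for this sketch: keys containing a null are not
            # indexed, mirroring SQL semantics where NULL never equals NULL.
            if any(part is None for part in key):
                continue
            if unique and key in self._buckets:
                raise ValueError(f"duplicate key {key!r}")
            self._buckets[key].append(position)

    def lookup(self, *values):
        """Return matching row positions; average O(1) via dict hashing."""
        return list(self._buckets.get(tuple(values), []))
```

An equality predicate on both key columns then becomes a single dictionary lookup, for example HashIndex(rows, ["tenant_id", "user_id"], unique=True).lookup(1, 11), instead of a scan over every row.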

SMB and NFS Data Connectors

Network-Attached Storage Connectors: New data connectors for SMB (Server Message Block) and NFS (Network File System) protocols enable direct federated queries against network-attached storage without requiring data movement to cloud object stores.

Key Features:

  • SMB Protocol Support: Connect to Windows file shares and Samba servers with authentication support
  • NFS Protocol Support: Connect to Unix/Linux NFS exports for direct data access
  • Federated Queries: Query Parquet, CSV, JSON, and other file formats directly from network storage with full SQL support
  • Acceleration Support: Accelerate data from SMB/NFS sources using DuckDB, Spice Cayenne, or other accelerators

Example spicepod.yaml configuration:

datasets:
  # SMB share
  - from: smb://fileserver/share/data.parquet
    name: smb_data
    params:
      smb_username: ${secrets:SMB_USER}
      smb_password: ${secrets:SMB_PASS}

  # NFS export
  - from: nfs://nfsserver/export/data.parquet
    name: nfs_data

For more details, refer to the Data Connectors Documentation.

ScyllaDB Data Connector

A new data connector for ScyllaDB, the high-performance NoSQL database compatible with Apache Cassandra. Query ScyllaDB tables directly or accelerate them for faster analytics.

Example configuration:

datasets:
  - from: scylladb:my_keyspace.my_table
    name: scylla_data
    acceleration:
      enabled: true
      engine: duckdb

For more details, refer to the ScyllaDB Data Connector Documentation.

Flight SQL TLS Connection Fixes

TLS Connection Support: Fixed TLS connection issues when using grpc+tls:// scheme with Flight SQL endpoints. Added support for custom CA certificate files via the new flightsql_tls_ca_certificate_file parameter.

Developer Experience Improvements

  • Turso v0.3.2 Upgrade: Upgraded Turso accelerator for improved performance and reliability
  • Rust 1.91 Upgrade: Updated to Rust 1.91 for latest language features and performance improvements
  • Spice Cloud CLI: Added spice cloud CLI commands for cloud deployment management
  • Improved Spicepod Schema: Improved JSON schema generation for better IDE support and validation
  • Acceleration Snapshots: Added configurable snapshots_create_interval for periodic acceleration snapshots independent of refresh cycles
  • Tiered Caching with Localpod: The Localpod connector now supports caching refresh mode, enabling multi-layer acceleration where a persistent cache feeds a fast in-memory cache
  • GitHub Data Connector: Added workflows and workflow runs support for GitHub repositories
  • NDJSON/LDJSON Support: Added support for Newline Delimited JSON and Line Delimited JSON file formats

Additional Improvements & Bug Fixes

  • Model Listing: New functionality to list available models across multiple AI providers
  • DuckDB Partitioned Tables: Primary key constraints now supported in partitioned DuckDB table mode
  • Post-refresh Sorting: New on_refresh_sort_columns parameter for DuckDB enables data ordering after writes
  • Improved Install Scripts: Removed jq dependency and improved cross-platform compatibility
  • Better Error Messages: Improved error messaging for bucket UDF arguments and deprecated OpenAI parameters
  • Reliability: Fixed DynamoDB IAM role authentication with new dynamodb_auth: iam_role parameter
  • Reliability: Fixed cluster executors to use scheduler's temp_directory parameter for shuffle files
  • Reliability: Initialize secrets before object stores in cluster executor mode
  • Reliability: Added page-level retry with backoff for transient GitHub GraphQL errors
  • Performance: Improved statistics for rewritten DistributeFileScanOptimizer plans
  • Developer Experience: Added max_message_size configuration for Flight service

Contributors

Breaking Changes

OTel Ingestion Port Change

OTel ingestion has been moved to the Flight port (50051), removing the separate OTel port 50052. Port 50052 is now used exclusively for internal cluster communication. Update your configurations if you were using the dedicated OTel port.

Distributed Query Cluster Mode Requires mTLS

Distributed query cluster mode now requires mTLS for secure communication between cluster nodes. This is a security enhancement to prevent unauthorized nodes from joining the cluster and accessing secrets.

Migration Steps:

  1. Generate certificates using spice cluster tls init and spice cluster tls add
  2. Update scheduler and executor startup commands with --node-mtls-* arguments
  3. For development/testing, use --allow-insecure-connections to opt out of mTLS

Renamed CLI Arguments:

Old Name | New Name
--cluster-mode | --role
--cluster-ca-certificate-file | --node-mtls-ca-certificate-file
--cluster-certificate-file | --node-mtls-certificate-file
--cluster-key-file | --node-mtls-key-file
--cluster-address | --node-bind-address
--cluster-advertise-address | --node-advertise-address
--cluster-scheduler-url | --scheduler-address

Removed CLI Arguments:

  • --cluster-api-key: Replaced by mTLS authentication
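
For startup scripts that still use the old names, the renames above can be applied mechanically. The Python sketch below is a hypothetical helper, not part of the Spice CLI: it rewrites flag names from the table and reports the removed flag rather than guessing a replacement. It assumes flags are written either as "--flag value" or "--flag=value", and that the values each flag accepts are unchanged, which these notes do not state.

```python
# Old-to-new flag names, copied from the rename table in these notes.
RENAMED = {
    "--cluster-mode": "--role",
    "--cluster-ca-certificate-file": "--node-mtls-ca-certificate-file",
    "--cluster-certificate-file": "--node-mtls-certificate-file",
    "--cluster-key-file": "--node-mtls-key-file",
    "--cluster-address": "--node-bind-address",
    "--cluster-advertise-address": "--node-advertise-address",
    "--cluster-scheduler-url": "--scheduler-address",
}
REMOVED = {"--cluster-api-key"}


def migrate_args(args):
    """Rewrite old flag names; report removed flags instead of guessing."""
    migrated, warnings = [], []
    skip_value = False
    for arg in args:
        if skip_value:
            # This token was the value of a removed flag; drop it.
            skip_value = False
            continue
        name, sep, value = arg.partition("=")
        if name in REMOVED:
            warnings.append(f"{name} was removed; configure mTLS instead")
            # Assumption: a removed flag written as "--flag value" is followed
            # by exactly one value token.
            skip_value = not sep
            continue
        migrated.append(RENAMED.get(name, name) + sep + value)
    return migrated, warnings
```

The helper returns the rewritten argument list together with a list of warnings, so a script can fail loudly when it still passes the removed API key flag.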

Cookbook Updates

New ScyllaDB Data Connector Recipe: New recipe demonstrating how to use the ScyllaDB Data Connector. See ScyllaDB Data Connector Recipe for details.

New SMB Data Connector Recipe: New recipe demonstrating how to use the SMB Data Connector. See SMB Data Connector Recipe for details.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.11.0 image:

docker pull spiceai/spiceai:1.11.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0

AWS Marketplace:

Spice is available in the AWS Marketplace.

Dependencies

What's Changed

Changelog

Spice v1.11.0-rc.3 (Jan 23, 2026)

· 2 min read
Viktor Yershov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.11.0-rc.3! ⭐

v1.11.0-rc.3 is a patch release that includes improvements to Hash Indexing for Arrow Acceleration and fixes for TLS connections with Flight SQL endpoints.

What's New in v1.11.0-rc.3

Hash Indexing for Arrow Acceleration (experimental)

Arrow-based accelerations now support hash indexing for faster point lookups on equality predicates. Hash indexes provide O(1) average-case lookup performance for columns with high cardinality.

Features:

  • Primary key hash index support
  • Secondary index support for non-primary key columns
  • Composite key support with proper null value handling

Example configuration:

datasets:
  - from: postgres:users
    name: users
    acceleration:
      enabled: true
      engine: arrow
      primary_key: user_id
      indexes:
        '(tenant_id, user_id)': unique # Composite hash index

For more details, refer to the Hash Index Documentation.

Flight SQL TLS Connection Fixes

TLS Connection Support: Fixed TLS connection issues when using grpc+tls:// scheme with Flight SQL endpoints. Added support for custom CA certificate files via the new flightsql_tls_ca_certificate_file parameter.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No major cookbook updates.

The Spice Cookbook includes 86 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.11.0-rc.3, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:v1.11.0-rc.3 image:

docker pull spiceai/spiceai:v1.11.0-rc.3

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai --version 1.11.0-rc.3

AWS Marketplace:

Spice is available in the AWS Marketplace.

What's Changed

Changelog

  • Hash indexing for Arrow Acceleration by @lukekim in #8924
  • Improve validation and logging for hash indexes by @lukekim in #9047
  • Fix TLS connection for grpc+tls:// Flight SQL endpoints and add custom CA certificate support by @phillipleblanc in #9073

Spice v1.4.0 (June 18, 2025)

· 19 min read
William Croxson
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.4.0! ⚡

This release upgrades DataFusion to v47 and Arrow to v55 for faster queries, more efficient Parquet/CSV handling, and improved reliability. It introduces the AWS Glue Catalog and Data Connectors for native access to Glue-managed data on S3, and adds support for Databricks U2M OAuth for secure Databricks user authentication.

New cron-based dataset refreshes and worker schedules enable automated task management, while dataset and search results caching improvements further optimize query, search, and RAG performance.

What's New in v1.4.0

DataFusion v47 Highlights

Spice.ai is built on the DataFusion query engine. The v47 release brings:

Performance Improvements 🚀: This release delivers major query speedups through specialized GroupsAccumulator implementations for first_value, last_value, and min/max on Duration types, eliminating unnecessary sorting and computation. TopK operations are now up to 10x faster thanks to early exit optimizations, while sort performance is further enhanced by reusing row converters, removing redundant clones, and optimizing sort-preserving merge streams. Logical operations benefit from short-circuit evaluation for AND/OR, reducing overhead, and additional enhancements address high latency from sequential metadata fetching, improve int/string comparison efficiency, and simplify logical expressions for better execution.

Bug Fixes & Compatibility Improvements 🛠️: The release addresses issues with external sort, aggregation, and window functions, improves handling of NULL values and type casting in arrays and binary operations, and corrects problems with complex joins and nested window expressions. It also addresses SQL unparsing for subqueries, aliases, and UNION BY NAME.

See the Apache DataFusion 47.0.0 Changelog for details.

Arrow v55 Highlights

Arrow v55 delivers faster Parquet gzip compression, improved array concatenation, and better support for large files (4GB+) and modular encryption. Parquet metadata reads are now more efficient, with support for range requests and enhanced compatibility for INT96 timestamps and timezones. CSV parsing is more robust, with clearer error messages. These updates boost performance, compatibility, and reliability.

See the Arrow 55.0.0 Changelog and Arrow 55.1.0 Changelog for details.

Runtime Highlights

Search Result Caching: Spice now supports runtime caching for search results, improving performance for subsequent searches and chat completion requests that use the document_similarity LLM tool. Caching is configurable with options like maximum size, item TTL, eviction policy, and hashing algorithm.

Example spicepod.yml configuration:

runtime:
  caching:
    search_results:
      enabled: true
      max_size: 128mb
      item_ttl: 5s
      eviction_policy: lru
      hashing_algorithm: siphash

For more information, refer to the Caching documentation.
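
To make the interaction between item_ttl and the lru eviction policy concrete, here is a small Python sketch of a cache that combines both. It is a conceptual model, not the Spice cache: the TtlLruCache class is hypothetical, and it bounds the cache by entry count (max_items), whereas the configuration above bounds it by size in bytes.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """Bounded cache that evicts least-recently-used entries and expires old ones."""

    def __init__(self, max_items, item_ttl, clock=time.monotonic):
        self.max_items = max_items
        self.item_ttl = item_ttl  # seconds an entry stays valid after being stored
        self.clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        """Store a value and evict least-recently-used entries over capacity."""
        self._entries[key] = (self.clock() + self.item_ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_items:
            self._entries.popitem(last=False)  # evict least recently used
```

A short TTL such as 5s bounds how stale a cached search result can be, while LRU eviction decides which entries to drop when the cache is full.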

AWS Glue Catalog Connector Alpha: Connect to AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV tables in S3.

Example spicepod.yml configuration:

catalogs:
  - from: glue
    name: my_glue_catalog
    params:
      glue_key: <your-access-key-id>
      glue_secret: <your-secret-access-key>
      glue_region: <your-region>
    include:
      - 'testdb.hive_*'
      - 'testdb.iceberg_*'

sql> show tables;
+-----------------+--------------+-------------------+------------+
| table_catalog   | table_schema | table_name        | table_type |
+-----------------+--------------+-------------------+------------+
| my_glue_catalog | testdb       | hive_table_001    | BASE TABLE |
| my_glue_catalog | testdb       | iceberg_table_001 | BASE TABLE |
| spice           | runtime      | task_history      | BASE TABLE |
+-----------------+--------------+-------------------+------------+

For more information, refer to the Glue Catalog Connector documentation.

AWS Glue Data Connector Alpha: Connect to specific tables in AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV in S3.

Example spicepod.yml configuration:

datasets:
  - from: glue:my_database.my_table
    name: my_table
    params:
      glue_auth: key
      glue_region: us-east-1
      glue_key: ${secrets:AWS_ACCESS_KEY_ID}
      glue_secret: ${secrets:AWS_SECRET_ACCESS_KEY}

For more information, refer to the Glue Data Connector documentation.

Databricks U2M OAuth: Spice now supports User-to-Machine (U2M) authentication for Databricks when called with a compatible client, such as the Spice Cloud Platform.

datasets:
  - from: databricks:spiceai_sandbox.default.messages
    name: messages
    params:
      databricks_endpoint: ${secrets:DATABRICKS_ENDPOINT}
      databricks_cluster_id: ${secrets:DATABRICKS_CLUSTER_ID}
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}

Dataset Refresh Schedules: Accelerated datasets now support a refresh_cron parameter, automatically refreshing the dataset on a defined cron schedule. Cron scheduled refreshes respect the global dataset_refresh_parallelism parameter.

Example spicepod.yml configuration:

datasets:
  - name: my_dataset
    from: s3://my-bucket/my_file.parquet
    acceleration:
      refresh_cron: 0 0 * * * # Daily refresh at midnight

For more information, refer to the Dataset Refresh Schedules documentation.
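
To show how a five-field cron expression such as 0 0 * * * maps to concrete run times, the Python sketch below computes the next matching minute. It is a deliberately simplified model, not the scheduler Spice uses: next_run is a hypothetical helper that accepts only * or a single integer per field (no ranges, lists, or steps), requires every field to match, and works on naive datetimes without time zone handling.

```python
from datetime import datetime, timedelta


def _matches(field, value):
    """A field matches if it is '*' or equals the value (no ranges or steps)."""
    return field == "*" or int(field) == value


def next_run(expression, after):
    """Return the first minute strictly after `after` that satisfies the expression."""
    minute, hour, day, month, weekday = expression.split()
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    # Scan at most 366 days of minutes; enough for the simple expressions handled here.
    for _ in range(366 * 24 * 60):
        cron_weekday = (candidate.weekday() + 1) % 7  # cron counts Sunday as 0
        if (
            _matches(minute, candidate.minute)
            and _matches(hour, candidate.hour)
            and _matches(day, candidate.day)
            and _matches(month, candidate.month)
            and _matches(weekday, cron_weekday)
        ):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError(f"no run time found for {expression!r}")
```

Under these assumptions, next_run("0 0 * * *", datetime(2025, 6, 18, 13, 45)) yields midnight at the start of June 19, 2025, which matches the "daily refresh at midnight" reading of the expression.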

Worker Execution Schedules: Workers now support a cron parameter and will execute an LLM-prompt or SQL query automatically on the defined cron schedule, in conjunction with a provided params.prompt.

Example spicepod.yml configuration:

workers:
  - name: email_reporter
    models:
      - from: gpt-4o
    params:
      prompt: 'Inspect the latest emails, and generate a summary report for them. Post the summary report to the connected Teams channel'
    cron: 0 2 * * * # Daily at 2am

For more information, refer to the Worker Execution Schedules documentation.

SQL Worker Actions: Spice now supports workers with sql actions for automated SQL query execution on a cron schedule:

workers:
  - name: my_worker
    cron: 0 * * * *
    sql: 'SELECT * FROM lineitem'

For more information, refer to the Workers with a SQL action documentation.

Contributors

Breaking Changes

  • No breaking changes.

Cookbook Updates

The Spice Cookbook now includes 70 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.4.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.4.0 image:

docker pull spiceai/spiceai:1.4.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

  • Update trunk to 1.4.0-unstable (#5878) by @phillipleblanc in #5878
  • Update openapi.json (#5885) by @app/github-actions in #5885
  • feat: Testoperator reports benchmark failure summary (#5889) by @peasee in #5889
  • fix: Publish binaries to dev when platform option is all (#5905) by @peasee in #5905
  • feat: Print dispatch current test count of total (#5906) by @peasee in #5906
  • Include multiple duckdb files acceleration scenarios into testoperator dispatch (#5913) by @sgrebnov in #5913
  • feat: Support building testoperator on dev (#5915) by @peasee in #5915
  • Update spicepod.schema.json (#5927) by @app/github-actions in #5927
  • Update ROADMAP & SECURITY for 1.3.0 (#5926) by @phillipleblanc in #5926
  • docs: Update qa_analytics.csv (#5928) by @peasee in #5928
  • fix: Properly publish binaries to dev on push (#5931) by @peasee in #5931
  • Load request context extensions on every flight incoming call (#5916) by @ewgenius in #5916
  • Fix deferred loading for datasets with embeddings (#5932) by @ewgenius in #5932
  • Schedule AI benchmarks to run every Mon and Thu evening PST (#5940) by @sgrebnov in #5940
  • Fix explain plan snapshots for TPCDS queries Q36, Q70 & Q86 not being deterministic after DF 46 upgrade (#5942) by @phillipleblanc in #5942
  • chore: Upgrade to Rust 1.86 (#5945) by @peasee in #5945
  • Standardise HTTP settings across CLI (#5769) by @Jeadie in #5769
  • Fix deferred flag for Databricks SQL warehouse mode (#5958) by @ewgenius in #5958
  • Add deferred catalog loading (#5950) by @ewgenius in #5950
  • Refactor deferred_load using ComponentInitialization enum for better clarity (#5961) by @ewgenius in #5961
  • Post-release housekeeping (#5964) by @phillipleblanc in #5964
  • add LTO for release builds (#5709) by @kczimm in #5709
  • Fix dependabot/192 (#5976) by @Jeadie in #5976
  • Fix Test-to-SQL benchmark scheduled run (#5977) by @sgrebnov in #5977
  • Fix JSON to ScalarValue type conversion to match DataFusion behavior (#5979) by @sgrebnov in #5979
  • Add v1.3.1 release notes (#5978) by @lukekim in #5978
  • Regenerate nightly build workflow (#5995) by @ewgenius in #5995
  • Fix DataFusion dependency loading in Databricks request context extension (#5987) by @ewgenius in #5987
  • Update spicepod.schema.json (#6000) by @app/github-actions in #6000
  • feat: Run MySQL SF100 on dev runners (#5986) by @peasee in #5986
  • fix: Remove caching RwLock (#6001) by @peasee in #6001
  • 1.3.1 Post-release housekeeping (#6002) by @phillipleblanc in #6002
  • feat: Add initial scheduler crate (#5923) by @peasee in #5923
  • fix flight request context scope (#6004) by @ewgenius in #6004
  • fix: Ensure snapshots on different scale factors are retained (#6009) by @peasee in #6009
  • fix: Allow dev runners in dispatch files (#6011) by @peasee in #6011
  • refactor: Deprecate results_cache for caching.sql_results (#6008) by @peasee in #6008
  • Fix models benchmark results reporting (#6013) by @sgrebnov in #6013
  • fix: Run PR checks for tools/ changes (#6014) by @peasee in #6014
  • feat: Add a CronRequestChannel for scheduler (#6005) by @peasee in #6005
  • feat: Add refresh_cron acceleration parameter, start scheduler on table load (#6016) by @peasee in #6016
  • Update license check to allow dual license crates (#6021) by @sgrebnov in #6021
  • Initial worker concept (#5973) by @Jeadie in #5973
  • Don't fail if cargo-deny already installed (license check) (#6023) by @sgrebnov in #6023
  • Upgrade to DataFusion 47 and Arrow 55 (#5966) by @sgrebnov in #5966
  • Read Iceberg tables from Glue Catalog Connector (#5965) by @kczimm in #5965
  • Handle multiple highlights in v1/search UX (#5963) by @Jeadie in #5963
  • feat: Add cron scheduler configurations for workers (#6033) by @peasee in #6033
  • feat: Add search cache configuration and results wrapper (#6020) by @peasee in #6020
  • Fix GitHub Actions Ubuntu for more workflows (#6040) by @phillipleblanc in #6040
  • Fix Actions for testoperator dispatch manual (#6042) by @phillipleblanc in #6042
  • refactor: Remove worker type (#6039) by @peasee in #6039
  • feat: Support cron dataset refreshes (#6037) by @peasee in #6037
  • Upgrade datafusion-federation to 0.4.2 (#6022) by @phillipleblanc in #6022
  • Define SearchPipeline and use in runtime/vector_search.rs. (#6044) by @Jeadie in #6044
  • fix: Scheduler test when scheduler is running (#6051) by @peasee in #6051
  • doc: Spice Cloud Connector Limitation (#6035) by @Sevenannn in #6035
  • Add support for on_conflict:upsert for Arrow MemTable (#6059) by @sgrebnov in #6059
  • Enhance Arrow Flight DoPut operation tracing (#6053) by @sgrebnov in #6053
  • Update openapi.json (#6032) by @app/github-actions in #6032
  • Add tools enabled to MCP server capabilities (#6060) by @Jeadie in #6060
  • Upgrade to delta_kernel 0.11 (#6045) by @phillipleblanc in #6045
  • refactor: Replace refresh oneshot with notify (#6050) by @peasee in #6050
  • Enable Upsert OnConflictBehavior for runtime.task_history table (#6068) by @sgrebnov in #6068
  • feat: Add a workers integration test (#6069) by @peasee in #6069
  • Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071
  • Update Models Benchmarks to report unsuccessful evals as errors (#6070) by @sgrebnov in #6070
  • Revert: fix: Use HTTPS ubuntu sources (#6082) by @Sevenannn in #6082
  • Add initial support for Spice Cloud Platform management (#6089) by @sgrebnov in #6089
  • Run spiceai cloud connector TPC tests using spice dev apps (#6049) by @Sevenannn in #6049
  • feat: Add SQL worker action (#6093) by @peasee in #6093
  • Post-release housekeeping (#6097) by @phillipleblanc in #6097
  • Fix search bench (#6091) by @Jeadie in #6091
  • fix: Update benchmark snapshots (#6094) by @app/github-actions in #6094
  • fix: Update benchmark snapshots (#6095) by @app/github-actions in #6095
  • Glue catalog connector for hive style parquet (#6054) by @kczimm in #6054
  • Update openapi.json (#6100) by @app/github-actions in #6100
  • Improve Flight Client DoPut / Publish error handling (#6105) by @sgrebnov in #6105
  • Define PostApplyCandidateGeneration to handle all filters & projections. (#6096) by @Jeadie in #6096
  • refactor: Update the tracing task names for scheduled tasks (#6101) by @peasee in #6101
  • task: Switch GH runners in PR and testoperator (#6052) by @peasee in #6052
  • feat: Connect search caching for HTTP and tools (#6108) by @peasee in #6108
  • test: Add multi-dataset cron test (#6102) by @peasee in #6102
  • Sanitize the ListingTableURL (#6110) by @phillipleblanc in #6110
  • Avoid partial writes by FlightTableWriter (#6104) by @sgrebnov in #6104
  • fix: Update the TPCDS postgres acceleration indexes (#6111) by @peasee in #6111
  • Make Glue Catalog refreshable (#6103) by @kczimm in #6103
  • Refactor Glue catalog to use a new Glue data connector (#6125) by @kczimm in #6125
  • Emit retry error on flight transient connection failure (#6123) by @Sevenannn in #6123
  • Update Flight DoPut implementation to send single final PutResult (#6124) by @sgrebnov in #6124
  • feat: Add metrics for search results cache (#6129) by @peasee in #6129
  • update MCP crate (#6130) by @Jeadie in #6130
  • feat: Add search cache status header, respect cache control (#6131) by @peasee in #6131
  • fix: Allow specifying individual caching blocks (#6133) by @peasee in #6133
  • Update openapi.json (#6132) by @app/github-actions in #6132
  • Add CSV support to Glue data connector (#6138) by @kczimm in #6138
  • Update Spice Cloud Platform management UX (#6140) by @sgrebnov in #6140
  • Add TPCH bench for Glue catalog (#6055) by @kczimm in #6055
  • Enforce max_tokens_per_request limit in OpenAI embedding logic (#6144) by @sgrebnov in #6144
  • Enable Spice Cloud Control Plane connect (management) for FinanceBench (#6147) by @sgrebnov in #6147
  • Add integration test for Spice Cloud Platform management (#6150) by @sgrebnov in #6150
  • fix: Invalidate search cache on refresh (#6137) by @peasee in #6137
  • fix: Prevent registering cron schedule with change stream accelerations (#6152) by @peasee in #6152
  • test: Add an append cron integration test (#6151) by @peasee in #6151
  • fix: Cache search results with no-cache directive (#6155) by @peasee in #6155
  • fix: Glue catalog dispatch runner type (#6157) by @peasee in #6157
  • Fix: Glue S3 location for directories and Iceberg credentials (#6174) by @kczimm in #6174
  • Support multiple columns in FTS (#6156) by @Jeadie in #6156
  • fix: Add --cache-control flag for search CLI (#6158) by @peasee in #6158
  • Add Glue data connector tpch bench test for parquet and csv (#6170) by @kczimm in #6170
  • fix: Apply results cache deprecation correctly (#6177) by @peasee in #6177
  • Fix regression in Parquet pushdown (#6178) by @phillipleblanc in #6178
  • Fix CUDA build (use candle-core 0.8.4 and cudarc v0.12) (#6181) by @sgrebnov in #6181
  • return empty stream if no external_links present (#6192) by @kczimm in #6192
  • Use arrow pretty print util instead of init dataframe / logical plan in display_records (#6191) by @Sevenannn in #6191
  • task: Enable additional TPCDS test scenarios in dispatcher (#6160) by @peasee in #6160
  • chore: Update dependencies (#6196) by @peasee in #6196
  • Fix FlightSQL GetDbSchemas and GetTables schemas to fully match the protocol (#6197) by @sgrebnov in #6197
  • Use spice-rs in test operator and retry on connection reset error (#6136) by @Sevenannn in #6136
  • Fix load status metric description (#6219) by @phillipleblanc in #6219
  • Run extended tests on PRs against release branch, update glue_iceberg_integration_test_catalog test (#6204) by @Sevenannn in #6204
  • query schema for is_nullable (#6229) by @kczimm in #6229
  • fix: use the query error message when queries fail (#6228) by @kczimm in #6228
  • fix glue iceberg catalog integration test (#6249) by @Sevenannn in #6249
  • cache table providers in glue catalog (#6252) by @kczimm in #6252
  • fix: databricks sql_warehouse schema contains duplicate fields (#6255) by @phillipleblanc in #6255

Full Changelog: v1.3.2...v1.4.0

Spice v1.2.0 (Apr 28, 2025)

· 16 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.2.0! 🚀

Spice v1.2.0 is a significant update. It upgrades DataFusion to v45 and Arrow to v54. This release brings faster query performance, support for parameterized queries in SQL and HTTP APIs, and the ability to accelerate views. Several bugs have been fixed and dependencies updated for better stability and speed.

DataFusion v45 Highlights

Spice.ai is built on the DataFusion query engine. The v45 release brings:

  • Faster Performance 🚀: DataFusion is now the fastest single-node engine for Apache Parquet files in the clickbench benchmark. Performance improved by over 33% from v33 to v45. Arrow StringView is now on by default, making string and binary data queries much faster, especially with Parquet files.

  • Better Quality 📋: DataFusion now runs over 5 million SQL tests per push using the SQLite sqllogictest suite. There are new checks for logical plan correctness and more thorough pre-release testing.

  • New SQL Functions ✨: Added show functions, to_local_time, regexp_count, map_extract, array_distance, array_any_value, greatest, least, and arrays_overlap.

See the DataFusion 45.0.0 release notes for details.

Spice.ai upgrades to the latest minus one DataFusion release to ensure adequate testing and stability. The next upgrade to DataFusion v46 is planned for Spice v1.3.0 in May.

What's New in v1.2.0

  • Parameterized Queries: Parameterized queries are now supported with the Flight SQL API and HTTP API. Positional and named arguments via $1 and :param syntax are supported, respectively. Logical plans for SQL statements are cached for faster repeated queries.

    Example Cookbook recipes:

    See the API Documentation for additional details.

  • Accelerated Views: Views, not just datasets, can now be accelerated. This provides much better performance for views that perform heavy computation.

    Example spicepod.yaml:

    views:
      - name: accelerated_view
        acceleration:
          enabled: true
          engine: duckdb
          primary_key: id
          refresh_check_interval: 1h
        sql: |
          select * from dataset_a
          union all
          select * from dataset_b

    See the Data Acceleration documentation.

  • Memory Usage Metrics & Configuration: The runtime now tracks memory usage as a metric, and a new runtime memory_limit parameter is available. The memory limit applies specifically to the runtime and should be used in addition to existing memory usage configuration, such as duckdb_memory_limit. Queries that exceed the memory limit spill to disk.

    See the Memory Reference for details.

  • New Worker Component: Workers are new configurable compute units in the Spice runtime. They help manage compute across models and tools, handle errors, and balance load. Workers are configured in the workers section of spicepod.yaml.

    Example spicepod.yaml:

    workers:
      - name: round-robin
        description: |
          Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
        models:
          - from: foo
          - from: bar
      - name: fallback
        description: |
          Tries 'bar' first, then 'foo', then 'baz' if earlier models fail.
        models:
          - from: foo
            order: 2
          - from: bar
            order: 1
          - from: baz
            order: 3

    See the Workers Documentation for details.
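The two routing behaviors described in the example can be sketched in a few lines of Python. This illustrates the semantics only and is not Spice's implementation. The function names, the `(name, order)` tuples, and the `call` callback are hypothetical.

```python
from itertools import cycle


def round_robin(models):
    """Return an iterator that hands out model names in rotation, one per request."""
    return cycle(models)


def fallback(models_with_order, call):
    """Try models by ascending `order` and return the first successful result.

    `models_with_order` is a list of (name, order) pairs; `call` is a
    hypothetical function that sends the request to the named model.
    """
    errors = []
    for name, _order in sorted(models_with_order, key=lambda m: m[1]):
        try:
            return name, call(name)
        except Exception as exc:  # any failure moves on to the next model
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all models failed: {errors}")


def flaky(name):
    """Pretend 'bar' is down so the fallback path is exercised."""
    if name == "bar":
        raise ConnectionError("bar is unavailable")
    return f"reply from {name}"


rr = round_robin(["foo", "bar"])
print([next(rr) for _ in range(4)])
print(fallback([("foo", 2), ("bar", 1), ("baz", 3)], flaky))
```

With the `order` values from the spicepod above, the fallback worker tries `bar`, then `foo`, then `baz`, so a failure of `bar` is answered by `foo`.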

  • Databricks Model Provider: Databricks models can now be used with from: databricks:model_name.

    Example spicepod.yaml:

    models:
      - from: databricks:llama-3_2_1_1b_instruct
        name: llama-instruct
        params:
          databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
          databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }

    See the Databricks model documentation.

  • spice chat CLI Improvements: The spice chat command now supports an optional --temperature parameter. A one-shot chat can also be sent with spice chat <message>.

  • More Type Support: Added support for Postgres JSON type and DuckDB Dictionary type.

  • Other Improvements:

    • New image tags let you pick memory allocators for different use-cases: jemalloc, sysalloc, and mimalloc.
    • Better error handling and logging for chat and model operations.

Contributors

Cookbook Updates

New recipes for:

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.0 image:

docker pull spiceai/spiceai:1.2.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Spice is now built with Rust 1.85.0 and the Rust 2024 edition.

Changelog

- Update end_game.md (#5312) by @peasee in https://github.com/spiceai/spiceai/pull/5312
- feat: Add initial testoperator query validation (#5311) by @peasee in https://github.com/spiceai/spiceai/pull/5311
- Update Helm + Prepare for next release (#5317) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5317
- Update spicepod.schema.json (#5319) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5319
- add integration test for reading encrypted PDFs from S3 (#5308) by @kczimm in https://github.com/spiceai/spiceai/pull/5308
- Stop `load_components` during runtime shutdown (#5306) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5306
- Update openapi.json (#5321) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5321
- feat: Implement record batch data validation (#5331) by @peasee in https://github.com/spiceai/spiceai/pull/5331
- Update QA analytics for v1.1.1 (#5320) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5320
- fix: Update benchmark snapshots (#5337) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5337
- Enforce pulls with Spice v1.0.4 (#5339) by @lukekim in https://github.com/spiceai/spiceai/pull/5339
- Upgrade to DataFusion 45, Arrow 54, Rust 1.85 & Edition 2024 (#5334) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5334
- feat: Allow validating testoperator in benchmark workflow (#5342) by @peasee in https://github.com/spiceai/spiceai/pull/5342
- Upgrade `delta_kernel` to 0.9 (#5343) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5343
- deps: Update odbc-api (#5344) by @peasee in https://github.com/spiceai/spiceai/pull/5344
- Fix schema inference for Snowflake tables with large number of columns (#5348) by @ewgenius in https://github.com/spiceai/spiceai/pull/5348
- feat: Update testoperator dispatch for validation, version metric (#5349) by @peasee in https://github.com/spiceai/spiceai/pull/5349
- fix: validate_results not validate (#5352) by @peasee in https://github.com/spiceai/spiceai/pull/5352
- revert to previous pdf-extract; remove test for encrypted pdf support (#5355) by @kczimm in https://github.com/spiceai/spiceai/pull/5355
- Stablize the test `verify_similarity_search_chat_completion` (#5284) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5284
- Turn off `delta_kernel::log_segment` logging and refactor log filtering (#5367) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5367
- Upgrade to DuckDB 1.2.2 (#5375) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5375
- Update Readme - fix broken and outdated links (#5376) by @ewgenius in https://github.com/spiceai/spiceai/pull/5376
- Upgrade dependabot dependencies (#5385) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5385
- fix: Remove IMAP oauth (#5386) by @peasee in https://github.com/spiceai/spiceai/pull/5386
- Bump Helm chart to 1.1.2 (#5389) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5389
- Refactor accelerator registry as part of runtime. (#5318) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5318
- Include `vnd.spiceai.sql/nsql.v1+json` response examples (openapi docs) (#5388) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5388
- docs: Update endgame template with SpiceQA, update qa analytics (#5391) by @peasee in https://github.com/spiceai/spiceai/pull/5391
- Make graceful shutdown timeout configurable (#5358) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5358
- docs: Update release criteria with note on max columns (#5401) by @peasee in https://github.com/spiceai/spiceai/pull/5401
- Update openapi.json (#5392) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5392
- FinanceBench: update scorer instructions and switch scoring model to `gpt-4.1` (#5395) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5395
- feat: Write OTel metrics for testoperator (#5397) by @peasee in https://github.com/spiceai/spiceai/pull/5397
- Update nsql openapi title (#5403) by @ewgenius in https://github.com/spiceai/spiceai/pull/5403
- Track `ai_inferences_count` with used tools flag. Extensible runtime request context. (#5393) by @ewgenius in https://github.com/spiceai/spiceai/pull/5393
- Include newly detected view as changed view (#5408) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5408
- Track used_tools in ai_inferences_with_spice_count as number (#5409) by @ewgenius in https://github.com/spiceai/spiceai/pull/5409
- Update openapi.json (#5406) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5406
- Tweak enforce pulls with Spice (#5411) by @lukekim in https://github.com/spiceai/spiceai/pull/5411
- Allow `flightsql` and `spiceai` connectors to override flight max message size (#5407) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5407
- Retry model graded scorer once on successful, empty response (#5405) by @Jeadie in https://github.com/spiceai/spiceai/pull/5405
- use span task name in 'spice trace' tree, not span_id (#5412) by @Jeadie in https://github.com/spiceai/spiceai/pull/5412
- Rename to `track_ai_inferences_with_spice_count` in all places (#5410) by @ewgenius in https://github.com/spiceai/spiceai/pull/5410
- Update qa_analytics.csv (#5421) by @peasee in https://github.com/spiceai/spiceai/pull/5421
- Remove the filter for the `list_datasets` tool in the AI inferences metric count. (#5417) by @ewgenius in https://github.com/spiceai/spiceai/pull/5417
- fix: Testoperator uses an exact API key for benchmark metric submission (#5413) by @peasee in https://github.com/spiceai/spiceai/pull/5413
- feat: Enable testoperator metrics in workflow (#5422) by @peasee in https://github.com/spiceai/spiceai/pull/5422
- Upgrade mistral.rs (#5404) by @Jeadie in https://github.com/spiceai/spiceai/pull/5404
- Include all FinanceBench documents in benchmark tests (#5426) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5426
- Handle second Ctrl-C to force runtime termination (#5427) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5427
- Add optional `--temperature` parameter for `spice chat` CLI command (#5429) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5429
- Remove `with_runtime_status` from the `RuntimeBuilder` (#5430) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5430
- Fix spice chat error handling (#5433) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5433
- Add more test models to FinanceBench benchmark (#5431) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5431
- support 'from: databricks:model_name' (#5434) by @Jeadie in https://github.com/spiceai/spiceai/pull/5434
- Upgrade Pulls with Spice to v1.0.6 and add concurrency control (#5442) by @lukekim in https://github.com/spiceai/spiceai/pull/5442
- Upgrade DataFusion table providers (#5443) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5443
- Test spice chat in e2e_test_spice_cli (#5447) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5447
- Allow for one-shot chat request using `spice chat <message>` (#5444) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5444
- Enable parallel data sampling for NSQL (#5449) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5449
- Upgrade Go from v1.23.4 to v1.24.2 (#5462) by @lukekim in https://github.com/spiceai/spiceai/pull/5462
- Update PULL_REQUEST_TEMPLATE.md (#5465) by @lukekim in https://github.com/spiceai/spiceai/pull/5465
- Enable captured outputs by default when spiced is started by the CLI (spice run) (#5464) by @lukekim in https://github.com/spiceai/spiceai/pull/5464
- Parameterized queries via Flight SQL API (#5420) by @kczimm in https://github.com/spiceai/spiceai/pull/5420
- fix: Update benchmarks readme badge (#5466) by @peasee in https://github.com/spiceai/spiceai/pull/5466
- delay auth check for binding parameterized queries (#5475) by @kczimm in https://github.com/spiceai/spiceai/pull/5475
- Add support for `?` placeholder syntax in parameterized queries (#5463) by @kczimm in https://github.com/spiceai/spiceai/pull/5463
- enable task name override for non static span names (#5423) by @Jeadie in https://github.com/spiceai/spiceai/pull/5423
- Allow parameter queries with no parameters (#5481) by @kczimm in https://github.com/spiceai/spiceai/pull/5481
- Support unparsing UNION for distinct results (#5483) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5483
- add rust-toolchain.toml (#5485) by @kczimm in https://github.com/spiceai/spiceai/pull/5485
- Add parameterized query support to the HTTP API (#5484) by @kczimm in https://github.com/spiceai/spiceai/pull/5484
- E2E test for spice chat <message> behavior (#5451) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5451
- Renable and fix huggingface models integration tests (#5478) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5478
- Update openapi.json (#5488) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5488
- feat: Record memory usage as a metric (#5489) by @peasee in https://github.com/spiceai/spiceai/pull/5489
- fix: update dispatcher to run all benchmarks, rename metric, update spicepods, add scale factor (#5500) by @peasee in https://github.com/spiceai/spiceai/pull/5500
- Fix ILIKE filters support (#5502) by @ewgenius in https://github.com/spiceai/spiceai/pull/5502
- fix: Update test spicepod locations and names (#5505) by @peasee in https://github.com/spiceai/spiceai/pull/5505
- fix: Update benchmark snapshots (#5508) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5508
- fix: Update benchmark snapshots (#5512) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5512
- Fix Delta Lake bug for: Found unmasked nulls for non-nullable StructArray field "predicate" (#5515) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5515
- fix: working directory for duckdb e2e test spicepods (#5510) by @peasee in https://github.com/spiceai/spiceai/pull/5510
- Tweaks to README.md (#5516) by @lukekim in https://github.com/spiceai/spiceai/pull/5516
- Cache logical plans of SQL statements (#5487) by @kczimm in https://github.com/spiceai/spiceai/pull/5487
- Fix `content-type: application/json` (#5517) by @Jeadie in https://github.com/spiceai/spiceai/pull/5517
- Validate postgres results in testoperator dispatch (#5504) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5504
- fix: Update benchmark snapshots (#5511) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5511
- Fix results cache by SQL with prepared statements (#5518) by @kczimm in https://github.com/spiceai/spiceai/pull/5518
- Add initial support for views acceleration (#5509) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5509
- fix: Update benchmark snapshots (#5527) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5527
- Support switching the memory allocator Spice uses via `alloc-*` features. (#5528) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5528
- fix: Update benchmark snapshots (#5525) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5525
- Add test spicepod for tpch mysql-duckdb[file acceleration] (#5521) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5521
- Fix nightly arm build - change tag `-default` to `-models` (#5529) by @ewgenius in https://github.com/spiceai/spiceai/pull/5529
- LLM router via `worker` spicepod component (#5513) by @Jeadie in https://github.com/spiceai/spiceai/pull/5513
- Apply Spice advanced acceleration logic and params support to accelerated views (#5526) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5526
- Enable DatasetCheckpoint logic for accelerated views (#5533) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5533
- Fix public '.model' name for router workers (#5535) by @Jeadie in https://github.com/spiceai/spiceai/pull/5535
- feat: Add Runtime memory limit parameter (#5536) by @peasee in https://github.com/spiceai/spiceai/pull/5536
- For fallback worker, check first item in `chat/completion` stream. (#5537) by @Jeadie in https://github.com/spiceai/spiceai/pull/5537
- Move rate limit check to after parameterized query binding (#5540) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5540
- Update spicepod.schema.json (#5545) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5545
- Accelerate views: refresh_on_startup, ready_state, jitter params support (#5547) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5547
- Add integration test for accelerated views (#5550) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5550
- Don't install make or expect on spiceai-macos runners (#5554) by @lukekim in https://github.com/spiceai/spiceai/pull/5554
- `event_stream` crate for emitting events from tracing::Span; used in v1/chat/completions streaming. (#5474) by @Jeadie in https://github.com/spiceai/spiceai/pull/5474
- Fix typo in method (#5559) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5559
- Run test operator every day and current and previous commits (#5557) by @lukekim in https://github.com/spiceai/spiceai/pull/5557
- Add aws_allow_http parameter for delta lake connector (#5541) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5541
- feat: Add branch name to metric dimensions in testoperator (#5563) by @peasee in https://github.com/spiceai/spiceai/pull/5563
- fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/odbc[databricks].yaml (#5565) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5565
- fix: Split scheduled dispatch into a separate job (#5567) by @peasee in https://github.com/spiceai/spiceai/pull/5567
- fix: Use outputs.SPICED_COMMIT (#5568) by @peasee in https://github.com/spiceai/spiceai/pull/5568
- fix: Use refs in testoperator dispatch instead of commits (#5569) by @peasee in https://github.com/spiceai/spiceai/pull/5569
- fix: actions/checkout ref does not take a full ref (#5571) by @peasee in https://github.com/spiceai/spiceai/pull/5571
- fix: Testoperator dispatch (#5572) by @peasee in https://github.com/spiceai/spiceai/pull/5572
- Respect `update-snapshots` when running all benchmarks manually (#5577) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5577
- Use FETCH_HEAD instead of ${{ inputs.ref }} to list commits in setup_spiced (#5579) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5579
- Add additional test scenarios for benchmarks (#5582) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5582
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-duckdb[file].yaml (#5590) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5590
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/mysql-duckdb[file].yaml (#5591) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5591
- Fix Snowflake data connector rows ordering (#5599) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5599
- fix: Update benchmark snapshots (#5595) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5595
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-arrow.yaml (#5594) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5594
- fix: Update benchmark snapshots (#5589) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5589
- fix: Update benchmark snapshots (#5583) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5583
- Downgrade DuckDB to 1.1.3 (#5607) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5607
- Add prepared statement integration tests (#5544) by @kczimm in https://github.com/spiceai/spiceai/pull/5544

Full Changelog: v1.1.2...v1.2.0

Spice v1.1.0 (Mar 31, 2025)

· 20 min read
Luke Kim
Founder and CEO of Spice AI

Model-Context-Protocol (MCP) support in Spice.ai Open Source

Announcing the release of Spice v1.1.0! 🤖

Spice v1.1.0 introduces full support for the Model-Context-Protocol (MCP), expanding how models and tools connect. Spice can now act as both an MCP Server, with the new /v1/mcp/sse API, and an MCP Client, supporting stdio and SSE-based servers. This release also introduces a new Web Search tool with Perplexity model support, advanced evaluation workflows with custom eval scorers, including LLM-as-a-judge, and an IMAP Data Connector for federated SQL queries across email servers. Alongside these features, v1.1.0 includes automatic NSQL query retries, expanded task tracing, and request draining for HTTP server shutdowns, delivering improved reliability, flexibility, and observability.

Highlights in v1.1.0

  • Spice as an MCP Server and Client: Spice now supports the Model Context Protocol (MCP), for expanded tool discovery and connectivity. Spice can:

    1. Run stdio-based MCP servers internally.
    2. Connect to external MCP servers over the SSE protocol (Streamable HTTP support is coming soon).

    For more details, see the MCP documentation.

    Usage

    tools:
      - name: google_maps
        from: mcp:npx
        params:
          mcp_args: -y @modelcontextprotocol/server-google-maps

    Spice as an MCP Server

    Tools in Spice can be accessed via MCP, for example by connecting to Spice from an IDE such as Cursor or Windsurf. Set the MCP Server URL to http://localhost:8090/v1/mcp/sse.

  • Perplexity Model Support: Spice now supports Perplexity-hosted models, enabling advanced web search and retrieval capabilities. Example configuration:

    models:
      - name: webs
        from: perplexity:sonar
        params:
          perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
          perplexity_search_domain_filter:
            - docs.spiceai.org
            - huggingface.co

    For more details, see the Perplexity documentation.

  • Web Search Tool: The new Web Search Tool enables Spice models to search the web for information using search engines like Perplexity. Example configuration:

    tools:
      - name: the_internet
        from: websearch
        description: 'Search the web for information.'
        params:
          engine: perplexity
          perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }

    For more details, see the Web Search Tool documentation.

  • Eval Scorers: Eval scorers assess model performance on evaluation cases. Spice includes built-in scorers:

    • match: Exact match.
    • json_match: JSON equivalence.
    • includes: Checks if actual output includes expected output.
    • fuzzy_match: Normalized subset matching.
    • levenshtein: Levenshtein distance.

    Custom scorers can use embedding models or LLMs as judges. Example:

    evals:
      - name: australia
        dataset: cricket_questions
        scorers:
          - hf_minilm
          - judge
          - match
    embeddings:
      - name: hf_minilm
        from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
    models:
      - name: judge
        from: openai:gpt-4o
        params:
          openai_api_key: ${ secrets:OPENAI_API_KEY }
          system_prompt: |
            Compare these stories and score their similarity (0.0 to 1.0).
            Story A: {{ .actual }}
            Story B: {{ .ideal }}

    For more details, see the Eval Scorers documentation.
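As a rough illustration of what the built-in scorers measure, here is a minimal Python sketch. It is not Spice's implementation: the normalization rules in `fuzzy_match` and the conversion of Levenshtein distance into a 0.0 to 1.0 score are assumptions chosen for the example, and all function names are hypothetical.

```python
import json
import re


def match(actual, ideal):
    """Exact string equality."""
    return 1.0 if actual == ideal else 0.0


def json_match(actual, ideal):
    """Equality after JSON parsing, so key order and whitespace are ignored."""
    try:
        return 1.0 if json.loads(actual) == json.loads(ideal) else 0.0
    except json.JSONDecodeError:
        return 0.0


def includes(actual, ideal):
    """The expected output appears somewhere in the actual output."""
    return 1.0 if ideal in actual else 0.0


def _normalize(text):
    # Assumed normalization: lowercase, drop punctuation, collapse whitespace.
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def fuzzy_match(actual, ideal):
    """After normalization, one string is contained in the other."""
    a, i = _normalize(actual), _normalize(ideal)
    return 1.0 if a and i and (a in i or i in a) else 0.0


def levenshtein_distance(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def levenshtein(actual, ideal):
    """Assumed scoring: 1.0 for identical strings, falling toward 0.0 with distance."""
    longest = max(len(actual), len(ideal))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(actual, ideal) / longest


print(json_match('{"a": 1, "b": 2}', '{"b": 2, "a": 1}'))
print(levenshtein_distance("kitten", "sitting"))
```

The strict scorers (`match`, `json_match`) suit structured outputs, while the looser ones tolerate formatting differences in free-text answers.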

  • IMAP Data Connector: Query emails stored in IMAP servers using federated SQL. Example:

    datasets:
      - from: imap:[email protected]
        name: emails
        params:
          imap_access_token: ${secrets:IMAP_ACCESS_TOKEN}

    For more details, see the IMAP Data Connector documentation.

  • Automatic NSQL Query Retries: Failed NSQL queries are now automatically retried, improving reliability for federated queries. For more details, see the NSQL documentation.

  • Enhanced Task Tracing: Task history now includes chat completion IDs, and runtime readiness is traced for better observability. Use the runtime.task_history table to query task details. See the Task History documentation.

  • Vector Search with Keyword Filtering: The vector search API now includes an optional list of keywords as a parameter, to pre-filter SQL results before performing a vector search. When vector searching via a chat completion, models will automatically generate keywords relevant to the search. See the Vector Search API documentation.

  • Improved Refresh Behavior on Startup: Spice won't automatically refresh an accelerated dataset on startup if it doesn't need to. See the Refresh on Startup documentation.

  • Graceful Shutdown for HTTP Server: The HTTP server now drains requests for graceful shutdowns, ensuring smoother runtime termination.
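The keyword pre-filtering described under Vector Search with Keyword Filtering can be pictured as a two-stage pipeline: narrow the candidate rows by keyword, then rank what remains by vector similarity. The Python sketch below is a toy model of that idea, not Spice's implementation. The row layout, the two-dimensional embeddings, and the choice to keep rows matching any keyword (rather than all of them) are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(rows, query_vec, keywords=None, limit=3):
    """Optionally pre-filter rows by keyword, then rank by similarity to query_vec."""
    if keywords:
        lowered = [k.lower() for k in keywords]
        # Assumption: a row is kept if it contains ANY of the keywords.
        rows = [r for r in rows if any(k in r["text"].lower() for k in lowered)]
    ranked = sorted(rows, key=lambda r: cosine(r["embedding"], query_vec), reverse=True)
    return [r["text"] for r in ranked[:limit]]


# Hypothetical documents with made-up 2-dimensional embeddings.
docs = [
    {"text": "Refresh accelerated datasets on startup", "embedding": [0.9, 0.1]},
    {"text": "Graceful shutdown of the HTTP server", "embedding": [0.8, 0.2]},
    {"text": "Vector search with keywords", "embedding": [0.1, 0.9]},
]

print(search(docs, [1.0, 0.0]))
print(search(docs, [1.0, 0.0], keywords=["shutdown"]))
```

Filtering first means the similarity computation only runs over rows that already mention a relevant term, which reduces work and removes semantically close but off-topic matches.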

New Contributors 🎉

Contributors

  • @sgrebnov
  • @phillipleblanc
  • @peasee
  • @Jeadie
  • @lukekim
  • @benrussell
  • @Sevenannn
  • @sergey-shandar
  • @Garamda
  • @johnnynunez

Breaking Changes

No breaking changes.

Cookbook Updates

The Spice Cookbook now has 74 recipes that make it easy to get started with Spice!

Upgrading

To upgrade to v1.1.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.1.0 image:

docker pull spiceai/spiceai:1.1.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

  • No major dependency changes.

Changelog

- release: Bump chart, and versions for next release by @peasee in <https://github.com/spiceai/spiceai/pull/4464>
- feat: Schedule testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/4503>
- fix: Remove on zero results arguments from benchmarks by @peasee in <https://github.com/spiceai/spiceai/pull/4533>
- fix: Don't snapshot clickbench benchmarks by @peasee in <https://github.com/spiceai/spiceai/pull/4534>
- docs: v1.0.1 release note by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4529>
- Update acknowledgements by @github-actions in <https://github.com/spiceai/spiceai/pull/4535>
- In spiced_docker, propagate setup to publish-cuda by @Jeadie in <https://github.com/spiceai/spiceai/pull/4543>
- Upgrade Rust to 1.84 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4541>
- Upgrade dependencies by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4546>
- Revert "Use OpenAI golang client in `spice chat` (#4491)" by @Jeadie in <https://github.com/spiceai/spiceai/pull/4564>
- feat: add schema inference for the Spice.ai Data Connector by @peasee in <https://github.com/spiceai/spiceai/pull/4579>
- Remove 'tools: builtin' by @Jeadie in <https://github.com/spiceai/spiceai/pull/4607>
- feat: Add initial IMAP connector by @peasee in <https://github.com/spiceai/spiceai/pull/4587>
- feat: Add email content loading by @peasee in <https://github.com/spiceai/spiceai/pull/4616>
- feat: Add SSL and Auth parameters for IMAP by @peasee in <https://github.com/spiceai/spiceai/pull/4613>
- Change /v1/models to be OpenAI compatible by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4624>
- Use `pdf-extract` crate to extract text from PDF documents by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4615>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4628>
- Add 1.0.2 release notes by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4627>
- Fix cuda::ffi by @Jeadie in <https://github.com/spiceai/spiceai/pull/4649>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4654>
- fix: Spice.ai schema inference by @peasee in <https://github.com/spiceai/spiceai/pull/4674>
- Add SQL Benchmark with sample eval configuration based on TPCH by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4549>
- Update Helm chart to Spice v1.0.2 by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4655>
- Update v1.0.2 release notes by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4639>
- Fix E2E AI release install test on self-hosted runners (macos) by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4675>
- Main performance metrics calculation for Text to SQL Benchmark by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4681>
- Add eval datasets / test scripts for model grading criteria by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4663>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4684>
- Add testoperator for `evals` running by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4688>
- Add GH Workflow to run Text to SQL benchmark by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4689>
- Add 1.0.2 as supported version to SECURITY.md by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4695>
- Text-To-SQL benchmark: trace failed tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4705>
- Text-To-SQL benchmark: extend list of benchmarking models by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4707>
- Text-To-SQL: increase sql coverage, add more advanced tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4713>
- Use model that supports tools in hf_test by @Jeadie in <https://github.com/spiceai/spiceai/pull/4712>
- Fix Spice.ai E2E test by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4723>
- Return non-existing model for v1/chat endpoint by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4718>
- Update Helm chart for 1.0.3 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4742>
- Update dependencies by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4740>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4744>
- Update SECURITY.md with 1.0.3 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4745>
- Add basic smoke test of perplexity LLM to llm integration tests. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4735>
- Don't run integration tests on PRs when only CLI is changed by @Jeadie in <https://github.com/spiceai/spiceai/pull/4751>
- Prompt user to upgrade through brew / do another clean install when spice is installed through homebrew / at non-standard path by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4746>
- feat: Search with keyword filtering by @peasee in <https://github.com/spiceai/spiceai/pull/4759>
- Fix search benchmark by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4765>
- feat: Add IMAP access token parameter by @peasee in <https://github.com/spiceai/spiceai/pull/4769>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4774>
- Mark trunk builds as unstable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4776>
- feat: Release Spice.ai RC by @peasee in <https://github.com/spiceai/spiceai/pull/4753>
- fix: Validate columns and keywords in search by @peasee in <https://github.com/spiceai/spiceai/pull/4775>
- Run models E2E tests on PR by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4798>
- fix: models runtime not required for cloud chat by @peasee in <https://github.com/spiceai/spiceai/pull/4781>
- Only open one PR for openapi.json by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4807>
- docs: Release IMAP Alpha by @peasee in <https://github.com/spiceai/spiceai/pull/4797>
- Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4809>
- Initial spice cli e2e tests with spice upgrade tests by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4764>
- Log CLI and Runtime Versions on startup by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4816>
- Sort keys for openai by @Jeadie in <https://github.com/spiceai/spiceai/pull/4766>
- Remove docs index trigger from the endgame template by @ewgenius in <https://github.com/spiceai/spiceai/pull/4832>
- Release notes for v1.0.4 by @Jeadie in <https://github.com/spiceai/spiceai/pull/4827>
- Update SECURITY.md by @Jeadie in <https://github.com/spiceai/spiceai/pull/4829>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4831>
- Don't print URL by @lukekim in <https://github.com/spiceai/spiceai/pull/4838>
- add 'eval_run' to 'spice trace' by @Jeadie in <https://github.com/spiceai/spiceai/pull/4841>
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4843>
- Fix 'actual" and "output" columns in `eval.results`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4835>
- Fix string escaping of system prompt by @Jeadie in <https://github.com/spiceai/spiceai/pull/4844>
- update helm chart to v1.0.4 by @Jeadie in <https://github.com/spiceai/spiceai/pull/4828>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4806>
- fix: Skip sccache in PR for external users by @peasee in <https://github.com/spiceai/spiceai/pull/4851>
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in <https://github.com/spiceai/spiceai/pull/4804>
- Debug log cuda detection failure in spice by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4852>
- fix: Set RUSTC wrapper explicitly by @peasee in <https://github.com/spiceai/spiceai/pull/4854>
- Improve trace UX for `ai_completion`, fix infinite tool calls by @Jeadie in <https://github.com/spiceai/spiceai/pull/4853>
- Allow homebrew spice cli to upgrade the runtime by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4811>
- Add support for MCP tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/4808>
- fix: Rustc wrapper actions by @peasee in <https://github.com/spiceai/spiceai/pull/4867>
- Provide link to supported OS list when user platform is not supported by @Garamda in <https://github.com/spiceai/spiceai/pull/4840>
- Always download spice runtime version matched with spice cli version by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4761>
- Disable flaky integration test by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4871>
- fix: sccache actions setup by @peasee in <https://github.com/spiceai/spiceai/pull/4873>
- Fixing Go installation in the setup script for Linux Arm64 by @sergey-shandar in <https://github.com/spiceai/spiceai/pull/4868>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4864>
- DuckDB acceleration: Use temp table only for append with conflict resolution by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4874>
- Trace the output of streamed `chat/completions` to runtime.task_history. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4845>
- Always pass `X-API-Key` in spice api calls header if detected in env by @ewgenius in <https://github.com/spiceai/spiceai/pull/4878>
- Revert "DuckDB acceleration: Use temp table only for append with conflict resolution" by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4886>
- Allow overriding spicerack base url in the CLI by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4892>
- Add test Spicepod for DuckDB full acceleration with constraints by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4891>
- Refactor Parameter Handling by @Advayp in <https://github.com/spiceai/spiceai/pull/4833>
- Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4898>
- Update to latest async-openai fork. Update secrecy by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4911>
- Fix mcp tools build by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4916>
- Add more test spicepods by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4923>
- task: Add more dispatch files by @peasee in <https://github.com/spiceai/spiceai/pull/4933>
- run spiceai benchmark test using test operator by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4920>
- Convert sequential search code block to parallel async by @Garamda in <https://github.com/spiceai/spiceai/pull/4936>
- fix: Throughput metric calculation by @peasee in <https://github.com/spiceai/spiceai/pull/4938>
- Update dependabot dependencies & `cargo update` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4872>
- Improve servers shutdown sequence during runtime termination by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4942>
- Semantic model for views. Views visible in `table_schema` & `list_datasets` tools. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4946>
- update openai-async by @Jeadie in <https://github.com/spiceai/spiceai/pull/4948>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4961>
- fix: Redundant results snapshotting by @peasee in <https://github.com/spiceai/spiceai/pull/4956>
- Create schema for views if not exist by @Jeadie in <https://github.com/spiceai/spiceai/pull/4957>
- Bump Jimver/cuda-toolkit from 0.2.21 to 0.2.22 by @dependabot in <https://github.com/spiceai/spiceai/pull/4969>
- List available operations in `spice trace <operation>` by @Jeadie in <https://github.com/spiceai/spiceai/pull/4953>
- Initial commit of release analytics by @lukekim in <https://github.com/spiceai/spiceai/pull/4975>
- Remove spaces from CSV by @lukekim in <https://github.com/spiceai/spiceai/pull/4977>
- Fix Spice pods watcher by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4984>
- feat: Add appendable data sources for the testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/4949>
- Omit timestamp when warning regarding datasets with hyphens by @Advayp in <https://github.com/spiceai/spiceai/pull/4987>
- Update helm chart to v1.0.5 by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4990>
- docs: Update qa_analytics.csv by @peasee in <https://github.com/spiceai/spiceai/pull/4989>
- Update end_game template by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4991>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4993>
- Add v1.0.5 release notes by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4994>
- Supported Versions: include v1.0.5 by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4995>
- Dependabot updates by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4992>
- Switch to basic markdown formatting for vector search by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4934>
- docs: Update qa_analytics.csv by @peasee in <https://github.com/spiceai/spiceai/pull/5001>
- feat: Add TPCDS FileAppendableSource for testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/5002>
- Update `ring` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5003>
- docs: Update qa_analytics.csv by @peasee in <https://github.com/spiceai/spiceai/pull/5006>
- feat: Add ClickBench FileAppendableSource for testoperator by @peasee in <https://github.com/spiceai/spiceai/pull/5004>
- feat: Validate append test table counts by @peasee in <https://github.com/spiceai/spiceai/pull/5008>
- feat: Add append spicepods by @peasee in <https://github.com/spiceai/spiceai/pull/5009>
- Improve Vector Search performance for large content w/o primary key defined by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5010>
- Don't try to downgrade Arc in test_acceleration_duckdb_single_instance by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5014>
- feat: Add an initial testoperator vector search command by @peasee in <https://github.com/spiceai/spiceai/pull/5011>
- feat: Update testoperator workflows for automatic snapshot updates by @peasee in <https://github.com/spiceai/spiceai/pull/5018>
- Fix Vector Search when additional columns include embedding column by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5022>
- Include test for primary key passed as additional column in Vector Search by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5024>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5020>
- upgrade mistral.rs by @Jeadie in <https://github.com/spiceai/spiceai/pull/4952>
- fix: Indexes for TPCDS SQLite Spicepod by @peasee in <https://github.com/spiceai/spiceai/pull/5038>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5035>
- Include local files in generated Spicepod package by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5041>
- update mistral.rs to 'spiceai' branch rev by @Jeadie in <https://github.com/spiceai/spiceai/pull/5029>
- Configure spiced as an MCP SSE server by @Jeadie in <https://github.com/spiceai/spiceai/pull/5039>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5052>
- fix: Disable benchmarks schedule, enable testoperator schedule by @peasee in <https://github.com/spiceai/spiceai/pull/5058>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5060>
- Update ROADMAP.md March 2025 by @lukekim in <https://github.com/spiceai/spiceai/pull/5061>
- fix: Testoperator data setup by @peasee in <https://github.com/spiceai/spiceai/pull/5068>
- fix: All HTTP endpoints to hang when adding an invalid dataset with --pods-watcher-enabled by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5050>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5073>
- Integration tests for MCP tooling by @Jeadie in <https://github.com/spiceai/spiceai/pull/5053>
- OpenAPI docs for MCP by @Jeadie in <https://github.com/spiceai/spiceai/pull/5057>
- fix: Acceleration federation test by @peasee in <https://github.com/spiceai/spiceai/pull/5090>
- fix: Allow spiced commit in testoperator dispatch by @peasee in <https://github.com/spiceai/spiceai/pull/5098>
- fix: Use RefreshOverrides for the refresh API definition by @peasee in <https://github.com/spiceai/spiceai/pull/5095>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5094>
- fix: Increase tries for refresh_status_change_to_ready test by @peasee in <https://github.com/spiceai/spiceai/pull/5099>
- feat: Testoperator reports on max and median memory usage by @peasee in <https://github.com/spiceai/spiceai/pull/5101>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5105>
- fix: Fail testoperator on failed queries by @peasee in <https://github.com/spiceai/spiceai/pull/5106>
- Update Helm chart to 1.0.6 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5107>
- Update SECURITY.md to include 1.0.6 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5109>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5108>
- Add QA analytics for 1.0.6 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5110>
- add env variables to tools, usable in MCP stdio by @Jeadie in <https://github.com/spiceai/spiceai/pull/5097>
- HF downloads obey SIGTERM by @Jeadie in <https://github.com/spiceai/spiceai/pull/5044>
- Add v1.0.6 release notes into trunk by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5111>
- Remove redundant mod name for iceberg integration tests by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5112>
- Use fixed data directory for test operator by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5103>
- Improvements for evals by @Jeadie in <https://github.com/spiceai/spiceai/pull/5040>
- Make McpProxy trait for MCP passthrough by @Jeadie in <https://github.com/spiceai/spiceai/pull/5115>
- Properly handle '/' for tool names. by @Jeadie in <https://github.com/spiceai/spiceai/pull/5116>
- Use retry logic when loading tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/5120>
- Exclude slow tests from regular pr runs by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5119>
- Fix test operator snapshot update by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5130>
- spice init: Fixes windows bug where full path is used for spicepod name by @benrussell in <https://github.com/spiceai/spiceai/pull/5126>
- fix: Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/5131>
- Implement graceful shutdown for HTTP server by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5102>
- Update enhancement.md by @lukekim in <https://github.com/spiceai/spiceai/pull/5142>
- Add GitHub Workflow and PoC Spicepod configuration to run FinanceBench tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5145>
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5155>
- De-duplicate attachments in DuckDBAttachments by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5156>
- v1.0.7 release note by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5153>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/5160>
- Update Helm chart to 1.0.7 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5159>
- Add github token to macos test release download tasks by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5161>
- update security.md for 1.0.7 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5162>
- Update roadmap.md by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5163>
- Add a performance comparison section for 1.0.7 by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/5164>
- docs: Add snafu error variant point to style guide by @peasee in <https://github.com/spiceai/spiceai/pull/5167>
- Fix 1.0.7 release note by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5168>
- Adjust DuckDB connection pool size based on DuckDB accelerator instances usage by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5117>
- Add automatic retry for NSQL queries by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5169>
- Include chat completion id to task history by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5170>
- Trace when all runtime components are ready by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5171>
- Update qa_analytics.csv for 1.0.7 by @Sevenannn in <https://github.com/spiceai/spiceai/pull/5165>
- Set default tool recursion limit to 10 to prevent infinite loops by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5173>
- Add support for `schema_source_path` param for object-store data connectors by @sgrebnov in <https://github.com/spiceai/spiceai/pull/5178>
- Run license check and check changes on self-hosted macOS runners by @lukekim in <https://github.com/spiceai/spiceai/pull/5179>
- Add MCP by @lukekim in <https://github.com/spiceai/spiceai/pull/5183>

Full Changelog: github.com/spiceai/spiceai/compare/v1.0.0...release/1.1

Spice v1.0.7 (Mar 26, 2025)

· 4 min read
Phillip LeBlanc
Co-Founder and CTO of Spice AI

Announcing the release of Spice v1.0.7 🏎️

Spice v1.0.7 improves memory usage when using DuckDB, improves schema inference performance when using object-store based data connectors, and fixes a bug in Dremio schema inference.

Highlights in v1.0.7

  • DuckDB Memory Usage: Memory usage when using DuckDB has been significantly improved for data loads and refreshes through expanded use of zero-copy Arrow and multi-threaded loading. When a duckdb_memory_limit is specified, disk spilling has been improved for larger-than-memory workloads. In addition, a new temp_directory runtime parameter allows temporary files to be stored in a different location from the DuckDB data file for higher throughput. For example, temp_directory could be set to a high-IOPS io2 EBS volume that is separate from the volume holding the duckdb_file_path.

    Automated end-to-end test coverage for the DuckDB Accelerator has been significantly expanded.

    For configuration details, see the documentation for runtime parameters and the DuckDB Data Accelerator.

  • Schema Inference Performance for Object-Store Data Connectors: Schema inference performance has been improved, especially for large numbers of objects (1M+ objects) when using object-store based data connectors by making the object-listing and selection more efficient.

Performance

When compared to previous versions, Spice v1.0.7 loads DuckDB accelerated datasets significantly faster. When using the TPCH lineitem dataset at Scale Factor 100 (600M rows):

Without Indexes

5x faster, 28% less memory usage.

[Figures: v1.0.6 and v1.0.7]

| Version | Load Time | Peak Memory Usage |
| ------- | --------- | ----------------- |
| v1.0.6  | 16m 3s    | 32GB              |
| v1.0.7  | 3m 149ms  | 24.4GB            |

With Indexes

2.5x faster. Higher memory usage in v1.0.7 is due to better resource utilization to achieve faster load times. Use the duckdb_memory_limit parameter to control memory usage.

| Version | Load Time | Peak Memory Usage |
| ------- | --------- | ----------------- |
| v1.0.6  | 27m 9s    | 50GB              |
| v1.0.7  | 11m 30s   | 77GB              |

[Figures: v1.0.6 with indexes and v1.0.7 with indexes]
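
As a rough cross-check, the speedup and peak-memory change can be recomputed from the tabulated values. This is a minimal sketch: the helper names are illustrative, and "3m 149ms" is read here as 3 minutes plus 149 milliseconds, which is an assumption. Because the tables show rounded figures, the computed ratios may differ slightly from the headline numbers.

```python
# Recompute load-time speedup and peak-memory change from the tables above.
# Helper names are illustrative; values are copied from the tables as printed.

def to_seconds(minutes, seconds=0.0):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds

# Each entry is (load time in seconds, peak memory in GB).
WITHOUT_INDEXES = {
    "v1.0.6": (to_seconds(16, 3), 32.0),
    # Assumption: "3m 149ms" means 3 minutes plus 0.149 seconds.
    "v1.0.7": (to_seconds(3, 0.149), 24.4),
}
WITH_INDEXES = {
    "v1.0.6": (to_seconds(27, 9), 50.0),
    "v1.0.7": (to_seconds(11, 30), 77.0),
}

def compare(rows, old="v1.0.6", new="v1.0.7"):
    """Return (speedup factor, peak-memory change in percent)."""
    old_time, old_mem = rows[old]
    new_time, new_mem = rows[new]
    speedup = old_time / new_time
    memory_change_pct = (new_mem - old_mem) / old_mem * 100
    return speedup, memory_change_pct

if __name__ == "__main__":
    for label, rows in (("without indexes", WITHOUT_INDEXES),
                        ("with indexes", WITH_INDEXES)):
        speedup, mem = compare(rows)
        print(f"{label}: {speedup:.2f}x faster, {mem:+.1f}% peak memory")
```

A negative memory change means v1.0.7 used less peak memory than v1.0.6 for that configuration; a positive value means it used more.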

Documentation

  • DuckDB Data Accelerator: Has been expanded with additional resource usage guidance.
  • Memory: A new section for memory considerations has been added to the Reference section.

Contributors

  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

Breaking Changes

No breaking changes.

Upgrading

To upgrade to v1.0.7, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.7 image:

docker pull spiceai/spiceai:1.0.7

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

- fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
- Fix Dremio schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5114
- Improve performance of schema inference for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5124
- Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
- Fix go lint errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5147
- Make DuckDB acceleration E2E tests more comprehensive by @sgrebnov in https://github.com/spiceai/spiceai/pull/5146
- Enable Spice to load larger than memory datasets into DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5149
- Add `temp_directory` runtime parameter and insert it for DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5152
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
- Enable E2E for DuckDB full mode acceleration with indexes only in CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/5154

Full Changelog: github.com/spiceai/spiceai/compare/v1.0.6...v1.0.7

Spice v1.0.5 (Mar 11, 2025)

· 4 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.0.5 🧊

Spice v1.0.5 expands Iceberg support with the introduction of the Iceberg Data Connector, in addition to the existing Iceberg Catalog Connector. The new connector allows datasets to be created and configured directly for specific Iceberg objects, enabling federated and accelerated SQL queries on Apache Iceberg tables.

Performance improvements include object-store optimized Parquet pruning in append mode, where object-store metadata is now leveraged alongside Hive partitioning to optimize file pruning. This results in faster and more efficient queries.
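
To illustrate the idea, the sketch below prunes a listing of Parquet objects using both Hive-style partition values in the path and the object-store last-modified timestamp. It is a simplified illustration, not Spice's implementation: the function names, the dictionary layout, and the rule of skipping objects that have not been modified since the last refresh are all assumptions made for the example.

```python
from datetime import datetime, timezone

def parse_hive_partitions(path):
    """Extract key=value path segments, e.g. 'year=2025/month=03/f.parquet'."""
    partitions = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            partitions[key] = value
    return partitions

def prune_objects(objects, partition_filter, refreshed_up_to):
    """Return paths that match the partition filter and changed after the last refresh.

    objects: iterable of {"path": str, "last_modified": datetime} (hypothetical layout).
    """
    kept = []
    for obj in objects:
        partitions = parse_hive_partitions(obj["path"])
        # Hive partition pruning: skip files whose path contradicts the filter.
        if any(partitions.get(k) != v for k, v in partition_filter.items()):
            continue
        # Metadata pruning: skip files not modified since the last refresh.
        if obj["last_modified"] <= refreshed_up_to:
            continue
        kept.append(obj["path"])
    return kept

# Hypothetical object listing used only for this example.
listing = [
    {"path": "events/year=2025/month=02/a.parquet",
     "last_modified": datetime(2025, 2, 28, tzinfo=timezone.utc)},
    {"path": "events/year=2025/month=03/b.parquet",
     "last_modified": datetime(2025, 3, 1, tzinfo=timezone.utc)},
    {"path": "events/year=2025/month=03/c.parquet",
     "last_modified": datetime(2025, 3, 10, tzinfo=timezone.utc)},
]
new_files = prune_objects(
    listing,
    partition_filter={"year": "2025", "month": "03"},
    refreshed_up_to=datetime(2025, 3, 5, tzinfo=timezone.utc),
)
print(new_files)
```

Only the object that sits in the requested partition and was modified after the refresh point survives both checks, so its Parquet file is the only one that would need to be opened.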

DuckDB has been upgraded to v1.2.0, along with additional stability improvements, including improved graceful shutdown and the ability to configure the DuckDB memory limit.

Additional updates include support for the Arrow Map type.

Highlights in v1.0.5

  • New Iceberg Data Connector: Enables direct dataset creation and querying of Iceberg tables.

    Example usage in spicepod.yaml:

    datasets:
      - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
        name: my_table
        params:
          # Same as Iceberg Catalog Connector
        acceleration:
          enabled: true

    For detailed setup instructions, authentication options, and configuration parameters, refer to the Iceberg Data Connector documentation.

  • Improved Parquet pruning in append mode: Uses object-store metadata for more efficient file pruning.

  • DuckDB upgrade to v1.2.0 with improved graceful shutdown: Read the DuckDB v1.2.0 announcement for details, including breaking changes for map and list_reduce. Graceful shutdown of DuckDB has been improved for better stability across restarts.

  • Configurable DuckDB memory limit: Use the duckdb_memory_limit parameter to set the DuckDB acceleration memory limit:

    - from: spice.ai:path.to.my_dataset
      name: my_dataset
      acceleration:
        params:
          duckdb_memory_limit: '2GB'
        enabled: true
        engine: duckdb
        mode: file

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim

Breaking Changes

Upgrading

To upgrade to v1.0.5, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.5 image:

docker pull spiceai/spiceai:1.0.5

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

Changelog

  • fix: Update OpenAI model health check by @peasee in #4849
  • fix: Allow metrics endpoint setting in CLI by @peasee in #4939
  • DuckDB acceleration: fix Decimal with zero scale support by @sgrebnov in #4922
  • Introduce runtime shutdown state by @sgrebnov in #4917
  • Add support for Flight and HTTP endpoints configuration to Spice CLI (run and sql) by @sgrebnov and @lukekim in #4913
  • Fix Datafusion resources deallocation during shutdown by @sgrebnov in #4912
  • DuckDB: fix error handling during record batch insertion by @sgrebnov in #4894
  • DuckDB: add support for Map Arrow type for DuckDB acceleration by @sgrebnov in #4887
  • Upgrade to DuckDB v1.2.0 by @sgrebnov in #4842
  • Gracefully shutdown the runtime and deallocate static resources by @sgrebnov in #4879
  • Implement an Iceberg Data Connector by @phillipleblanc in #4941
  • Don't trace canceled dataset refresh during runtime termination by @sgrebnov in #4958
  • Use metadata column last_modified when specified as a time_column by @phillipleblanc in #4970
  • Add duckdb_memory_limit param support for DuckDB acceleration by @sgrebnov in #4971
  • Add Iceberg dataset integration test by @phillipleblanc in #4950

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.4...v1.0.5

Spice v1.0-stable (Jan 20, 2025)

· 11 min read
William Croxson
Senior Software Engineer at Spice AI

🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!

The core runtime and features such as query federation, query acceleration, catalog integration, search and AI-inference have all graduated to stable status along with key component graduations across data connectors, data accelerators, catalog connectors, and AI model providers.

Highlights in v1.0-stable

Breaking Changes

  • Default Runtime Version: The CLI now installs the GPU-accelerated, AI-capable Runtime by default (if supported) when running spice install or spice run. To force-install the non-GPU version, run spice install ai --cpu.

  • Default OpenAI Model: The default OpenAI model has been updated to gpt-4o-mini.

  • Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.

  • Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.

  • Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
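
The identifier normalization change above can be illustrated with a small model of how an unquoted identifier is matched against column names. This is a hypothetical sketch rather than the runtime's resolver: `resolve_column` and its `normalize_unquoted` flag are names invented for the example, with `normalize_unquoted=True` standing in for the earlier lowercase-normalizing behavior.

```python
def resolve_column(identifier, columns, normalize_unquoted):
    """Return the matching column name, or None if the identifier does not resolve.

    identifier: identifier as written in SQL, optionally wrapped in double quotes.
    columns: set of column names exactly as defined in the table.
    normalize_unquoted: True models the earlier behavior (lowercase unquoted
        identifiers); False models the v1.0 behavior (case preserved).
    """
    quoted = len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"'
    if quoted:
        # Quoted identifiers keep their exact case in both models.
        name = identifier[1:-1]
    elif normalize_unquoted:
        name = identifier.lower()
    else:
        name = identifier
    return name if name in columns else None

columns = {"MyColumn", "id"}

# Earlier behavior: the unquoted name is lowercased and no longer matches.
legacy = resolve_column("MyColumn", columns, normalize_unquoted=True)
# v1.0 behavior: the unquoted name keeps its case and matches.
current = resolve_column("MyColumn", columns, normalize_unquoted=False)
# Quoting works the same way in both models.
quoted = resolve_column('"MyColumn"', columns, normalize_unquoted=True)
```

The practical consequence is the reverse case as well: a query that relied on an unquoted `ID` being folded to `id` would stop resolving in this model once normalization is off, so mixed-case queries may need to be reviewed when upgrading.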

Dependencies

No major dependency changes.

Upgrading

To upgrade to v1.0.0, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.0 image:

docker pull spiceai/spiceai:1.0.0

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

Contributors

  • @peasee
  • @ewgenius
  • @Jeadie
  • @Sevenannn
  • @lukekim
  • @phillipleblanc
  • @sgrebnov

What's Changed

- feat: Update load test criteria, testoperator updates by @peasee in <https://github.com/spiceai/spiceai/pull/4311>
- Update helm for v1.0.0-rc.5 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4313>
- Update spicepod.schema.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4318>
- Bump version to v1.0.0, update SECURITY.md by @ewgenius in <https://github.com/spiceai/spiceai/pull/4314>
- Initial criteria for models, embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4223>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4321>
- Add dremio param for running load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4315>
- Promote Databricks (mode: delta_lake) connector to stable by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4328>
- Handle failed query in load test by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4327>
- feat: Use load test hours for baseline query sets by @peasee in <https://github.com/spiceai/spiceai/pull/4334>
- Fix typo in 1.0.0-rc.5 release notes by @ewgenius in <https://github.com/spiceai/spiceai/pull/4329>
- feat: add testoperator data consistency by @peasee in <https://github.com/spiceai/spiceai/pull/4319>
- docs: Release DuckDB connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4335>
- Fix DocumentDB -> DynamoDB by @lukekim in <https://github.com/spiceai/spiceai/pull/4339>
- Update benchmark snapshots by @github-actions in <https://github.com/spiceai/spiceai/pull/4337>
- fix: Download hits.parquet from MinIO for benchmark by @peasee in <https://github.com/spiceai/spiceai/pull/4338>
- Update openapi.json by @github-actions in <https://github.com/spiceai/spiceai/pull/4341>
- Remove evil averages by @lukekim in <https://github.com/spiceai/spiceai/pull/4343>
- Don't run builds on non-code changes by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4344>
- Remove streaming requirement from Databricks spark Beta and Spark connector Beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4345>
- Update s3 tpcds spicepods by @ewgenius in <https://github.com/spiceai/spiceai/pull/4346>
- Explicitly set required scale factor for throughput and load tests by @ewgenius in <https://github.com/spiceai/spiceai/pull/4347>
- Fix s3 tpcds dataset name by @ewgenius in <https://github.com/spiceai/spiceai/pull/4348>
- Promote Iceberg Catalog Connector to Beta by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4350>
- Update s3 clickbench benchmark snapshots by @ewgenius in <https://github.com/spiceai/spiceai/pull/4351>
- fix: DuckDB clickbench on zero results by @peasee in <https://github.com/spiceai/spiceai/pull/4349>
- Add integration test with snapshots for databricks catalog connector by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4353>
- refactor: Remove on zero results from benchmarks, add data consistency workflow by @peasee in <https://github.com/spiceai/spiceai/pull/4354>
- Fix Bug: No field named body_embedding when do vector search with refresh sql containing subset of columns by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4297>
- docs: Update roadmap by @peasee in <https://github.com/spiceai/spiceai/pull/4364>
- feat: Release accelerators stable by @peasee in <https://github.com/spiceai/spiceai/pull/4361>
- Add TPCH/TPCDS test spicepods for MySQL by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4365>
- Catch when an insecure (http) S3 and ABFS data connectors endpoint is used without specifying the `allow_http` parameter by @ewgenius in <https://github.com/spiceai/spiceai/pull/4363>
- Update ROADMAP - Iceberg catalog alpha for v1.0 by @ewgenius in <https://github.com/spiceai/spiceai/pull/4367>
- Promote databricks catalog and databricks (spark_connect) connector to beta by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4369>
- Update Roadmap - Iceberg beta by @ewgenius in <https://github.com/spiceai/spiceai/pull/4373>
- Build CUDA binaries for Linux by @Jeadie in <https://github.com/spiceai/spiceai/pull/4320>
- Promote Nvidia NIM as Alpha by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4380>
- Promote xai to alpha by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4381>
- Update stable criteria for object store based connectors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4383>
- Testoperator: http consistency and overhead tests, fixes and ci by @ewgenius in <https://github.com/spiceai/spiceai/pull/4382>
- Promote S3 Data Connector to Stable by @ewgenius in <https://github.com/spiceai/spiceai/pull/4385>
- Download platform-supported CUDA binary version on Linux by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4356>
- Fix http consistency test workflow, add overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4387>
- feat: Add Postgres test spicepods by @peasee in <https://github.com/spiceai/spiceai/pull/4388>
- Fix typos + specific in model criteria; Make explicit alpha/beta tests for LLMS in `crates/llms/tests`. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4377>
- Fix federation bug for correlated subqueries of deeply nested Dremio tables by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4389>
- Fix http overhead workflow by @ewgenius in <https://github.com/spiceai/spiceai/pull/4390>
- Tweak model tests, fix embedding input by @ewgenius in <https://github.com/spiceai/spiceai/pull/4391>
- Promote Dremio to Stable quality by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4392>
- Add beta functionality tests for embedding models. by @Jeadie in <https://github.com/spiceai/spiceai/pull/4352>
- docs: Release postgres connector stable by @peasee in <https://github.com/spiceai/spiceai/pull/4398>
- Increase timeout for model response in E2E tests by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4399>
- Disable ident normalization (i.e. `SELECT MyColumn from table` works) by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4400>
- Preserve schema metadata by @ewgenius in <https://github.com/spiceai/spiceai/pull/4402>
- Make models integration tests tracing less verbose by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4403>
- Fix `cuda` feature build on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4404>
- Promote MySQL to Stable by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4406>
- docs: Release Delta Lake and Unity catalog by @peasee in <https://github.com/spiceai/spiceai/pull/4405>
- Use `gpt-4o-mini` as a default model for openai provider by @ewgenius in <https://github.com/spiceai/spiceai/pull/4410>
- Fix streaming for Openai and Anthropic by @Jeadie in <https://github.com/spiceai/spiceai/pull/4409>
- Tweak model loading and missing tool errors messages by @ewgenius in <https://github.com/spiceai/spiceai/pull/4412>
- Spice CLI: fallback to CPU build for unsupported GPU Compute Capability by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4407>
- Build Windows CUDA binaries as part of `build_and_release` workflow by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4386>
- Update docs link by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4416>
- feat: Add CPU models install escape hatch by @peasee in <https://github.com/spiceai/spiceai/pull/4419>
- Handle OpenAI API Errors by @ewgenius in <https://github.com/spiceai/spiceai/pull/4417>
- Update spice cli to use `GH_TOKEN` or `GITHUB_TOKEN` env variables when calling releases api by @ewgenius in <https://github.com/spiceai/spiceai/pull/4175>
- Implement secure sandboxing for Docker image by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4411>
- Automatically install supported CUDA binary on Windows by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4420>
- Metrics for LLMs+ embeddings by @Jeadie in <https://github.com/spiceai/spiceai/pull/4418>
- Jeadie/25 01 17/beta perf by @Jeadie in <https://github.com/spiceai/spiceai/pull/4397>
- Pass GitHub token to all CI steps calling spice run by @ewgenius in <https://github.com/spiceai/spiceai/pull/4423>
- Run the models integration tests on PRs by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4421>
- Run CUDA builds in a separate workflow by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4430>
- Promote OpenAI models and embeddings providers to RC by @ewgenius in <https://github.com/spiceai/spiceai/pull/4432>
- Update link to retrieval-augmented generation (RAG) details by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4433>
- Unity catalog should strip parameter prefix before passing parameters to delta lake factory by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4436>
- Update quickstart traces to match current version by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4435>
- Update Supported Embeddings Providers Readme section by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4434>
- Local models can stream tools by @Jeadie in <https://github.com/spiceai/spiceai/pull/4429>
- fix: Use MetricsCollector::show() for HTTP testoperator commands by @peasee in <https://github.com/spiceai/spiceai/pull/4442>
- Fix run query action by @ewgenius in <https://github.com/spiceai/spiceai/pull/4444>
- Default to AI-enabled runtime for `spice run`/`spice install` by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4443>
- Change no spicepod.yaml log to warning by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4447>
- refactor: Update Catalog Connector error messages by @peasee in <https://github.com/spiceai/spiceai/pull/4441>
- Fix panic when converting OTel metrics by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4449>
- refactor: Update model errors by @peasee in <https://github.com/spiceai/spiceai/pull/4446>
- Update spiceai/mistral.rs to silence metadata logs by @ewgenius in <https://github.com/spiceai/spiceai/pull/4452>
- fix xAI; don't use openai defaults by @Jeadie in <https://github.com/spiceai/spiceai/pull/4450>
- Improves the UX of using huggingface models by @phillipleblanc in <https://github.com/spiceai/spiceai/pull/4451>
- Add GH Workflow to test `spice ai` runtime installation by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4448>
- fix: Use specific model errors where available by @peasee in <https://github.com/spiceai/spiceai/pull/4454>
- Detect and report unsupported embedding column type during dataset registration by @sgrebnov in <https://github.com/spiceai/spiceai/pull/4456>
- Handle Errors by @Jeadie in <https://github.com/spiceai/spiceai/pull/4455>
- Catch and report negative openai_temperature error by @Sevenannn in <https://github.com/spiceai/spiceai/pull/4453>
- Clarify release check error message if it is caused by wrong GH token by @ewgenius in <https://github.com/spiceai/spiceai/pull/4458>

**Full Changelog**: <https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0>

Resources

Community

Spice.ai started with the vision to make AI easy for developers. We are building Spice.ai in the open and with the community. Reach out on Slack or by email to get involved.