Spice v1.7.1 (Sep 29, 2025)

Kevin Zimmerman
Principal Software Engineer at Spice AI

Announcing the release of Spice v1.7.1! 🔍

Spice v1.7.1 is a patch release focused on search improvements, bug fixes, and performance enhancements. This release introduces the Reciprocal Rank Fusion (RRF) user-defined table function (UDTF) for hybrid search, improves vector and text search reliability, and resolves several issues across the runtime, connectors, and query engine.

What's New in v1.7.1

Reciprocal Rank Fusion (RRF) UDTF: Spice now supports Reciprocal Rank Fusion (RRF) as a user-defined table function, enabling advanced hybrid search scenarios that combine results from multiple search methods (e.g., vector and text search) for improved relevance ranking.

Features:

  • Multi-search fusion: Combine results from vector_search, text_search, and other search UDTFs in a single query.
  • Advanced tuning: Per-query ranking weights, recency boosting, and configurable decay functions.
  • Performance: An optional, user-specified join key can be provided for the most efficient join.
  • Automatic joining: Falls back to on-the-fly JOIN key computation when no explicit key is provided (see the second example below).

Example usage:

SELECT id, title, content, fused_score
FROM rrf(
  vector_search(documents, 'machine learning algorithms', rank_weight => 1.5),
  text_search(documents, 'neural networks deep learning', rank_weight => 1.2),
  join_key => 'id', -- optional join key for optimal performance
  k => 60.0         -- optional smoothing factor
)
WHERE fused_score > 0.01
ORDER BY fused_score DESC;
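
The join_key argument is optional. A second, minimal sketch of the automatic-joining fallback noted above, reusing the same hypothetical documents dataset:

-- With no join_key provided, RRF computes the join key on the fly
SELECT id, title, fused_score
FROM rrf(
  vector_search(documents, 'machine learning algorithms'),
  text_search(documents, 'neural networks deep learning')
)
ORDER BY fused_score DESC
LIMIT 10;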

Learn more in the RRF documentation.

Acceleration Refresh Metrics: Spice now exposes additional Prometheus metrics that provide detailed observability into dataset acceleration refreshes. These metrics help monitor data freshness and ingestion lag for accelerated datasets with a time column.

Reported metrics:

  • dataset_acceleration_max_timestamp_before_refresh_ms: Maximum value of the dataset's time column before refresh (milliseconds).
  • dataset_acceleration_max_timestamp_after_refresh_ms: Maximum value of the dataset's time column after refresh (milliseconds).
  • dataset_acceleration_refresh_lag_ms: Difference between the maximum timestamp after and before refresh (milliseconds).
  • dataset_acceleration_ingestion_lag_ms: Lag between the current wall-clock time and the maximum timestamp after refresh (milliseconds).

These metrics are emitted during each acceleration refresh and can be scraped by Prometheus for monitoring and alerting. For more details, see the Observability documentation.

Bug Fixes & Improvements

This release resolves several issues and improves reliability across search, connectors, and query planning:

  • Full-Text Search (FTS): FTS metadata columns can now be used in projections, JOIN-level filters no longer fail when their columns are missing from the schema, and persistent file-based FTS indexes are now supported. A default limit of 1000 results applies when no limit is specified.
  • Vector Search: A default limit of 1000 results applies when no limit is specified (see the example after this list), and an issue with removing the embedding column has been fixed.
  • Databricks SQL Warehouse: Improved error handling and support for async queries.
  • Other: Fixed Anthropic model regex validation, tuned AI-model health checks, and improved error messages.
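
When a result set larger or smaller than the new 1000-row default is needed, an explicit LIMIT overrides it. A minimal sketch, assuming the documents dataset and score column used in the examples in these notes:

-- An explicit LIMIT overrides the default cap of 1000 results
SELECT id, title, score
FROM vector_search(documents, 'machine learning algorithms')
ORDER BY score DESC
LIMIT 25;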

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

  • Added Hybrid-Search using RRF - Combine results from multiple search methods (vector and text search) using Reciprocal Rank Fusion for improved relevance ranking.

The Spice Cookbook includes 78 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.7.1, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.7.1 image:

docker pull spiceai/spiceai:1.7.1

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now available in the AWS Marketplace!

What's Changed

Changelog

  • ensure FTS metadata columns can be used in projection (#7282) by @Jeadie in #7282
  • Fix JOIN level filters not having columns in schema (#7287) by @Jeadie in #7287
  • Use file-based fts index (#7024) by @Jeadie in #7024
  • Remove 'PostApplyCandidateGeneration' (#7288) by @Jeadie in #7288
  • RRF: Rank and recency boosting (#7294) by @mach-kernel in #7294
  • RRF: Preserve base ranking when results differ -> FULL OUTER JOIN does not produce time column (#7300) by @mach-kernel in #7300
  • fix removing embedding column (#7302) by @Jeadie in #7302
  • RRF: Fix decay for disjoint result sets (#7305) by @mach-kernel in #7305
  • RRF: Project top scores, do not yield duplicate results (#7306) by @mach-kernel in #7306
  • RRF: Case sensitive column/ident handling (#7309) by @mach-kernel in #7309
  • For vector_search, use a default limit of 1000 if no limit specified (#7311) by @lukekim in #7311
  • Fix Anthropic model regex and add validation tests (#7319) by @ewgenius in #7319
  • Enhancement: Implement before/after/lag metrics for acceleration refresh (#7310) by @krinart in #7310
  • Refactor chat model health check to lower tokens usage for reasoning models (#7317) by @ewgenius in #7317
  • Enable chunking in SearchIndex (#7143) by @Jeadie in #7143
  • Use logical plan in SearchQueryProvider. (#7314) by @Jeadie in #7314
  • FTS max search results 100 -> 1000 (#7331) by @Jeadie in #7331
  • Improve Databricks SQL Warehouse Error Handling (#7332) by @sgrebnov in #7332
  • use spicepod embedding model name for 'model_name' (#7333) by @Jeadie in #7333
  • Handle async queries for Databricks SQL Warehouse API (#7335) by @phillipleblanc in #7335
  • RRF: Fix ident resolution for struct fields, autohashed join key for varying types (#7339) by @mach-kernel in #7339

Spice v1.5.2 (Aug 11, 2025)

Kevin Zimmerman
Principal Software Engineer at Spice AI

Announcing the release of Spice v1.5.2! 🛠️

Spice v1.5.2 introduces a new Amazon Bedrock Models Provider for Converse API (Nova) compatible models, AWS Redshift support using the Postgres data connector, and Hadoop Catalog support for Iceberg tables, along with several bug fixes and improvements.

What's New in v1.5.2

Amazon Bedrock Models Provider: Adds a new Amazon Bedrock LLM Provider. Models compatible with the Converse API (Nova) are supported.

Amazon Bedrock provides access to a range of foundation models for generative AI. Spice supports using Bedrock-hosted models by specifying the bedrock prefix in the from field and configuring the required parameters.

Supported Model IDs:

  • amazon.nova-lite-v1:0
  • amazon.nova-micro-v1:0
  • amazon.nova-premier-v1:0
  • amazon.nova-pro-v1:0

Refer to the Amazon Bedrock documentation for details on available models and cross-region inference profiles.

Example Spicepod.yaml:

models:
  - from: bedrock:us.amazon.nova-lite-v1:0
    name: novash
    params:
      aws_region: us-east-1
      aws_access_key_id: ${ secrets:AWS_ACCESS_KEY_ID }
      aws_secret_access_key: ${ secrets:AWS_SECRET_ACCESS_KEY }
      bedrock_guardrail_identifier: arn:aws:bedrock:abcdefg012927:0123456789876:guardrail/hello
      bedrock_guardrail_version: DRAFT
      bedrock_trace: enabled
      bedrock_temperature: 42

For more information, see the Amazon Bedrock Documentation.

AWS Redshift Support for Postgres Data Connector: Spice now supports connecting to Amazon Redshift using the PostgreSQL data connector. Redshift is a columnar OLAP database compatible with PostgreSQL, allowing you to use the same connector and configuration parameters.

To connect to Redshift, use the format postgres:schema.table in your Spicepod and set the connection parameters to match your Redshift cluster settings.

Example Spicepod.yaml:

# Example datasets for Redshift TPCH tables
datasets:
  - from: postgres:public.customer
    name: customer
    params:
      pg_host: ${secrets:PG_HOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PG_USER}
      pg_pass: ${secrets:PG_PASS}
  - from: postgres:public.lineitem
    name: lineitem
    params:
      pg_host: ${secrets:PG_HOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PG_USER}
      pg_pass: ${secrets:PG_PASS}

Redshift types are mapped to PostgreSQL types. See the PostgreSQL connector documentation for details on supported types and configuration.
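
Once defined, the Redshift-backed datasets can be queried with standard SQL through Spice. A minimal sketch against the lineitem dataset above, assuming the standard TPCH column names:

-- Aggregate pricing by return flag and line status from the Redshift-backed lineitem table
SELECT l_returnflag, l_linestatus, SUM(l_extendedprice) AS total_price
FROM lineitem
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;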

Hadoop Catalog Support for Iceberg: The Iceberg Data and Catalog connectors now support connecting to Hadoop catalogs on filesystem (file://) or S3 object storage (s3://, s3a://). This enables connecting to Iceberg catalogs without a separate catalog provider service.

Example Spicepod.yaml:

catalogs:
  - from: iceberg:file:///tmp/hadoop_warehouse/
    name: local_hadoop
  - from: iceberg:s3://my-bucket/hadoop_warehouse/
    name: s3_hadoop

# Example datasets
datasets:
  - from: iceberg:file:///data/hadoop_warehouse/test/my_table_1
    name: local_hadoop
  - from: iceberg:s3://my-bucket/hadoop_warehouse/test/my_table_2
    name: s3_hadoop

For more details, see the Iceberg Data Connector documentation and the Iceberg Catalog Connector documentation.
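
Tables exposed through an attached catalog are addressable with catalog-qualified names in SQL. A minimal sketch, assuming the local_hadoop catalog above contains a test.my_table_1 table (names here mirror the example paths and are illustrative):

-- Count rows in a table registered via the Hadoop Iceberg catalog
SELECT COUNT(*) AS row_count
FROM local_hadoop.test.my_table_1;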

Parquet Reader: Optional Parquet Page Index: Fixed an issue where the Parquet reader, using arrow-rs and DataFusion, errored on files missing page indexes, despite the Parquet spec allowing optional indexes. The Spice team contributed optional page index support to arrow-rs (PR #6) and configurable handling in DataFusion (PR #93). A new runtime parameter, parquet_page_index, makes Parquet Page Indexes configurable in Spice:

runtime:
  params:
    parquet_page_index: required # Options: required, skip, auto

  • required: (Default) Errors if page indexes are absent.
  • skip: Ignores page indexes, potentially reducing query performance.
  • auto: Uses page indexes if available; skips otherwise.

This improves compatibility and query flexibility for Parquet datasets.

Contributors

Breaking Changes

Amazon S3 Vectors Vector Engine: Amazon S3 Vectors is currently a preview AWS service. A recent update to the Amazon S3 Vectors service API introduced a breaking change that affects the integration when projecting (selecting) the embedding column. This results in the following error:

Json error: whilst decoding field 'data': expected [ got null
Received only partial JSON payload from QueryVectors

The issue is expected to be resolved in the next release of Spice. A current workaround is to limit queries to non-embedding columns.

For example, instead of:

SELECT url, title, score, body_embedding
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

Remove the *_embedding column from the projection. E.g.

SELECT url, title, score
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

This issue and workaround also apply to SELECT * FROM vector_search(..). For example:

SELECT *
FROM vector_search(pulls, 'bugs in DuckDB', 4)
WHERE state = 'OPEN'
ORDER BY score DESC
LIMIT 4;

Cookbook Updates

The Spice Cookbook includes 75 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.2, use one of the following methods:

CLI:

spice upgrade

Homebrew:

brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.5.2 image:

docker pull spiceai/spiceai:1.5.2

For available tags, see DockerHub.

Helm:

helm repo update
helm upgrade spiceai spiceai/spiceai

AWS Marketplace:

🎉 Spice is now also available in the AWS Marketplace!

What's Changed

Dependencies

No major dependency updates.

Changelog

  • fixes for databricks OpenAI compatibility (#6629) by @Jeadie in #6629
  • Update spicepod.schema.json (#6632) by @app/github-actions in #6632
  • Remove 'stream_options' from databricks LLMs (#6637) by @Jeadie in #6637
  • Move retry and rate limiting logic for Amazon bedrock out of embeddings. (#6626) by @Jeadie in #6626
  • Disable Metal precompilation in integration_llms.yml (#6649) by @Jeadie in #6649
  • fix: Hadoop integration test (#6660) by @peasee in #6660
  • feat: Add Hadoop Catalog Data Component (#6658) by @peasee in #6658
  • update datafusion-table-providers to latest spiceai tag (#6661) by @mach-kernel in #6661
  • feat: Add Hadoop Catalog connectors for Iceberg (#6659) by @peasee in #6659
  • Make FullTextSearchExec robust to RecordBatch column ordering. (#6675) by @Jeadie in #6675
  • Make 'runtime-object-store' crate (#6674) by @Jeadie in #6674
  • fix: Support include for Iceberg (#6663) by @peasee in #6663
  • feat: Add Hadoop TPCH benchmark (#6678) by @peasee in #6678
  • feat: Add Hadoop metadata_path parameter (#6680) by @peasee in #6680
  • fix: Automatically infer Hadoop warehouse scheme (#6681) by @peasee in #6681
  • Amazon Bedrock, specifically Nova models (#6673) by @Jeadie in #6673
  • fix perplexity_auth_token parameters for web_search (#6685) by @Jeadie in #6685
  • Fix AWS Auth issue (#6699) by @Advayp in #6699
  • Limit Concurrent Requests for GitHub (#6672) by @Advayp in #6672
  • Add runtime parameter to enable more permissive parquet reading when page indexes are missing (#6716) by @phillipleblanc in #6716
  • Improve Flight REPL error messages (#6696) by @lukekim in #6696
  • Fixes from search tests (#6710) by @Jeadie in #6710