Version: Next

Runtime

The runtime section specifies configuration settings for the Spice runtime.

runtime.auth

runtime.auth.api-key

Spice supports adding optional authentication to its API endpoints via configurable API keys. Learn more.

```yaml
runtime:
  auth:
    api-key:
      enabled: true
      keys:
        - ${ secrets:api_key } # Use the secret replacement syntax to load the API key from a secret store
        - 1234567890 # Or specify the API key directly
```

API key authentication supports the following configuration parameters:

| Parameter name | Optional | Default | Description |
| --- | --- | --- | --- |
| `enabled` | Yes | `true` | Whether API key authentication is enabled. |
| `keys` | Yes | `[]` | A list of API keys used to authenticate requests. |

runtime.dataset_load_parallelism

This setting specifies the maximum number of datasets that can be loaded in parallel during startup. By default, the number of parallel datasets is unlimited.
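For example, to load at most four datasets in parallel during startup (the value is illustrative):

```yaml
runtime:
  dataset_load_parallelism: 4
```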

runtime.caching

This setting specifies cache settings for supported Runtime components:

  • sql_results: Specifies cache settings for results from SQL queries.
  • search_results: Specifies cache settings for results from searches.
  • embeddings: Specifies cache settings for embeddings requests.

Runtime caches support common configuration parameters:

| Parameter name | Optional | Default | Description |
| --- | --- | --- | --- |
| `enabled` | Yes | `true` | Whether the cache is enabled. |
| `max_size` | Yes | `128MiB` | Maximum cache size. |
| `eviction_policy` | Yes | `lru` | Cache replacement policy applied when the cache reaches `max_size`. Supports `lru` (Least Recently Used) and `tiny_lfu` (Tiny Least Frequently Used, higher hit rate for skewed access patterns). |
| `item_ttl` | Yes | `1s` | Cache entry expiration duration (time to live). |
| `hashing_algorithm` | Yes | `xxh3` | Hashing algorithm used to hash cache keys when storing results. Supports `xxh3`, `ahash`, `siphash`, `blake3`, `xxh32`, `xxh64`, or `xxh128`. |
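The common parameters can be combined on any of the supported cache sections; for example (values are illustrative):

```yaml
runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 256MiB
      eviction_policy: tiny_lfu
      item_ttl: 30s
      hashing_algorithm: xxh3
```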

runtime.caching.search_results

The search results cache section specifies runtime search cache configuration. Learn more.

```yaml
runtime:
  caching:
    search_results:
      enabled: true
      max_size: 128MiB
      item_ttl: 1s
```

The search results cache supports the common cache configuration parameters.

runtime.caching.embeddings

The embeddings cache section specifies runtime embeddings requests cache configuration. Learn more.

```yaml
runtime:
  caching:
    embeddings:
      enabled: true
      max_size: 128MiB
      item_ttl: 1s
```

The embeddings cache supports the common cache configuration parameters.

runtime.caching.sql_results

The SQL results cache section specifies runtime SQL query cache configuration. Learn more.

```yaml
runtime:
  caching:
    sql_results:
      enabled: true
      max_size: 128MiB
      item_ttl: 1s
```

In addition to the common cache configuration parameters, sql_results also supports the following parameters:

| Parameter name | Optional | Default | Description |
| --- | --- | --- | --- |
| `cache_key_type` | Yes | `plan` | Determines how cache keys are generated. `plan` uses the query's logical plan, while `sql` uses the raw SQL query string. |
| `encoding` | Yes | `none` | Compression algorithm for cached results. Supports `none` or `zstd`. |
| `stale_while_revalidate_ttl` | Yes | `0s` | Duration to serve stale cache entries while revalidating in the background. When set to a non-zero value, expired cache entries continue to be served while a background refresh occurs. `0s` disables this behavior. |
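For example, to enable compressed cache entries and background revalidation (values are illustrative):

```yaml
runtime:
  caching:
    sql_results:
      encoding: zstd
      stale_while_revalidate_ttl: 30s
```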
info

runtime.results_cache has been deprecated and will be removed in a future release. If runtime.results_cache is specified in the spicepod and runtime.caching.sql_results is not defined, the runtime.results_cache settings take effect.

Choosing a cache_key_type

  • plan (Default): Uses the query's logical plan as the cache key. Matches semantically equivalent queries but requires query parsing.
  • sql: Uses the raw SQL string as the cache key. Provides faster lookups but requires exact string matches. Queries with dynamic functions, such as NOW(), may produce unexpected results. Use sql only when results are predictable.

Use sql for the lowest latency with identical queries that do not include dynamic functions. Use plan for greater flexibility.

Choosing a hashing_algorithm

  • xxh3 (Default): Uses the XXH3 algorithm for hashing the cache keys. XXH3 is a fast, non-cryptographic hash algorithm that provides high performance and good distribution. It is suitable for scenarios where speed is critical and cryptographic security is not required.
  • siphash: Uses the SipHash1-3 algorithm for hashing the cache keys; this is the default hashing algorithm of Rust. SipHash is a secure algorithm that implements verified protections against "hash flooding" denial of service (DoS) attacks. It is reasonably performant and provides a high level of security.
  • ahash: Uses the AHash algorithm for hashing the cache keys. The AHash algorithm is a high quality hashing algorithm, and has claimed resistance against hashing DoS attacks. AHash has higher performance than SipHash1-3, especially when used with cache_key_type: plan.
  • blake3: Uses the BLAKE3 cryptographic hash function. BLAKE3 is a fast, parallelizable hash function that provides cryptographic security while maintaining high performance. It is suitable for scenarios requiring both speed and cryptographic guarantees.
  • xxh32, xxh64, xxh128: Variants of the XXH hashing algorithm with different output sizes. These algorithms offer a balance between speed and collision resistance, with larger hash sizes providing better collision resistance at the cost of performance.

Use xxh3 (the default) for its superior speed in most scenarios. Use ahash, xxh64 or xxh128 for reduced collision probability when caching a large number of queries. Use blake3 when cryptographic security is required. Use siphash when protection against hash flooding attacks is a priority.
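For example, to prioritize protection against hash flooding on the SQL results cache (a sketch; any of the listed algorithms can be substituted):

```yaml
runtime:
  caching:
    sql_results:
      hashing_algorithm: siphash
```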

runtime.params

Optional. Global key-value parameters for the runtime.

HTTP Rate Control

HTTP-based connectors (HTTP/HTTPS, GraphQL, GitHub) support the following rate control defaults:

| Parameter Name | Description |
| --- | --- |
| `http_max_concurrent_requests` | Default maximum concurrent HTTP requests per upstream origin. Can be overridden per-dataset with `max_concurrent_requests`. |
| `http_requests_per_second_limit` | Default maximum HTTP requests per second per upstream origin. Can be overridden per-dataset with `requests_per_second_limit`. |
| `http_requests_per_minute_limit` | Default maximum HTTP requests per minute per upstream origin. Can be overridden per-dataset with `requests_per_minute_limit`. |
| `http_rate_control_jitter_min` | Default minimum random delay before HTTP requests when rate control is active. Defaults to `5ms` when a rate limit is configured. Can be overridden per-dataset. |
| `http_rate_control_jitter_max` | Default maximum random delay before HTTP requests when rate control is active. Defaults to `10ms` when a rate limit is configured. Can be overridden per-dataset. |

```yaml
runtime:
  params:
    http_max_concurrent_requests: 10
    http_requests_per_second_limit: 5
    http_requests_per_minute_limit: 200
```
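The runtime-level defaults can be overridden on an individual dataset through its params; a sketch, assuming a hypothetical HTTP dataset named `my_dataset`:

```yaml
datasets:
  - from: https://example.com/data.json # hypothetical dataset source
    name: my_dataset
    params:
      max_concurrent_requests: 2
      requests_per_second_limit: 1
```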

runtime.source_rate_control

Optional. Configures how Spice limits outbound requests to upstream data sources, and optionally enables cluster-wide coordination through persisted state in object storage.

Without state_location, rate limits are local to each Spice instance. When state_location is set, Spice instances coordinate through object storage so that a configured limit is shared across the cluster. For example, requests_per_second_limit: 20 means approximately 20 RPS total across all replicas, not 20 RPS per replica.

```yaml
runtime:
  source_rate_control:
    state_location: s3://my-bucket/spice/rate-control/
    refresh_interval: 30s
    params:
      s3_region: us-west-2
      s3_key: ${ secrets:AWS_ACCESS_KEY_ID }
      s3_secret: ${ secrets:AWS_SECRET_ACCESS_KEY }
    github_concurrent_connections_limit: 10
```

| Parameter Name | Optional | Default | Description |
| --- | --- | --- | --- |
| `state_location` | Yes | - | Root URI for globally persisted rate-control state (e.g. `s3://bucket/path/`). Enables cluster-wide rate control when set. Without this, limits are local to each Spice instance. |
| `params` | Yes | - | Object-store authentication parameters for `state_location`. Supports the same keys as other object-store configurations (e.g. `s3_region`, `s3_key`, `s3_secret` for S3; `account`, `access_key` for Azure). Supports `${ secrets:NAME }` references. |
| `refresh_interval` | Yes | `30s` | How often each instance refreshes and persists per-source rate-control state. Longer intervals reduce object-store writes but adapt more slowly to demand changes. |
| `github_concurrent_connections_limit` | Yes | `10` | Maximum number of concurrent GitHub HTTP requests per authentication context. Replaces the deprecated `runtime.params.github_max_concurrent_connections`. |

HTTP/API rate limits are configured through runtime.params (cluster defaults) and per-dataset overrides. Precedence is:

dataset param > runtime.params.http_* default > unset

When state_location is set, the configured RPS/RPM quota is converted into a token budget per lease window and distributed across replicas using a demand-weighted leased token-bucket model.

runtime.functions

Controls whether functions declared in the top-level functions: section (and tools: entries with as_sql: true) are registered with the SQL engine. Defaults to disabled.

```yaml
runtime:
  functions:
    enabled: true
```

| Parameter | Optional | Default | Description |
| --- | --- | --- | --- |
| `enabled` | Yes | `false` | When `true`, the runtime registers `functions:` entries and exposes them via SQL and `/v1/functions`. |

When disabled, the functions: block is parsed but not registered, list_udfs() returns no user-source rows, and GET /v1/functions returns an empty array.

See the Functions Spicepod reference for the function declaration schema.

runtime.shutdown_timeout

Controls how long Spice waits for connections to be gracefully drained and for components to shut down cleanly during runtime termination. Defaults to 30 seconds.

```yaml
runtime:
  shutdown_timeout: 1m
```

runtime.tls

The TLS section specifies the configuration for enabling Transport Layer Security (TLS) for all endpoints exposed by the runtime. Learn more about enabling TLS.

In addition to configuring TLS via the manifest, TLS can also be configured via spiced command line arguments using the --tls-enabled true flag along with --tls-certificate/--tls-certificate-file and --tls-key/--tls-key-file.

runtime.tls.enabled

Enables or disables TLS for the runtime endpoints.

```yaml
runtime:
  tls:
    ...
    enabled: true # or false
```

runtime.tls.certificate

The TLS certificate to use for securing the runtime endpoints. The certificate can also come from secrets.

```yaml
runtime:
  tls:
    ...
    certificate: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
```

```yaml
runtime:
  tls:
    ...
    certificate: ${secrets:tls_cert}
```

runtime.tls.certificate_file

The path to the TLS PEM-encoded certificate file. Only one of certificate or certificate_file may be specified.

```yaml
runtime:
  tls:
    ...
    certificate_file: /path/to/cert.pem
```

runtime.tls.key

The TLS key to use for securing the runtime endpoints. The key can also come from secrets.

```yaml
runtime:
  tls:
    ...
    key: |
      -----BEGIN PRIVATE KEY-----
      ...
      -----END PRIVATE KEY-----
```

```yaml
runtime:
  tls:
    ...
    key: ${secrets:tls_key}
```

runtime.tls.key_file

The path to the TLS PEM-encoded key file. Only one of key or key_file may be specified.

```yaml
runtime:
  tls:
    ...
    key_file: /path/to/key.pem
```

runtime.tls.client_auth_mode

Enterprise Feature

mTLS (client certificate authentication) is included in the Enterprise distribution of Spice.ai. Learn more.

Controls whether the runtime requires, requests, or ignores client certificates on its public endpoints (HTTP, Flight, Metrics). Defaults to none.

| Mode | Behavior |
| --- | --- |
| `none` (default) | Standard one-way TLS. No client certificate is requested. |
| `request` | The server sends a CertificateRequest but accepts connections without a certificate. Presented certificates are verified against the configured CA. Useful for migration or audit-only deployments. |
| `required` | A valid client certificate is required. The Flight (gRPC) listener rejects connections without a certificate at the TLS handshake. The HTTP listener admits no-cert connections so `/health` and `/v1/ready` remain accessible for Kubernetes probes, but all other HTTP endpoints return 401 without a verified client certificate. The metrics listener has no client-auth gate. |

Requires client_auth_ca_file or client_auth_ca to be set when mode is request or required.

```yaml
runtime:
  tls:
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
    client_auth_mode: required
    client_auth_ca_file: /path/to/client-ca.pem
```

runtime.tls.client_auth_ca_file

Path to a PEM-encoded CA bundle used to verify client certificates. The file is watched for changes and reloaded atomically alongside the server certificate and key.

```yaml
runtime:
  tls:
    ...
    client_auth_ca_file: /path/to/client-ca.pem
```

runtime.tls.client_auth_ca

Inline PEM (or ${ secrets:... }) form of the client CA bundle. Mutually exclusive with client_auth_ca_file. Inline material is loaded once at startup and is not hot-reloaded.

```yaml
runtime:
  tls:
    ...
    client_auth_ca: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
```

runtime.task_history

The task history section specifies runtime task history configuration. For more details, see the Task History documentation.

```yaml
runtime:
  task_history:
    enabled: true
    captured_output: none
    retention_period: 8h
    retention_check_interval: 15m
    min_sql_duration: 5s
```

| Parameter name | Optional | Description |
| --- | --- | --- |
| `enabled` | Yes | Whether task history is enabled. Defaults to `true`. |
| `captured_output` | Yes | Specifies the level of output captured by the task history table. Defaults to `none`. |
| `captured_plan` | Yes | Controls SQL query plan capture. Options: `none` (default), `explain`, or `explain analyze`. Query plans are captured asynchronously after query completion. |
| `min_sql_duration` | Yes | Minimum query execution duration before a plan is captured. Only queries exceeding this threshold are captured. Example: `5s`. |
| `min_plan_duration` | Yes | Minimum plan execution duration before a plan is captured. This threshold applies to the execution time of the `EXPLAIN` operation itself. Example: `10s`. |
| `retention_period` | Yes | Specifies how long records in the task history table are retained. Defaults to `8h` (8 hours). |
| `retention_check_interval` | Yes | Specifies how often old records are checked for removal. Defaults to `15m` (15 minutes). |
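For example, to capture explain plans only for slower queries (the duration values are illustrative):

```yaml
runtime:
  task_history:
    enabled: true
    captured_plan: explain
    min_sql_duration: 5s
    min_plan_duration: 10s
```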

runtime.cors

The CORS section specifies the configuration for enabling Cross-Origin Resource Sharing (CORS) for the HTTP endpoint. By default, CORS is disabled.

Default configuration:

```yaml
runtime:
  cors:
    enabled: false
```

runtime.cors.enabled

Enables or disables CORS for the HTTP endpoint. Defaults to false.

runtime.cors.allowed_origins

A list of allowed origins for CORS requests. Defaults to ["*"], which permits all origins.

Example:

```yaml
runtime:
  cors:
    enabled: true
    allowed_origins: ['https://example.com']
```

This configuration permits requests only from the https://example.com origin.

runtime.query.memory_limit

The memory_limit parameter sets a memory usage cap for the Spice runtime query engine. This limit applies only to the query engine and should be used in addition to other memory configuration options, such as duckdb_memory_limit. When the limit is reached, DataFusion spills intermediate data to disk using the directory configured in runtime.query.temp_directory.

If not specified, defaults to 90% of total system memory (container-aware).

```yaml
runtime:
  query:
    memory_limit: 4GiB
```

Specify the value as a size, for example 4GiB or 1024MiB.

For detailed memory information, see Memory.

runtime.query.spill_compression

The spill_compression parameter configures compression for spill files generated during large query execution in the Spice runtime.

Supported values:

  • zstd (default): Enables high compression ratios for spill files, reducing disk usage but with moderate (de)compression speed.
  • lz4_frame: Provides faster (de)compression, resulting in larger spill files and potentially higher disk usage.
  • uncompressed: Disables compression. Spill files will be the largest, but with no (de)compression overhead.

```yaml
runtime:
  query:
    spill_compression: lz4_frame
```

This setting controls the trade-off between disk space usage and query performance for large-scale analytics workloads.

runtime.query.temp_directory

The path to a temporary directory that Spice uses for query and acceleration operations that spill to disk. For more details, see the Managing Memory Usage documentation and the DuckDB Data Accelerator documentation.

```yaml
runtime:
  query:
    temp_directory: /tmp/spice
```

runtime.output_level

Controls verbosity in addition to the existing CLI and environment variable support. Supported values are info, verbose, and very_verbose. The value is applied in the following priority: CLI, then environment variables, then YAML configuration.

```yaml
runtime:
  output_level: info # or verbose, very_verbose
```

runtime.telemetry

The telemetry section configures runtime telemetry collection and export. Learn more.

```yaml
runtime:
  telemetry:
    enabled: true
    otel_exporter:
      enabled: true
      endpoint: 'localhost:4317'
      push_interval: '5m'
```

runtime.telemetry.enabled

Enables or disables runtime telemetry collection. Defaults to true.

runtime.telemetry.metric_prefix

Optional string prepended to every exported metric name. Useful for namespacing Spice metrics in shared backends (e.g. Datadog, Grafana Cloud, New Relic) so they do not collide with metrics from other services. Defaults to no prefix.

The prefix applies to all metric readers — the Prometheus scrape endpoint (--metrics), the cluster on-demand OTLP reader, and the otel_exporter push exporter — because OpenTelemetry views are configured at the meter-provider level rather than per reader.

```yaml
runtime:
  telemetry:
    metric_prefix: 'spiceai.'
```

With this configuration, the runtime metric query_duration_ms is exported as spiceai.query_duration_ms.

runtime.telemetry.properties

Map of custom key/value attributes attached to telemetry metrics emitted by spiced. Applied as OpenTelemetry resource attributes on the runtime's MeterProvider, so they appear as dimensions/tags on every metric exported via the Prometheus scrape endpoint, the cluster on-demand OTLP reader, and the otel_exporter push exporter. Defaults to empty.

```yaml
runtime:
  telemetry:
    properties:
      environment: prod
      region: us-west-2
      team: data-platform
```

The standard OpenTelemetry environment variables (OTEL_SERVICE_NAME, OTEL_RESOURCE_ATTRIBUTES) are still honored and act as defaults; explicit properties entries take precedence on key conflicts.

For backends that map OTLP resource attributes to tags through additional configuration (e.g. Datadog), see the Datadog OTLP guide.

runtime.telemetry.otel_exporter

Configures an OpenTelemetry metrics exporter to push metrics to an OpenTelemetry collector. The exporter automatically infers the protocol (gRPC or HTTP) based on the endpoint configuration.

| Parameter name | Optional | Default | Description |
| --- | --- | --- | --- |
| `enabled` | Yes | `true` | Whether the OpenTelemetry exporter is enabled. |
| `endpoint` | No | - | The OpenTelemetry collector endpoint. Protocol is inferred from the format (see examples below). |
| `push_interval` | Yes | `60s` | How frequently metrics are pushed to the collector. Specify as a duration. |
| `metrics` | Yes | `[]` | List of metric names to export. When empty (default), all metrics are exported. |
| `headers` | Yes | `{}` | Map of headers to send with each export request. For HTTP these are sent as HTTP headers; for gRPC they are sent as metadata entries (keys must be lowercase ASCII). Values support the `${secrets:...}` replacement syntax for loading credentials from a secret store. |

Protocol inference:

  • gRPC (default): Use a bare host:port endpoint without a scheme (e.g., localhost:4317). gRPC uses port 4317 by default.
  • HTTP: Include the http:// or https:// scheme and the /v1/metrics path (e.g., http://localhost:4318/v1/metrics). HTTP uses port 4318 by default.

Examples:

gRPC configuration:

```yaml
runtime:
  telemetry:
    enabled: true
    otel_exporter:
      # gRPC - no scheme or path needed
      endpoint: 'localhost:4317'
      push_interval: '30s'
```

HTTP configuration:

```yaml
runtime:
  telemetry:
    enabled: true
    otel_exporter:
      enabled: true
      # HTTP - include scheme and /v1/metrics path
      endpoint: 'http://localhost:4318/v1/metrics'
      push_interval: '30s'
```

With metric filtering (export only specific metrics):

```yaml
runtime:
  telemetry:
    enabled: true
    otel_exporter:
      endpoint: 'localhost:4317'
      push_interval: '30s'
      metrics:
        - query_duration_ms
        - query_executions
        - dataset_load_state
```

Filtering happens after metric_prefix is applied

The whitelist is matched against the final metric name, after runtime.telemetry.metric_prefix has been prepended. If you set metric_prefix: 'spiceai.', the entries under metrics: must include the prefix (e.g. spiceai.query_duration_ms), otherwise nothing will match and no metrics will be exported.

Authenticated exporters:

For collectors that require authentication, set the headers map. Load credentials from a secret store via ${secrets:...} rather than committing them to source.

Datadog (OTLP/HTTP) — replace us3 with your Datadog site:

```yaml
runtime:
  telemetry:
    otel_exporter:
      endpoint: 'https://otlp.us3.datadoghq.com/v1/metrics'
      headers:
        DD-API-KEY: ${secrets:dd_api_key}
```

Grafana Cloud (OTLP/HTTP) — use the base64 instanceID:accessPolicyToken from the Grafana Cloud OpenTelemetry connection page:

```yaml
runtime:
  telemetry:
    otel_exporter:
      endpoint: 'https://otlp-gateway-us-central2.grafana.net/otlp/v1/metrics'
      headers:
        Authorization: 'Basic ${secrets:grafana_cloud_auth}'
```

gRPC collector with auth metadata (keys must be lowercase ASCII):

```yaml
runtime:
  telemetry:
    otel_exporter:
      endpoint: 'otel-collector.internal:4317'
      headers:
        api-key: ${secrets:collector_api_key}
```

runtime.metrics

Configures individual runtime metrics, including several that are disabled by default.

The following metrics are disabled by default:

  • dataset_acceleration_max_timestamp_before_refresh_ms
  • dataset_acceleration_max_timestamp_after_refresh_ms
  • dataset_acceleration_refresh_lag_ms
  • dataset_acceleration_ingestion_lag_ms

For details about these metrics, see Observability.

```yaml
runtime:
  metrics:
    - name: dataset_acceleration_max_timestamp_before_refresh_ms
    - name: dataset_acceleration_max_timestamp_after_refresh_ms
      enabled: true
    - name: dataset_acceleration_refresh_lag_ms
      enabled: false
    - name: dataset_acceleration_ingestion_lag_ms
```

runtime.flight

Configures Arrow Flight protocol settings for the runtime.

```yaml
runtime:
  flight:
    max_message_size: 16MiB
    do_put_rate_limit_enabled: true
```

| Parameter name | Optional | Default | Description |
| --- | --- | --- | --- |
| `max_message_size` | Yes | - | Maximum size of a single Arrow Flight message. |
| `do_put_rate_limit_enabled` | Yes | `true` | Whether rate limiting is applied to DoPut Arrow Flight operations. |

runtime.mcp

Configures settings for the Spice MCP server endpoint (/v1/mcp).

runtime.mcp.allowed_hosts

Controls which Host header values are accepted on the /v1/mcp endpoint. This prevents DNS rebinding attacks against the MCP server.

| Configuration | Behavior |
| --- | --- |
| Default (not set) | Only `localhost`, `127.0.0.1`, and `::1` are permitted. Requests with any other Host value receive 403 Forbidden. |
| Explicit list | Replaces the defaults entirely. Only the listed hosts are accepted. |
| Wildcard (`["*"]`) | Disables host checking; all Host header values are accepted. |

```yaml
runtime:
  mcp:
    allowed_hosts:
      - localhost
      - my-host.internal:8090
```

To disable host checking entirely:

```yaml
runtime:
  mcp:
    allowed_hosts:
      - "*"
```

Each entry can be a bare hostname (example.com), a host-port pair (example.com:8090), or a full origin URL (https://example.com).

runtime.ready_state

Controls when the runtime readiness probe (/v1/ready) reports the runtime as ready. This is particularly useful for Kubernetes readiness probes.

```yaml
runtime:
  ready_state: on_load
```

| Value | Description |
| --- | --- |
| `on_load` (default) | The runtime reports ready after all components (datasets, models, etc.) have loaded successfully. |
| `on_registration` | The runtime reports ready as soon as all components have been registered, before they finish loading. |

runtime.scheduler

Configures the cluster scheduler when running Spice in cluster mode. This section is relevant only when using --role scheduler.

```yaml
runtime:
  scheduler:
    state_location: s3://my-bucket/spice-cluster-state/
    params:
      s3_region: us-east-1
    partition_management:
      interval: 30s
      max_assignments_per_cycle: 100
      max_partitions_per_executor: 1000
      discovery_timeout: 60s
```

| Parameter name | Optional | Default | Description |
| --- | --- | --- | --- |
| `state_location` | No | - | Root URI for shared cluster state storage (e.g. `s3://bucket/path/`). |
| `params` | Yes | - | Object store parameters (e.g. `aws_region`). |
| `partition_management.interval` | Yes | `30s` | How often the scheduler runs partition assignment cycles. |
| `partition_management.max_assignments_per_cycle` | Yes | `100` | Maximum number of partition assignments per cycle. |
| `partition_management.max_partitions_per_executor` | Yes | `1000` | Maximum number of partitions assigned to a single executor. |
| `partition_management.discovery_timeout` | Yes | `60s` | How long the scheduler waits for executor discovery before timing out. |