Version: Next

Spice.ai Deployment Guide

Spice runs as a single binary, a container, a Kubernetes workload, or a fully managed app on the Spice Cloud Platform. This guide helps you choose a target environment and a deployment architecture that match your application's latency, scale, and operational requirements.

Choose a deployment target

Most users fall into one of three groups:

Self-hosted enterprise deployments

For production self-hosted clusters, the Spice.ai Enterprise Kubernetes Operator provides per-replica StatefulSets, automatic PVC resizing, configurable update strategies, crashloop protection, and distributed query execution through SpicepodSet and SpicepodCluster custom resources.

Deployment architectures

Architecture refers to where Spice runs in relation to the application and data sources, and how it scales. Pick an architecture before choosing a guide; the same target environment can host any of these patterns.

  • Overview — when to choose each architecture.
  • Sidecar — Spice runs alongside the application for the lowest latency.
  • Microservice — single or multiple replicas behind a load balancer.
  • Tiered — separate read and write tiers for mixed workloads.
  • Cluster-Sidecar — combine local and remote Spice instances.
  • Hosted — managed on the Spice Cloud Platform.
  • Sharded — partition data across multiple Spice instances.
  • Cluster — distributed query execution with Spice.ai Enterprise.
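The list above can be read as a lookup from a primary requirement to an architecture. The following is a minimal sketch of that reading, not part of Spice itself: the requirement labels and the `suggest_architecture` helper are hypothetical names introduced for illustration, and only the architecture names come from the list.

```python
# Hypothetical requirement labels (assumptions for illustration) mapped to the
# architecture names from the list above.
ARCHITECTURE_BY_REQUIREMENT = {
    "lowest_latency": "Sidecar",
    "replicas_behind_load_balancer": "Microservice",
    "mixed_read_write": "Tiered",
    "local_and_remote": "Cluster-Sidecar",
    "managed": "Hosted",
    "partitioned_data": "Sharded",
    "distributed_query": "Cluster",
}


def suggest_architecture(requirement: str) -> str:
    """Return the architecture whose one-line description matches the requirement."""
    try:
        return ARCHITECTURE_BY_REQUIREMENT[requirement]
    except KeyError:
        known = ", ".join(sorted(ARCHITECTURE_BY_REQUIREMENT))
        raise ValueError(
            f"unknown requirement {requirement!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(suggest_architecture("lowest_latency"))  # prints "Sidecar"
```

Real deployments usually weigh several requirements at once, so treat the result as a starting point for the Overview page rather than a final decision.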

Deployment guides

Step-by-step instructions for each target environment.

Guide | When to use
Kubernetes | Self-hosted production deployments. Covers Helm, Argo CD, and Flux.
Docker | Local development, single-host deployments, and container-based pipelines.
Spice Cloud | Fully managed deployments without operating infrastructure.
AWS | Deployments on AWS using the published CloudFormation template.
Azure | Deployments on Azure using ARM/Bicep templates.
GCP | Deployments on Google Cloud using GKE, Cloud Run, or Compute Engine.
CI/CD | Automating any of the above through pipelines or GitOps.
Read/Write Separation | Production pattern that splits ingest from reads using shared snapshots.
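To illustrate the idea behind the Read/Write Separation entry, here is a minimal client-side sketch that sends read statements to one tier and everything else to another. It is a generic illustration, not how Spice implements the pattern: the `TieredRouter` class, the endpoint URLs, and the keyword-based classification are all assumptions, and the sketch does not model shared snapshots.

```python
from dataclasses import dataclass

# Leading SQL keywords treated as reads. This is a naive heuristic chosen for
# the sketch; it only inspects the first whitespace-delimited token.
READ_KEYWORDS = {"SELECT", "WITH", "EXPLAIN", "SHOW"}


@dataclass(frozen=True)
class TieredRouter:
    """Pick an endpoint per statement. Both endpoints are hypothetical."""

    read_endpoint: str
    write_endpoint: str

    def route(self, sql: str) -> str:
        stripped = sql.strip()
        # An empty statement has no leading keyword and falls through to the write tier.
        first = stripped.split(None, 1)[0].upper() if stripped else ""
        return self.read_endpoint if first in READ_KEYWORDS else self.write_endpoint


if __name__ == "__main__":
    router = TieredRouter(
        read_endpoint="http://spice-read.example:8090",    # placeholder host
        write_endpoint="http://spice-write.example:8090",  # placeholder host
    )
    print(router.route("SELECT * FROM orders"))
    print(router.route("INSERT INTO orders VALUES (1)"))
```

In practice the split is more often made at the deployment level, for example with separate services behind separate addresses, so that applications do not need to classify statements themselves.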