Configuration

bakelite is driven by a single TOML file. A [defaults] table applies to every database; any field can be overridden inline inside a [[database]] block.

A minimal config

[[database]]
name = "app"
path = "/var/lib/app/app.db"
  [[database.backends]]
  type = "file"
  path = "/var/backups/bakelite"

That's enough to run. Every tunable below has a sensible default, so leave them alone until you need to change a specific behaviour.

Defaults & tunables

[defaults]
debounce              = "250ms"   # coalesce write bursts into one sync
max_batch_delay       = "1s"      # force a flush at least this often (~RPO)
safety_poll           = "30s"     # cheap fallback poll; NOT a busy loop
busy_timeout          = "5s"
max_wal_size          = "16MiB"   # trigger a checkpoint when the WAL grows past this
manifest_flush        = "10s"     # how often to flush the manifest index
snapshot_interval     = "1d"      # daily full snapshot
retention             = "0s"      # 0s = keep all backups; "7d" = 7 days
compaction_levels     = ["30s", "5m", "1h"]   # consolidation windows per level; [] disables
compaction_keep_recent = 16       # keep this many recent change-sets fine-grained for point-in-time restore
max_segments_per_snapshot = 0   # 0 = disabled; new full backup after this many change-sets to bound restore-chain length
max_total_size        = 0         # 0 = no cap; e.g. "5GiB" to bound stored bytes per db
max_backups           = 0         # 0 = no cap; e.g. 30 to bound retained backups
multipart_sweep_interval = "1h"   # S3: how often to abort orphaned multipart uploads; "0s" disables
multipart_min_age     = "1h"      # S3: only abort uploads older than this (protects in-flight uploads)
mirror_interval       = "60s"     # how often background async mirrors reconcile (see Multi-destination)
compression           = "zstd"    # "zstd" | "lz4" | "none"
validate_on_restore   = true      # multi-destination: validate objects and fall back to a healthy copy
on_incomplete_destination = "backfill"  # multi-destination: backfill | refuse | warn (see Multi-destination)

| Key | Meaning |
| --- | --- |
| `debounce` | Coalesce a burst of writes into a single sync. |
| `max_batch_delay` | Upper bound on how long a write waits before being shipped (effectively your RPO). |
| `safety_poll` | A cheap fallback poll in case a filesystem event is missed. Not a busy loop. |
| `busy_timeout` | SQLite busy timeout for the control connection. |
| `max_wal_size` | When the WAL crosses this, bakelite ships everything and TRUNCATEs the WAL (incremental checkpoint). Accepts a unit string like `16MiB` / `512kB` or a raw byte count. |
| `manifest_flush` | How often to flush the manifest index to the backend. The stored change-set objects are the source of truth, so restore/resume reconcile a lagging manifest by listing; lower = less work to recover after a crash, higher = fewer backend writes. Always flushed at snapshot, checkpoint, compaction, and graceful shutdown. |
| `snapshot_interval` | How often to take a fresh full backup. Shorter intervals keep point-in-time restore fast and memory-light; longer ones reduce storage overhead. |
| `retention` | Prune backups older than this window. `"0s"` keeps everything; the current backup is never pruned. |
| `compaction_levels` | Consolidation windows, one per level (e.g. `["30s", "5m", "1h"]` = merge into 30s windows, then 5m, then 1h). bakelite automatically merges older incremental change-sets into coarser windows as they age, so storage and per-restore object count stay bounded while recent restore stays fine-grained. `[]` disables. Windows must strictly increase. |
| `compaction_keep_recent` | Keep this many recent incremental change-sets un-merged so recent point-in-time restore stays precise. |
| `max_segments_per_snapshot` | Take a fresh full backup once this many incremental change-sets have shipped since the last one, bounding restore-chain length even on a near-idle database whose `snapshot_interval` rarely fires. `0` disables (re-snapshotting stays purely time-/size-driven). |
| `max_total_size` | Ceiling on total stored bytes per database (the bill-shock guard). When a new backup pushes the replica past this, bakelite prunes the oldest backups to get back under, but never the current one. `0` disables. Accepts a unit string like `5GiB`. |
| `max_backups` | Cap on retained backups per database; excess oldest backups are pruned (never the current one). `0` disables. |
| `multipart_sweep_interval` | S3 only: how often the daemon aborts orphaned incomplete multipart uploads (snapshot uploads a crash left unfinalized). `"0s"` disables. No-op on non-S3 backends and on hosts without static AWS env credentials. See S3 storage overhead. |
| `multipart_min_age` | S3 only: only abort multipart uploads at least this old, so an in-flight snapshot upload is never aborted by the sweep. `bakelite reclaim --min-age` overrides per invocation. |
| `mirror_interval` | How often the background reconciler copies new objects to each `async` mirror destination (Multi-destination → Async mirrors). No-op when a database has no `async` destinations. |
| `compression` | Page compression: `zstd`, `lz4`, or `none`. |
| `validate_on_restore` | When a database has 2+ destinations, validate every object on restore/verify/list and transparently fall through to a healthy copy on another destination if one has bit-rotted. Default `true`; free for a single-destination database (it's only built when there's a sibling to fall back to). See Redundancy & bit-rot recovery. |
| `on_incomplete_destination` | Multi-destination only: what to do on resume when a destination is missing part of the current backup chain, typically a fan-out destination you added or repointed at a new location after replication began. `backfill` (default) copies the existing chain onto it so it's immediately restorable; `refuse` stops the database until you resolve it (fail-safe); `warn` logs and keeps going. No effect with a single destination. See Adding or moving a destination. |
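
The `compaction_levels` rule that windows must strictly increase can be checked in a few lines. This is a minimal sketch, not bakelite's own validation: `validate_compaction_levels` is a hypothetical helper, and it assumes the windows have already been converted to whole seconds.

```python
def validate_compaction_levels(windows_s: list[int]) -> None:
    """Raise ValueError unless each window is strictly larger than the last.

    An empty list is accepted: per the table, [] disables compaction.
    """
    for level, (prev, cur) in enumerate(zip(windows_s, windows_s[1:]), start=1):
        if cur <= prev:
            raise ValueError(
                f"compaction level {level} ({cur}s) must be larger than "
                f"level {level - 1} ({prev}s)"
            )


# ["30s", "5m", "1h"] expressed in seconds passes the check.
validate_compaction_levels([30, 300, 3600])
```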

Duration fields accept a unit-suffixed string: ms, s (or sec/secs), m (or min/mins), h (or hr/hrs), d (or day/days) — e.g. "250ms", "30s", "5m", "1h", "7d". Bare integers are deliberately rejected so a missing unit can't silently mean the wrong thing (250 interpreted as ms vs s is a 1000× footgun).
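
To make the accepted forms concrete, here is a rough sketch of a parser that follows the rules above. It is not bakelite's parser: `parse_duration_ms` is a hypothetical name, and the sketch assumes whole-number values with no space before the unit, which the text does not specify.

```python
import re

# Unit aliases listed above, mapped to milliseconds.
_UNIT_MS = {
    "ms": 1,
    "s": 1_000, "sec": 1_000, "secs": 1_000,
    "m": 60_000, "min": 60_000, "mins": 60_000,
    "h": 3_600_000, "hr": 3_600_000, "hrs": 3_600_000,
    "d": 86_400_000, "day": 86_400_000, "days": 86_400_000,
}

# Digits followed directly by a unit; a bare integer does not match.
_DURATION_RE = re.compile(r"(\d+)([a-z]+)")


def parse_duration_ms(text: str) -> int:
    """Return the duration in milliseconds, or raise ValueError."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"duration needs a unit suffix such as '250ms': {text!r}")
    value, unit = match.groups()
    if unit not in _UNIT_MS:
        raise ValueError(f"unknown duration unit {unit!r} in {text!r}")
    return int(value) * _UNIT_MS[unit]
```

With this sketch, `parse_duration_ms("250ms")` yields 250 and `parse_duration_ms("250")` raises, which mirrors the "no bare integers" rule.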

Byte-size fields accept either a raw integer (bytes) or a unit string: SI units (kB, MB, GB) are 1000-based and IEC units (KiB, MiB, GiB) are 1024-based.
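
The SI/IEC distinction is easy to get wrong, so the following sketch spells it out. It is illustrative only: `parse_size_bytes` is a hypothetical helper, and case-sensitive unit matching with no space before the unit is an assumption rather than documented behaviour.

```python
import re

# SI units are 1000-based, IEC units are 1024-based (as stated above).
_SIZE_UNITS = {
    "kB": 1000, "MB": 1000**2, "GB": 1000**3,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3,
}

_SIZE_RE = re.compile(r"(\d+)([A-Za-z]+)")


def parse_size_bytes(value) -> int:
    """Accept a raw integer (bytes) or a unit string such as '16MiB'."""
    if isinstance(value, int) and not isinstance(value, bool):
        if value < 0:
            raise ValueError("size must not be negative")
        return value
    match = _SIZE_RE.fullmatch(str(value).strip())
    if match is None:
        raise ValueError(f"not a size: {value!r}")
    number, unit = match.groups()
    if unit not in _SIZE_UNITS:
        raise ValueError(f"unknown size unit {unit!r} in {value!r}")
    return int(number) * _SIZE_UNITS[unit]
```

Under these rules `"5GB"` is 5,000,000,000 bytes while `"5GiB"` is 5,368,709,120 bytes, a difference of about 7%.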

Every tunable can be set under [defaults] or overridden per [[database]]:

[[database]]
name = "analytics"
path = "/data/analytics.db"
safety_poll = "15s"                   # per-database override

Daemon tuning

Process-wide knobs that affect the daemon itself, not any one database. The matching CLI flags (e.g. bakelite daemon --max-concurrent-snapshots 1) override these for a single invocation; without a flag the daemon uses the configured value, and without either it falls back to its built-in default.

[daemon]
max_concurrent_snapshots = 4               # 0 (or absent) = auto: one at a time
snapshot_workers         = 2               # 0 (or absent) = auto: single-threaded
metrics_addr             = "127.0.0.1:9090"   # omit to disable the endpoint

| Key | Meaning |
| --- | --- |
| `max_concurrent_snapshots` | Cap on databases snapshotting concurrently across the whole daemon. Defaults to one at a time for a low, steady footprint; raise it to bootstrap many databases faster on a host with spare cores (at a sharper CPU/IO spike). |
| `snapshot_workers` | zstd worker threads per snapshot. Defaults to single-threaded (lowest memory); larger values speed up one big database's snapshot in exchange for more RSS. |
| `metrics_addr` | Prometheus scrape endpoint (host:port). See `bakelite daemon`'s `--metrics-addr` for the exposed series. |

Setting any of these in the config means the systemd unit can stay generic — no ExecStart override needed when you re-tune for a new host.

Storage limits & usage

Object stores have no natural ceiling, so a backup that just keeps growing can quietly run up a bill: even cheap storage adds up as retained data accumulates. bakelite has two limits for that, plus a way to see what's actually stored.

Limits (max_total_size and max_backups, both opt-in). When a new backup would push a database past a cap, bakelite prunes the oldest backups until it's back under, and it never deletes the current backup. If a cap still can't be met (e.g. the current backup alone is larger than max_total_size), bakelite logs a warning and the usage command flags the database, rather than throwing away your only restorable copy.
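
The pruning rule described above can be sketched as follows. This is an illustration of the stated behaviour, not bakelite's implementation: the `Backup` type and `prune_to_limits` function are hypothetical, and backups are assumed to be ordered oldest first with the current backup last.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    name: str
    size_bytes: int


def prune_to_limits(backups, max_total_size=0, max_backups=0):
    """Return (kept, pruned, still_over). A limit of 0 means 'no cap'."""
    kept = list(backups)
    pruned = []

    def over_cap() -> bool:
        too_big = max_total_size > 0 and sum(b.size_bytes for b in kept) > max_total_size
        too_many = max_backups > 0 and len(kept) > max_backups
        return too_big or too_many

    # Drop the oldest backup until both caps are met, but never the last
    # (current) one: that is the only restorable copy.
    while over_cap() and len(kept) > 1:
        pruned.append(kept.pop(0))

    # still_over=True corresponds to the "warn and flag" case in the text.
    return kept, pruned, over_cap()
```

For three 400-byte backups and `max_total_size=1000`, the sketch prunes only the oldest; with `max_total_size=100` it keeps the current backup and reports that the cap is still exceeded.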

bakelite usage reports what's actually stored:

bakelite usage                 # one line per database: total, backups, age
bakelite usage --db app        # per-backup breakdown (full vs incremental)
bakelite usage --db app --json # machine-readable, for monitoring/alerting

When limits are configured, the output shows usage against the cap (146 KiB / 1.00 MiB (14%)) and flags any database that is over.

Backends

Local filesystem

[[database.backends]]
type = "file"
path = "/var/backups/bakelite"

S3-compatible

Works with AWS S3, Cloudflare R2, Backblaze B2, MinIO, and others — see Backends for the per-provider compatibility matrix and how each is tested. Credentials come from the environment (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY). bakelite loads /etc/bakelite/bakelite.env at startup automatically (the same file systemd reads, also picked up by interactive CLI invocations) — see Environment variables for the full list and load order. If no credentials are reachable, the command fails immediately with a clear "no S3 credentials found" error.

On EC2/ECS/EKS? If you rely on an instance profile, task role, or pod identity instead of static keys, set BAKELITE_AWS_USE_DEFAULT_CREDENTIAL_CHAIN=1 to opt into the AWS default credential chain (including the instance-metadata service). Without it, bakelite refuses to probe instance metadata, so a missing key on a non-AWS host fails fast instead of timing out against 169.254.169.254.

[[database.backends]]
type = "s3"
bucket = "my-bucket"
prefix = "bakelite"
endpoint = "https://ACCOUNT.r2.cloudflarestorage.com"
region = "auto"
# force_path_style = true           # default when `endpoint` is set
# request_timeout = "30s"           # cap on a single request (also on gcs/azure)
# max_retries = 10                  # retry attempts per request (also on gcs/azure)

With a custom endpoint, path-style addressing is the default (required by B2 and MinIO), and region must match the service.

request_timeout and max_retries (the latter on the object-store backends — S3, GCS, Azure) tune the underlying client. Leave them unset to keep object_store's defaults, which already retry with exponential backoff and time out requests; set them only to tighten or loosen behaviour for a particular link.

IAM permissions

bakelite needs to read, write, and delete objects, plus a handful of bucket-level list/inspect actions. It never creates the bucket, changes its configuration, or deletes object versions. This least-privilege policy (attach it to the user or group whose credentials bakelite runs as) covers everything — replication, multipart-upload reclaim, and bakelite usage reporting:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BakeliteObjectRW",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*"
    },
    {
      "Sid": "BakeliteBucketList",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:GetBucketVersioning",
        "s3:ListBucketVersions",
        "s3:GetBucketObjectLockConfiguration"
      ],
      "Resource": "arn:aws:s3:::my-bucket"
    }
  ]
}

| What bakelite does | S3 action(s) |
| --- | --- |
| Read/write/delete snapshots, segments, the manifest cache, `CURRENT` (large snapshots stream as multipart) | `GetObject`, `PutObject`, `DeleteObject` |
| List snapshots and segments | `ListBucket` |
| Sweep and abort orphaned multipart uploads | `ListBucketMultipartUploads`, `AbortMultipartUpload` |
| `bakelite usage` version/overhead reporting | `GetBucketVersioning`, `ListBucketVersions` |
| `bakelite doctor` Object Lock verification (`expect_object_lock`) | `GetBucketObjectLockConfiguration` |

A few things to note:

- Object vs. bucket ARNs. The object actions target my-bucket/*; the bucket-level list/inspect actions target my-bucket (no /*). They have to be split across the two resources: a bucket action on a /* resource (or the reverse) silently never matches.
- Deliberately absent. No s3:CreateBucket / s3:PutBucket*, no s3:DeleteObjectVersion, and no s3:BypassGovernanceRetention. Expiring noncurrent versions is the bucket owner's job, not bakelite's (see the versioning discussion below for why), and withholding the bypass permission means bakelite can't shorten an Object Lock retention even if its credentials are compromised.
- Tightening to a prefix. On a bucket shared with other workloads, narrow the object ARN to my-bucket/bakelite/* (matching the prefix in the TOML above) and optionally add "Condition": {"StringLike": {"s3:prefix": ["bakelite/*"]}} to the ListBucket statement. Leave the multipart and versioning actions bucket-wide: bakelite filters those client-side, so they aren't reliably prefix-scopable. See Shared backend for many databases.
- Reporting needs static env credentials. The GetBucketVersioning / ListBucketVersions calls run on static AWS env credentials (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY), so on IAM-role / instance-metadata hosts they no-op and bakelite usage reports the overhead as "not inspected" regardless of this policy.
- SSE-KMS adds key permissions. If you set sse = "aws:kms" (see Server-side encryption and upload integrity), the credentials also need kms:GenerateDataKey (to write) and kms:Decrypt (to read/restore) on the configured key, granted on the KMS key's policy rather than in the S3 policy above.
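
If you generate the policy rather than hand-edit it, the prefix-tightening note above could be sketched like this. It is one reading of that note, not an official template: `bakelite_policy` and the `BakeliteBucketInspect` Sid are hypothetical names, and the sketch assumes `s3:ListBucket` moves into its own statement so the condition applies to it alone.

```python
import json


def bakelite_policy(bucket: str, prefix: str = "") -> dict:
    """Build the least-privilege policy, optionally scoped to a key prefix."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"

    list_stmt = {
        "Sid": "BakeliteBucketList",
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": bucket_arn,  # bucket ARN, no /*
    }
    if prefix:
        list_stmt["Condition"] = {"StringLike": {"s3:prefix": [f"{prefix}/*"]}}

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BakeliteObjectRW",
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": object_arn,  # object ARN, narrowed to the prefix
            },
            list_stmt,
            {
                # Left bucket-wide: the note says these are filtered client-side.
                "Sid": "BakeliteBucketInspect",
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucketMultipartUploads",
                    "s3:GetBucketVersioning",
                    "s3:ListBucketVersions",
                    "s3:GetBucketObjectLockConfiguration",
                ],
                "Resource": bucket_arn,
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(bakelite_policy("my-bucket", "bakelite"), indent=2))
```

Called without a prefix, the function grants the same actions on the same resources as the policy shown earlier, split into three statements instead of two.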

S3 storage overhead a plain listing can't see

A normal object listing — and therefore the storage bakelite's limits act on — counts only the current version of each object. Two kinds of cost hide behind it on S3:

Incomplete multipart uploads. Snapshots stream to S3 as multipart uploads; a crash mid-upload can leave the parts behind — billed, but invisible to a plain listing. bakelite cleans these up: a failed upload aborts at source, and the daemon periodically sweeps any left by a hard crash (multipart_sweep_interval, touching only uploads older than multipart_min_age so an in-flight upload is never aborted). bakelite reclaim --db <name> does the same on demand (--dry-run to preview, --min-age 0s to force-abort everything — best with the daemon stopped). This needs only s3:AbortMultipartUpload / s3:ListBucketMultipartUploads.
Noncurrent versions and delete markers. On a versioned bucket, every overwrite and delete leaves a noncurrent version (or delete marker) behind that still costs money. bakelite deliberately does not touch these — it has no way to delete an object version, by design. Versioning is a disaster-recovery boundary: you enable it so your backups can't be wiped, including by bakelite or by anything that compromises its credentials. Letting bakelite expire versions would mean handing it s3:DeleteObjectVersion, the very permission that defeats the guarantee. Expiring noncurrent versions is therefore the bucket owner's job, via a server-side lifecycle policy that runs with the bucket's own authority:

{
  "Rules": [
    {
      "ID": "bakelite-expire-noncurrent",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 7 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
    }
  ]
}

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket --lifecycle-configuration file://lifecycle.json
# MinIO: mc ilm rule add --noncurrent-expire-days 7 myalias/my-bucket

Backblaze B2 models lifecycle natively rather than through the S3 rules above: each rule has daysFromHidingToDeleting (B2's NoncurrentDays equivalent — purge a version this many days after it's superseded or deleted) and daysFromUploadingToHiding (leave this null so B2 never hides a current object on you). The quickest path is the bucket's Lifecycle Settings → "Keep only the last version of the file" in the web console (that preset is exactly daysFromHidingToDeleting: 1); or set it via the B2 API / b2 CLI with a rule like:

{ "fileNamePrefix": "", "daysFromUploadingToHiding": null, "daysFromHidingToDeleting": 7 }

Without this, B2 keeps every superseded version forever — and bakelite churns the per-database manifest object frequently, so a tiny live replica can hide gigabytes of noncurrent versions. bakelite usage flags it; the rule bounds it.

A caveat for B2 keys. The "even a compromised credential can't wipe your versions" guarantee above holds on S3, where you grant bakelite s3:DeleteObject but withhold s3:DeleteObjectVersion. It does not hold on Backblaze B2: B2's deleteFiles capability — which bakelite needs for compaction and retention — also permits permanently deleting a specific version, and B2's web console only mints broad Read-Only / Read-Write keys (granular capabilities aren't exposed in the UI). So a console-created B2 key, or a host that leaks it, can delete noncurrent versions — lifecycle rule or not. Two things help:

Mint a narrower key with the b2 CLI (the UI won't): restrict it to the one bucket and the capabilities bakelite actually uses — this drops the bucket-admin and file-sharing powers a console key bundles in:
```
b2 account authorize <MASTER_KEY_ID> <MASTER_APP_KEY>
b2 key create --bucket my-bucket bakelite \
  listFiles,readFiles,writeFiles,deleteFiles,readBuckets,readBucketRetentions,readBucketLifecycleRules
```
deleteFiles still has to stay (compaction and retention delete objects), so this shrinks the blast radius but can't make versions un-deletable.
For a boundary that survives credential compromise, use Object Lock (below): the bucket enforces it server-side, regardless of what the key is allowed to do.

The AbortIncompleteMultipartUpload rule above is still worth keeping as a backstop: it's free, runs continuously even when the daemon is down, and covers hosts where bakelite's own sweep can't run — multipart and version inspection (and the daemon sweep) use static AWS env credentials (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY), so on IAM-role / instance-metadata hosts they no-op and report "not inspected".

bakelite usage --db <name> quantifies all of this — noncurrent-version bytes, delete-marker counts, and incomplete multipart uploads — so the hidden cost stays visible. With the lifecycle rule in place and bakelite's sweep running, bakelite usage should report the overhead trending to zero.

Immutable backups with Object Lock (WORM)

Immutable (WORM) backups can't be deleted, whether with stolen credentials or by bakelite itself, which protects them against accidental deletion and credential compromise. On S3 that's Object Lock: enable it on the bucket with a default retention, and every object bakelite writes is locked immutable for that window.

This works because bakelite only ever adds data objects: snapshots and change-sets are written once and never overwritten in place, and with deletion-based maintenance turned off (see below) they are never deleted either. The tiny CURRENT pointer and manifest are updated as new versions, which a versioned, locked bucket allows, so bakelite runs against a locked bucket unchanged. It also can't weaken the lock: it never deletes object versions and is never granted s3:BypassGovernanceRetention, so even a compromised daemon can't shorten the retention.

bakelite never configures Object Lock itself — set it on the bucket (it must be enabled at creation, alongside versioning):

aws s3api create-bucket --bucket my-bucket --object-lock-enabled-for-bucket #...
# A bucket-wide default retention then locks every new object:
aws s3api put-object-lock-configuration --bucket my-bucket \
  --object-lock-configuration \
  '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'

[[database.backends]]
type = "s3"
bucket = "my-bucket"
expect_object_lock = true     # bakelite doctor fails if the bucket isn't locked

[defaults]
retention = "0s"              # don't try to prune — Object Lock denies the deletes
compaction_levels = []        # don't rewrite/merge — same reason

Turn off deletion-based maintenance. Object Lock blocks deletes, so bakelite's own pruning can't run while objects are locked. If you leave retention / compaction enabled against a locked bucket, bakelite degrades gracefully rather than crashing — pruning is skipped (and logged), and compaction churns (it writes merged objects it then can't clean up). bakelite doctor warns about exactly this. Let Object Lock plus a bucket lifecycle policy govern expiry instead.

Set checksum = true (see Server-side encryption and upload integrity): an Object-Lock bucket with retention rejects any upload that doesn't carry a checksum, single-part PutObject and multipart alike, so checksum = true is required for Object Lock regardless of database size, not just for large snapshots.

expect_object_lock = true turns immutability into a checked invariant: bakelite doctor reports the lock mode and retention window and fails if the bucket isn't actually locked, so you can gate a deploy on it. Verification needs the s3:GetBucketObjectLockConfiguration permission (in the policy above) and the same static AWS env credentials as the version-overhead inspector — without either, doctor reports it as unverifiable (a warning) rather than failing, even on a locked bucket. It's checked against the primary (first) backend.

COMPLIANCE mode can't be shortened or removed by anyone — including the root account — until each object's retention expires; GOVERNANCE mode allows a privileged override. For a backup you never want weakened, COMPLIANCE is stronger, but it's irreversible: size the retention window deliberately.

Cheaper cold storage with storage classes

Most of a replica's bytes are cold: the base snapshots and the older, compacted change-sets are rarely read — restore usually only touches the most recent ones. Parking that cold bulk in a cheaper storage class can cut the storage bill substantially, while recent change-sets stay hot for fast restore.

[[database.backends]]
type = "s3"
bucket = "my-bucket"
storage_class = "STANDARD_IA"   # or GLACIER_IR, INTELLIGENT_TIERING, …

bakelite applies storage_class to the cold objects — base snapshots and compacted (level ≥ 1) segments — and leaves hot data (recent raw change-sets, the manifest, and the CURRENT pointer) in the bucket default. You don't choose per-object; bakelite routes by kind.

Instant-retrieval classes only. Restore reads objects immediately, with no thaw step, so only classes that are instantly readable are allowed:

| Provider | Allowed (instant) | Rejected (needs a thaw) |
| --- | --- | --- |
| S3 | `STANDARD_IA`, `ONEZONE_IA`, `INTELLIGENT_TIERING`, `GLACIER_IR` | `GLACIER` (Flexible), `DEEP_ARCHIVE` |
| GCS | `NEARLINE`, `COLDLINE`, `ARCHIVE` | (none) |
| Azure | `Cool`, `Cold` | `Archive` |

A thaw-requiring tier — S3 GLACIER (Flexible) / DEEP_ARCHIVE, or Azure Archive — is rejected at config load with a clear error: those need an asynchronous restore/rehydrate (minutes to hours) before an object can be read, which would break bakelite restore. The check is provider-aware and exact, so GLACIER_IR (Glacier Instant Retrieval) is allowed, and so is GCS ARCHIVE — every GCS class, Archive included, is served instantly with no rehydration. (Automated thaw-on-restore for the archival tiers is a planned future feature.)
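
The provider-aware, exact-match check can be pictured with a small lookup. This is a sketch built from the classes named in this section, not bakelite's validation code: `classify_storage_class` is a hypothetical helper, and since the text does not say how classes outside these lists are handled, the sketch simply reports them as "unlisted".

```python
# Classes named in this section, keyed by backend type.
INSTANT = {
    "s3": {"STANDARD_IA", "ONEZONE_IA", "INTELLIGENT_TIERING", "GLACIER_IR"},
    "gcs": {"NEARLINE", "COLDLINE", "ARCHIVE"},
    "azure": {"Cool", "Cold"},
}
NEEDS_THAW = {
    "s3": {"GLACIER", "DEEP_ARCHIVE"},
    "gcs": set(),
    "azure": {"Archive"},
}


def classify_storage_class(provider: str, storage_class: str) -> str:
    """Return 'instant', 'needs_thaw', or 'unlisted' for a provider/class pair."""
    if provider not in INSTANT:
        raise ValueError(f"unknown provider {provider!r}")
    # Exact set membership, so GLACIER_IR is never mistaken for GLACIER.
    if storage_class in NEEDS_THAW[provider]:
        return "needs_thaw"
    if storage_class in INSTANT[provider]:
        return "instant"
    return "unlisted"
```

The same class name gets a different answer depending on the provider: `ARCHIVE` is instant on GCS, while `Archive` needs a thaw on Azure.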

Cost nuance. Infrequent-access and cold tiers trade a lower per-GB storage price for per-request retrieval fees and minimum-storage-duration / early-deletion charges. They pay off when snapshots are large and restores are rare; pair with a longer snapshot_interval so you're not re-uploading cold full snapshots often. They're not free for hot, churny data — which is why bakelite keeps recent change-sets in the default class.

Server-side encryption and upload integrity

Two optional S3 data-protection knobs, independent of bakelite's client-side encryption:

[[database.backends]]
type = "s3"
bucket = "my-bucket"
checksum = true                 # S3 validates a SHA-256 on every upload
sse = "aws:kms"                 # server-side at-rest encryption (or "aes256")
sse_kms_key_id = "arn:aws:kms:us-east-1:123456789012:key/abcd-…"

sse turns on server-side at-rest encryption, managed by the provider: "aes256" is SSE-S3 (S3-managed keys, zero extra config); "aws:kms" is SSE-KMS with your customer-managed key (sse_kms_key_id required — and the credentials need kms:GenerateDataKey + kms:Decrypt on that key, the latter both at restore and for the multipart snapshot uploads bakelite performs). This is orthogonal to bakelite's own client-side encryption, which keeps the provider from ever seeing plaintext (stronger for confidentiality), while SSE is the familiar compliance checkbox and protects the at-rest copy if you don't encrypt client-side. Use either, both, or neither.

checksum = true has bakelite compute a SHA-256 for each upload and send it so S3 validates object integrity at write time — catching client→S3 corruption immediately, on top of bakelite's own CRC envelope (which catches corruption on read). It's also required by S3 Object Lock: a bucket with retention rejects any upload — single-part or multipart — that lacks a checksum, so set checksum = true whenever you use Object Lock, whatever the database size.

Both are S3-only; GCS and Azure encrypt at rest by default and aren't configured here.

Google Cloud Storage

[[database.backends]]
type = "gcs"
bucket = "my-gcs-bucket"
prefix = "bakelite"
# service_account_path = "/etc/bakelite/gcs-key.json"   # off-GCP: explicit key

Credentials are resolved by object_store the way the Google tooling expects: a service-account file or JSON key in the environment (GOOGLE_SERVICE_ACCOUNT / GOOGLE_SERVICE_ACCOUNT_KEY / GOOGLE_APPLICATION_CREDENTIALS), the gcloud Application Default Credentials file, or the GCE metadata server when running on Google Cloud. Off-GCP, point service_account_path at a downloaded service-account JSON key instead. The service account needs object read/write/list and delete on the bucket (roughly the roles/storage.objectAdmin role, or a custom role with storage.objects.{get,create,delete,list}).

Verified against a live GCS bucket. Beyond the shared backend conformance suite, the native GCS backend is exercised end-to-end against a real Google Cloud Storage bucket by the opt-in gcs-azure-providers.yml CI leg (the gcs_conformance entrypoint). Authenticate with a service-account key, or with Application Default Credentials (gcloud auth application-default login) for an off-GCP host.

Azure Blob Storage

[[database.backends]]
type = "azure"
account = "mystorageacct"   # storage account name
container = "backups"       # the bucket equivalent
prefix = "bakelite"

The credential is read from the environment so it never sits in the TOML: set AZURE_STORAGE_ACCOUNT_KEY (or its alias AZURE_STORAGE_ACCESS_KEY) to the account key, or supply a SAS token or the service-principal variables (AZURE_STORAGE_CLIENT_ID / AZURE_STORAGE_CLIENT_SECRET / AZURE_STORAGE_TENANT_ID) instead. bakelite loads /etc/bakelite/bakelite.env at startup (see Environment variables), so the same file that holds your other secrets works here. The credential needs blob read/write/list/delete on the container (the Storage Blob Data Contributor role, or an equivalent SAS). Only the public-cloud endpoint (<account>.blob.core.windows.net) is targeted; a custom endpoint for the Azurite emulator or sovereign clouds isn't exposed yet.

Verified against a live Azure account. Beyond the shared backend conformance suite, the native Azure backend is exercised end-to-end against a real Azure Storage account by the opt-in gcs-azure-providers.yml CI leg (the azure_conformance entrypoint). The credential is read from the environment — account key, SAS token, or service principal.

GCS and Azure share the same storage layer as S3. The S3-only extras — bakelite usage noncurrent-version reporting and the multipart-upload reclaim sweep — don't apply to them; expire old data with the provider's own lifecycle/versioning controls. See Backends for the compatibility matrix.

SFTP

Back up to any SSH host. The transport is pure Rust (russh + russh-sftp): there's no system ssh binary and no OpenSSL/libssh2 C dependency, so the single static binary keeps working.

[[database.backends]]
type = "sftp"
host = "backup.example.com"
# port = 22
user = "bakelite"
path = "/srv/backups/bakelite"          # remote base dir (relative to login dir if not absolute)
identity_file = "/etc/bakelite/.ssh/id_ed25519"
# passphrase_env = "SFTP_KEY_PASS"       # env var holding the key's passphrase, if encrypted
# password_env = "SFTP_PASSWORD"         # env var holding the password (keeps it out of the config)
# password = "..."                       # last resort: inline password (plaintext in the config)
# known_hosts = "/etc/bakelite/known_hosts"
# insecure_skip_host_key_check = false
# connect_timeout = "30s"
# request_timeout = "30s"

Authentication is by SSH key (identity_file, recommended) or password. Secrets stay out of the config: an encrypted key's passphrase and the password both come from the environment, the same way the S3/Azure credentials do — so bakelite.toml can be checked in.
- Key auth — point identity_file at the private key. If it's encrypted, supply the passphrase via passphrase_env (names the env var to read) or the fixed BAKELITE_SFTP_KEY_PASSPHRASE; a passphrase-less key needs neither.
- Password auth — set password_env to the name of an env var holding the password, or rely on the fixed BAKELITE_SFTP_PASSWORD. Put the value in bakelite.env (chmod 600) like any other secret. The inline password field still works as a last resort, but it sits in plaintext in the config — prefer the environment. (When more than one source is set, the most specific wins: password_env over inline password over BAKELITE_SFTP_PASSWORD.)
Reconnects on drop. The session is established lazily and reused; if it drops (server restart, network blip, idle timeout), the next operation transparently reconnects and retries rather than wedging until the daemon restarts. connect_timeout (handshake + auth + subsystem) and request_timeout (a single request's response) bound how long a dead link can hang — both default to 30s.
Host-key verification is on by default: the server's key is checked against known_hosts (default ~/.ssh/known_hosts). The host must already have an entry — add one with ssh-keyscan backup.example.com >> ~/.ssh/known_hosts, or connect once with ssh. If the key isn't found (or has changed), bakelite refuses to connect. Set insecure_skip_host_key_check = true to skip the check — convenient on a trusted LAN, but it disables MITM protection, so never use it over the open internet.
It's the local-filesystem backend over a network: writes go to a temp file then rename into place, and the same databases/... tree is created under path. The S3-only extras (version inspection, multipart reclaim) don't apply.
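
The password precedence described above (password_env, then inline password, then BAKELITE_SFTP_PASSWORD) can be sketched as below. `resolve_sftp_password` is a hypothetical helper, not bakelite's code, and falling through when the variable named by password_env is unset or empty is an assumption; the text only covers the case where several sources are set.

```python
import os


def resolve_sftp_password(backend: dict, env=os.environ):
    """Pick the SFTP password from the most specific configured source."""
    # 1. password_env names an environment variable holding the password.
    env_name = backend.get("password_env")
    if env_name and env.get(env_name):
        return env[env_name]
    # 2. Inline password in the config (plaintext, last resort).
    if backend.get("password"):
        return backend["password"]
    # 3. The fixed variable; None means no password source is configured.
    return env.get("BAKELITE_SFTP_PASSWORD")
```

Passing `env` explicitly keeps the function easy to exercise without touching the real process environment.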

Conformance-tested. The SFTP backend passes the same Backend-trait suite as every other backend, against a disposable atmoz/sftp server — run it with just sftp-up && just test-sftp.

Shared backend for many databases

Backing up many databases to the same destination? Configure the backend once as a top-level [[backends]] and list databases with just name + path — objects are namespaced by database name automatically. A per-database [[database.backends]] still overrides the shared one.

[[backends]]
type = "s3"
bucket = "my-bucket"
prefix = "bakelite"
endpoint = "https://s3.us-west-001.backblazeb2.com"
region = "us-west-001"

[[database]]
name = "user1"
path = "/var/lib/rnd/users/1/data.db"

[[database]]
name = "user2"
path = "/var/lib/rnd/users/2/data.db"

Environment variables

Some settings live in the environment rather than the TOML — secrets (so they don't end up in a config that gets checked in), and a few process-wide knobs that don't belong per-database. bakelite reads them itself, so they work the same way under systemd and from an interactive shell.

Where they come from

At startup, bakelite loads variables from the first available source per variable — already-set environment variables always win:

- $BAKELITE_ENV_FILE if set (explicit; missing path is a hard error).
- /etc/bakelite/bakelite.env (silently skipped if absent).
- $XDG_CONFIG_HOME/bakelite/bakelite.env (or ~/.config/bakelite/bakelite.env).
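The lookup order above can be sketched as follows. This is a rough model under stated assumptions, not bakelite's loader: the function names are hypothetical, the KEY=VALUE file format with # comments is assumed, and system_file is a parameter only so the sketch can be tried against scratch paths.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines (ASSUMED format); skip blanks and # comments."""
    pairs: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def load_env(
    env: Mapping[str, str],
    system_file: Path = Path("/etc/bakelite/bakelite.env"),
) -> Dict[str, str]:
    """Merge env files into env; the first source that defines a variable wins."""
    merged = dict(env)  # already-set variables always win
    sources = []
    explicit = env.get("BAKELITE_ENV_FILE")
    if explicit:
        if not Path(explicit).is_file():
            # An explicit file that is missing is a hard error.
            raise FileNotFoundError(f"BAKELITE_ENV_FILE not found: {explicit}")
        sources.append(Path(explicit))
    sources.append(system_file)
    xdg = env.get("XDG_CONFIG_HOME") or os.path.expanduser("~/.config")
    sources.append(Path(xdg) / "bakelite" / "bakelite.env")
    for path in sources:
        if not path.is_file():
            continue  # the two default locations are silently skipped if absent
        for key, value in parse_env_file(path.read_text()).items():
            merged.setdefault(key, value)  # never overwrite an earlier source
    return merged
```

Using setdefault per variable is what gives "first available source per variable": a later file can still contribute variables the earlier ones did not define.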

Under systemd, the EnvironmentFile=-/etc/bakelite/bakelite.env line in the unit pre-populates the daemon's environment; the in-process loader is a no-op there because everything is already set. For interactive CLI use, the loader removes the need to source the file yourself.

Use bakelite doctor to see which files were considered and which variables are set (secret values are redacted).

The variables

Variable	Used by	Purpose
`BAKELITE_CONFIG`	CLI	Path to the config TOML. Overridden by `--config`.
`BAKELITE_LOG`	CLI/daemon	Tracing filter, e.g. `info`, `bakelite=debug`.
`BAKELITE_ENV_FILE`	CLI/daemon	Explicit env file to load before all other discovery.
`AWS_ACCESS_KEY_ID`	S3 backend	Static access key.
`AWS_SECRET_ACCESS_KEY`	S3 backend	Static secret key.
`AWS_SESSION_TOKEN`	S3 backend	Temporary STS token, if you're using one.
`AWS_DEFAULT_REGION`	S3 backend	Default region; per-backend `region` in TOML wins.
`BAKELITE_AWS_USE_DEFAULT_CREDENTIAL_CHAIN`	S3 backend	Set to `1` to opt into the AWS default credential chain (IMDS/ECS/EKS). Off by default so a missing key on a non-AWS host fails fast instead of timing out.
`GOOGLE_SERVICE_ACCOUNT` / `GOOGLE_SERVICE_ACCOUNT_KEY` / `GOOGLE_APPLICATION_CREDENTIALS`	GCS backend	Service-account file path, inline JSON key, or ADC file path. Or set `service_account_path` in the TOML.
`AZURE_STORAGE_ACCOUNT_KEY` / `AZURE_STORAGE_ACCESS_KEY`	Azure backend	Storage-account key (the two names are aliases). SAS / service-principal variables (`AZURE_STORAGE_CLIENT_ID` / `_SECRET` / `_TENANT_ID`) are also honoured.
`BAKELITE_SFTP_PASSWORD`	SFTP backend	Password for password auth. Used when a backend's `password_env` is unset; a backend's own `password_env` (naming a different variable) takes precedence.
`BAKELITE_SFTP_KEY_PASSPHRASE`	SFTP backend	Passphrase for an encrypted `identity_file`. Used when a backend's `passphrase_env` is unset.
`BAKELITE_KEY`	encryption	Inline `BAKELITE-KEY-V1-…` key — overrides `[encryption]` in the config.
`BAKELITE_KEY_FILE`	encryption	Path to a key file — overrides `[encryption]`.
`BAKELITE_PASSPHRASE`	encryption	Inline passphrase — overrides `[encryption]`.
`BAKELITE_PASSPHRASE_FILE`	encryption	Path to a passphrase file — overrides `[encryption]`.

At-rest encryption

Optional. When set, every snapshot / change-set / manifest payload is encrypted with XChaCha20-Poly1305 (an AEAD cipher: authenticated encryption with associated data) before being uploaded. Object keys, the current-backup pointer, and listings stay in the clear; they carry only the information needed to locate and route objects, never database contents.

bakelite uses a symmetric key model: one key encrypts and decrypts, held by the daemon and by every CLI command (restore, verify, compact, list, usage). This protects backups against a stolen disk or an S3-bucket breach, and against inspection in transit; it does not try to protect them from a compromised daemon host (which already has the live plaintext database).

Each object is encrypted independently and bound to its identity — its database name and position in the replica travel as the AEAD's associated data, so the backend can't relocate, swap, or roll back an object without the authentication failing on read. The cipher authenticates the writer on every object; there's no separate manifest MAC to maintain.
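To make the "bound to its identity" idea concrete, here is a toy Python sketch. Python's standard library has no XChaCha20-Poly1305, so this swaps in HMAC-SHA256 as a stand-in authenticator: it provides no confidentiality and is not bakelite's object format. The associated-data layout and all function names are hypothetical; the only thing it demonstrates is that a tag computed over (identity + payload) stops verifying once the object is presented under a different identity.

```python
import hashlib
import hmac


def associated_data(database: str, position: int) -> bytes:
    """Encode an object's identity unambiguously (HYPOTHETICAL layout)."""
    name = database.encode("utf-8")
    return len(name).to_bytes(4, "big") + name + position.to_bytes(8, "big")


def seal(key: bytes, database: str, position: int, payload: bytes) -> bytes:
    """Prefix payload with a tag covering both its identity and its bytes."""
    message = associated_data(database, position) + payload
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return tag + payload


def open_sealed(key: bytes, database: str, position: int, blob: bytes) -> bytes:
    """Return the payload, or raise if it was altered or moved."""
    tag, payload = blob[:32], blob[32:]
    message = associated_data(database, position) + payload
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload
```

A real AEAD does the same binding through its associated-data input while also encrypting the payload.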

What encryption does and does not protect

Threat	Protected?
Passive read of the backend (stolen disk, bucket breach, on-the-wire snoop)	Yes — payloads are confidential; only the key decrypts them.
Active write to the backend (compromised bucket creds, MITM, malicious provider) — content	Yes (with `require_encrypted`, the default). Forged/substituted objects are rejected: plaintext is refused outright, and because each object is bound to its identity (database + position) as AEAD associated data, an attacker can't forge one, relocate it, or swap one object for another without decryption failing. `restore`/`verify` also bind every object to its recorded hash.
Active write — rollback / deletion	Partly. An attacker can't forge data, but can still roll the replica back to a genuine earlier state (by repointing `CURRENT`) or delete backups (an availability attack). `verify` warns when `CURRENT` doesn't point at the newest full backup, and `restore --timestamp <recent>` resolves by backup time independently of `CURRENT` — so a repointed pointer can't fool it (it can't recover deleted backups, though).
Compromised daemon host	Out of scope — it already holds the live plaintext database.

What encryption gives you is confidentiality plus, with require_encrypted, tamper-evidence for content. The trust anchor is the AEAD itself: only a holder of the key can produce an object that authenticates, and each object is bound to its identity (database + position), so it can't be forged, swapped, or relocated. The CRC envelope + BLAKE3 object_hash then also catch accidental corruption (bit-rot, truncation — the common failure mode). What still leaks even with encryption on: database names, object sizes/counts, and write cadence are visible in the cleartext object keys and listings (routing metadata).

require_encrypted (default true). Reads reject any object that isn't encrypted, closing the downgrade where an attacker swaps ciphertext for attacker-chosen plaintext. Set it to false only while migrating a previously-plaintext replica (see Toggling or rotating).

Transport: an endpoint beginning with http:// disables TLS — credentials (the access-key id and any session token) and all metadata then travel in the clear and signed requests are replayable. Use https:// in production; plain http:// is only for local emulators (MinIO/LocalStack on loopback).

# One-time: generate a key (mode 0600, refuses to overwrite).
bakelite keygen --output /etc/bakelite/key.txt
# Produces a BAKELITE-KEY-V1-… file.

# Apply to every database:
[defaults.encryption]
key_file = "/etc/bakelite/key.txt"

# …or per-database:
[[database]]
name = "secrets"
path = "/data/secrets.db"
  [database.encryption]
  key_file = "/etc/bakelite/secrets-only.txt"

A per-database [database.encryption] overrides the shared [defaults.encryption]. Omitting both leaves a database unencrypted.

Runtime overrides keep the secret out of the TOML on shared hosts: set BAKELITE_KEY_FILE (or BAKELITE_KEY / the passphrase variants) and the env value wins over the config. See Environment variables for the full list and how the env file is loaded.

Keep an off-host copy of the key. The key is the one thing your backups don't contain — lose it and every encrypted backup is unrecoverable, however safe the bucket is. Store a copy somewhere independent of both the database host and the backup bucket (a password manager, a separate secrets store, a sealed envelope): a key kept only on the machine being backed up dies with that machine, and one kept only in the same bucket falls to the same breach. The daemon and every CLI command (restore, verify, compact, list, usage) need it.

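Rotating the key the simple way means starting over from a fresh full backup: point key_file at a new key and let the daemon take a new snapshot. To rotate while keeping the existing point-in-time history, use bakelite reencrypt (see Rotating the key).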

Toggling or rotating

Both directions — enabling encryption on a previously-plaintext database and disabling encryption on a previously-encrypted one — work without manual intervention: edit [database.encryption] (or remove it) and restart the daemon. The two directions behave differently because they have to.

Enabling encryption on a database that already has plaintext backups needs a short migration, because the secure default require_encrypted = true refuses to read those legacy plaintext objects. Set the key and require_encrypted = false together: the wrapper then notices each on-backend object is plaintext (no ciphertext header) and passes it through unchanged on read, so legacy snapshots and segments stay readable while every new object the daemon ships is encrypted. A replica can hold a mix during this window, and restore / verify walk both. Then run bakelite reencrypt to rewrite the legacy objects under the key, and finally remove the require_encrypted = false override (back to the strict default) and restart. A fresh database (no prior plaintext) needs none of this — just set the key and leave require_encrypted at its default.

Disabling encryption by simply removing the key is the only case that can't resume in place. The daemon has no key to read the existing encrypted manifest, so it warns and bootstraps a fresh plaintext full backup. The encrypted history is left untouched on the backend (clean it up with backend tools when no longer needed) and stays restorable only by temporarily flipping the config back to the old encrypted setup. To carry the history over into plaintext instead, use bakelite reencrypt (see Decrypting back to plaintext).

Wrong-key or genuinely corrupt objects are not treated as a mode mismatch — they keep their loud failure (the wrapper only passes bytes through that plainly aren't ciphertext), so a silent re-bootstrap can't mask a real incident.
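The read-side rules from the last few paragraphs can be condensed into a small decision sketch. Everything named here is hypothetical: the magic prefix, the exception class, and the injected decrypt callable stand in for whatever the real wrapper uses.

```python
from typing import Callable

CIPHERTEXT_MAGIC = b"BKENC1"  # HYPOTHETICAL header; the real marker is not documented here


class PlaintextRefused(Exception):
    """Raised when a plaintext object is read under require_encrypted."""


def read_object(
    blob: bytes,
    decrypt: Callable[[bytes], bytes],
    require_encrypted: bool = True,
) -> bytes:
    """Apply the mode rules to one stored object."""
    if blob.startswith(CIPHERTEXT_MAGIC):
        # Looks like ciphertext: a wrong key or corruption raises from
        # decrypt and is never downgraded to a pass-through.
        return decrypt(blob[len(CIPHERTEXT_MAGIC):])
    if require_encrypted:
        # Strict default: refuse attacker-chosen (or legacy) plaintext.
        raise PlaintextRefused("object is not encrypted")
    # Migration window: legacy plaintext passes through unchanged.
    return blob
```

Note that only bytes that plainly aren't ciphertext ever reach the pass-through branch, which is what keeps a real incident loud.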

# Enable encryption on a database that already has PLAINTEXT backups:
bakelite keygen --output /etc/bakelite/key.txt
# edit /etc/bakelite/bakelite.toml, add (note the migration override):
#   [defaults.encryption]
#   key_file = "/etc/bakelite/key.txt"
#   require_encrypted = false          # temporary: lets the daemon read legacy plaintext
sudo systemctl restart bakelite
bakelite verify --db <name>            # mixed-mode replica reads cleanly
# Rewrite legacy plaintext objects under the new key (preserves PIT history):
sudo systemctl stop bakelite
sudo bakelite reencrypt --db <name>
# Then DROP the override (back to the strict default) and restart:
#   remove the `require_encrypted = false` line
sudo systemctl start bakelite

Rotating the key

To rotate to a different encryption key while preserving the full point-in-time history, run bakelite reencrypt with the prior key supplied via --old-key-file. The walker tries the configured target key first (a no-op for already-rotated objects), falls back to the prior key for the legacy ones, and re-uploads each under the target. The daemon must be stopped for the database being rewritten. The run is idempotent, so it is safe to rerun after a crash or an interrupted run.

# Rotate from key A to key B:
bakelite keygen --output /etc/bakelite/key-b.txt
# edit bakelite.toml: key_file = "/etc/bakelite/key-b.txt"
sudo systemctl stop bakelite
sudo bakelite reencrypt --db <name> --old-key-file /etc/bakelite/key-a.txt
# Rerun if interrupted — the second pass skips already-rotated objects.
sudo systemctl start bakelite
# Once the old key isn't needed for any remaining backups, archive or destroy it.

--old-key-file is repeatable, so chains like A → B → C work even if you skipped a rotation. Keys are tried in declaration order, so list your most recent prior key first.
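The key-trial order described in this section might look roughly like the sketch below. The function and the try_decrypt callable (returning None when a key doesn't authenticate the object) are hypothetical stand-ins, not bakelite's API.

```python
from typing import Callable, Optional, Sequence, Tuple


def decrypt_with_any(
    blob: bytes,
    target_key: bytes,
    old_keys: Sequence[bytes],
    try_decrypt: Callable[[bytes, bytes], Optional[bytes]],
) -> Tuple[bytes, bool]:
    """Return (plaintext, needs_rewrite) for one stored object."""
    # Target key first: an already-rotated object needs no rewrite.
    plaintext = try_decrypt(target_key, blob)
    if plaintext is not None:
        return plaintext, False
    # Then each --old-key-file, in the order it was given.
    for key in old_keys:
        plaintext = try_decrypt(key, blob)
        if plaintext is not None:
            return plaintext, True
    raise ValueError("no supplied key decrypts this object")
```

Because objects already under the target key report needs_rewrite as False, a second pass over a partly rotated replica only touches what the first pass missed.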

Decrypting back to plaintext

The reverse direction works the same way: remove [encryption] from the config, then run reencrypt with --old-key-file pointing at the current key. Every object is rewritten in the clear, so the previously encrypted history stays fully recoverable. Running the same command with --dry-run --json first shows the size of the work without touching the backend.

# edit bakelite.toml: remove the [encryption] block
sudo systemctl stop bakelite
sudo bakelite reencrypt --db <name> --old-key-file /etc/bakelite/key.txt --dry-run --json
sudo bakelite reencrypt --db <name> --old-key-file /etc/bakelite/key.txt
sudo systemctl start bakelite

bakelite reencrypt reuses the same advisory lock the daemon takes, so if you forget to stop the daemon it refuses with a clear message instead of racing the live writer.

Passphrase mode

For lower-friction setups bakelite can derive its key from a typed passphrase instead of a key file. Under the hood bakelite runs Argon2id once at startup to derive the symmetric key from the passphrase, then uses the same encryption path as key-file mode. A recovery host only needs the passphrase to restore; no key file to ship.

Set exactly one of key_file, passphrase, or passphrase_file per [encryption] block — they are mutually exclusive (setting more than one is a config error). The two passphrase forms are:

# Inline — handy for tests or throw-away setups.
[defaults.encryption]
passphrase = "correct horse battery staple"

# Or read from a file (first non-blank, non-`#`-comment line wins).
[defaults.encryption]
passphrase_file = "/etc/bakelite/passphrase.txt"

Env overrides also exist for the passphrase path: BAKELITE_PASSPHRASE=<text> or BAKELITE_PASSPHRASE_FILE=<path>.
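As a sketch of the two rules in this section (exactly one key source per [encryption] block, and "first non-blank, non-# line wins" for a passphrase file), assuming a dict-shaped block and hypothetical function names. Stripping surrounding whitespace from the passphrase line is an assumption, not documented behaviour.

```python
from typing import Mapping

KEY_SOURCES = ("key_file", "passphrase", "passphrase_file")


def selected_source(block: Mapping[str, object]) -> str:
    """Return the single key source set in an [encryption] block."""
    present = [name for name in KEY_SOURCES if name in block]
    if len(present) != 1:
        raise ValueError(
            f"set exactly one of {', '.join(KEY_SOURCES)}; found {present or 'none'}"
        )
    return present[0]


def read_passphrase(text: str) -> str:
    """Return the first non-blank line that is not a # comment."""
    for line in text.splitlines():
        stripped = line.strip()  # ASSUMPTION: surrounding whitespace is ignored
        if stripped and not stripped.startswith("#"):
            return stripped
    raise ValueError("passphrase file contains no passphrase line")
```

Other fields such as require_encrypted can sit in the same block without affecting the check, since only the three key-source names are counted.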

Trade-off. A generated key file is 256 bits of pure entropy. A typed passphrase has only as much entropy as you can remember; Argon2id makes each guess expensive but can't add entropy that isn't there. With a strong passphrase the brute-force cost stays infeasible; with a weak one ("password", a dictionary word), an attacker who steals your backups can guess it. If security really matters to you, use key_file; passphrase mode is there for the times the ergonomics matter more than the absolute strength.
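A back-of-the-envelope way to compare the two, as a hedged sketch: the helper names are made up, the 7776-word list is the standard Diceware size, and the guesses-per-second figure you plug in is purely an assumption about what Argon2id's cost allows an attacker, not a measured number for bakelite.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600


def passphrase_bits(wordlist_size: int, words: int) -> float:
    """Entropy of a passphrase of `words` words drawn uniformly from a list."""
    return words * math.log2(wordlist_size)


def years_to_exhaust(bits: float, guesses_per_second: float) -> float:
    """Time to try every candidate at an ASSUMED attacker guess rate."""
    return 2.0 ** bits / guesses_per_second / SECONDS_PER_YEAR


# Four random Diceware words versus a 256-bit generated key, at the same
# assumed rate of 1000 guesses per second.
four_words = years_to_exhaust(passphrase_bits(7776, 4), 1000)
generated_key = years_to_exhaust(256, 1000)
```

Each extra random word multiplies the search space by the wordlist size, whereas a passphrase a human picked from memory usually has far less entropy than this uniform model assumes.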

Multi-destination replication

Optional. Replicate one database to more than one backend for redundancy: local NVMe + S3, two regions, anything supported by Backends. Every write fans out to all destinations in lock-step; restore picks the first reachable one.

List more than one [[database.backends]] entry (TOML's array-of-tables syntax) to set this up:

[[database]]
name = "critical"
path = "/data/critical.db"
  [[database.backends]]
  type = "file"
  path = "/srv/replicas/critical"   # local mirror, instant restore

  [[database.backends]]
  type = "s3"
  bucket = "offsite-backups"
  prefix = "bakelite/critical"      # cloud copy, disaster recovery

The shared top level supports the same shape: a top-level [[backends]] defines a default fan-out set for every [[database]] that doesn't override it.

Guarantees, in plain terms

Bit-for-bit mirrors. Every destination holds the same object bytes for every key — encryption (when configured) runs once and the ciphertext is fanned out, so adding destinations doesn't multiply CPU cost.
Strict in-sync. A sync round either lands on every destination or fails as a whole and is retried (segment puts are idempotent — re-shipping the same index/level overwrites with the same bytes). A consistently slow destination paces the whole pipeline — the trade is a much simpler "every mirror always in step" invariant. (Add or repoint a destination after replication has begun and bakelite handles it on the next start — see Adding or moving a destination.)
Resilient reads. Restore tries destinations in order and falls through to the next on any failure — a network blip, a missing object, a wiped mirror, or a corrupt copy (see below). Listings union across destinations and dedupe, so a wiped primary doesn't hide data the secondary still holds.
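A toy model of the second and third guarantees, using in-memory destinations. The class and function names are hypothetical, and skipping an unreachable destination while building the union listing is an assumption about behaviour the text doesn't spell out.

```python
from typing import Dict, List, Set


class Destination:
    """In-memory stand-in for one backend."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, bytes] = {}
        self.online = True

    def put(self, key: str, data: bytes) -> None:
        if not self.online:
            raise ConnectionError(self.name)
        self.objects[key] = data  # idempotent: same key, same bytes

    def list(self) -> Set[str]:
        if not self.online:
            raise ConnectionError(self.name)
        return set(self.objects)


def sync_round(destinations: List[Destination], batch: Dict[str, bytes]) -> bool:
    """Ship batch everywhere; False means the caller retries the whole round."""
    try:
        for dest in destinations:
            for key, data in batch.items():
                dest.put(key, data)
    except ConnectionError:
        return False
    return True


def union_listing(destinations: List[Destination]) -> List[str]:
    """Deduplicated union of every reachable destination's keys."""
    keys: Set[str] = set()
    for dest in destinations:
        try:
            keys |= dest.list()
        except ConnectionError:
            continue  # ASSUMPTION: unreachable destinations are skipped
    return sorted(keys)
```

A failed round may leave some destinations already holding the new objects; because puts overwrite with identical bytes, simply re-running the round is safe.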

Redundancy and bit-rot recovery

Replicate to two servers and a restore survives one of them silently rotting a backup — bakelite reads the next copy automatically. That's the payoff of more than one destination, and it's on by default.

When a database has 2+ destinations, restore (and verify/list) validates every object as it reads it — the container CRC, the AEAD authentication tag under encryption, and the recorded content hash — and on any mismatch falls through to a healthy copy on another destination. A single SFTP/SSH server that bit-rots one backup object no longer fails the restore; a sibling covers for it. Both sync backends and async mirrors are in the pool of copies to fall back to.

This is safe because every destination holds byte-identical objects, so any one valid copy is authoritative. If every copy of an object is corrupt, restore fails loudly — it never hands back a silently-wrong database.
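The fall-through read can be sketched like this. It is a simplification under stated assumptions: destinations are plain dicts, only the content-hash check is modelled (not the CRC or AEAD tag), and SHA-256 from the standard library stands in for the BLAKE3 hash bakelite records. read_validated is a hypothetical name.

```python
import hashlib
from typing import Dict, List, Tuple


def read_validated(
    key: str,
    expected_hash: str,
    destinations: List[Tuple[str, Dict[str, bytes]]],
) -> Tuple[bytes, List[str]]:
    """Return (data, names of destinations whose copy was missing or corrupt)."""
    degraded: List[str] = []
    for name, objects in destinations:
        data = objects.get(key)
        # SHA-256 is a STAND-IN for the recorded content hash.
        if data is not None and hashlib.sha256(data).hexdigest() == expected_hash:
            return data, degraded
        degraded.append(name)
    # No healthy copy anywhere: fail loudly rather than return bad bytes.
    raise OSError(f"every copy of {key!r} is missing or corrupt")
```

The degraded list is what lets a restore or verify summary name the rotting destination even though the read itself succeeded.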

Two ways the corruption surfaces:

- At restore time, transparently: the restore summary notes recovered N object(s) from a healthy copy and names the rotting destination. bakelite verify reports the same, so a scheduled verify catches a degraded destination before you ever need to restore.
- Proactively, with bakelite repair: it walks every object across all destinations and rewrites a healthy copy over any that are corrupt or missing, healing the rot in place.

Set validate_on_restore = false to opt out (read from the first destination without validation). It defaults to true and adds no cost to a single-destination database — the validating reader is only built when there's a sibling to fall back to. There's no effect on the replication hot path either way.

Async mirrors (off the hot path)

A sync destination must accept every write before a sync round completes, so a consistently slow one paces the whole pipeline. Mark a destination mode = "async" instead and bakelite reconciles it in the background — live replication only waits on the sync destinations (at least one is required; the first is the primary):

[[database]]
name = "critical"
path = "/data/critical.db"
  [[database.backends]]
  type = "file"
  path = "/srv/replicas/critical"   # sync primary — instant local restore
  [[database.backends]]
  type = "s3"
  bucket = "offsite-backups"
  prefix = "bakelite/critical"
  mode = "async"                    # background mirror — never paces the hot path

A background task copies new objects from the primary to each async mirror every mirror_interval (default 60s; a per-database tunable). Because bakelite's data objects are write-once and content-addressed, this is a simple list-diff-copy that copies stored bytes verbatim — so it mirrors ciphertext under encryption without needing the key. The mirror is eventually consistent: it advances its CURRENT pointer only to a restore point it has fully copied, so it's always a valid, restorable prefix of the primary, at most ~mirror_interval behind. A slow or down async mirror is logged and retried — it never fails or slows live replication.

The mirror also propagates the primary's deletions: each reconcile pass prunes objects the primary has retired through retention or compaction. A successful full listing gates the prune, and it never removes the run its own CURRENT restores from — so the mirror tracks the primary rather than growing without bound, while staying a valid restorable prefix.
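One reconcile pass, as described in the two paragraphs above, might be modelled like this. The Store class, the "run/object" key layout, and the max_copies knob (which only simulates an interrupted pass) are all hypothetical; the real object layout isn't shown in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Store:
    """In-memory stand-in for a destination: objects plus a CURRENT pointer."""
    objects: Dict[str, bytes] = field(default_factory=dict)
    current: Optional[str] = None  # id of the run CURRENT restores from


def reconcile(primary: Store, mirror: Store, max_copies: Optional[int] = None) -> None:
    """One list-diff-copy pass from primary to mirror, then prune."""
    primary_keys = set(primary.objects)  # full listing; it gates the prune below
    missing = sorted(primary_keys - set(mirror.objects))
    for key in missing[:max_copies]:
        mirror.objects[key] = primary.objects[key]  # stored bytes, copied verbatim
    # Advance CURRENT only once everything it needs has landed.
    if primary_keys <= set(mirror.objects):
        mirror.current = primary.current
    # Prune what the primary retired, but never the mirror's own CURRENT run.
    for key in sorted(set(mirror.objects) - primary_keys):
        if mirror.current is not None and key.startswith(mirror.current + "/"):
            continue
        del mirror.objects[key]
```

If a pass is cut short, the mirror keeps pointing at its older, still-complete run, which is why it stays a valid restorable prefix throughout.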

Adding or moving a destination

If you add a destination to an already-running database — or repoint an existing one at a new, empty location — that destination starts out holding none of the existing backup chain. Because sync destinations are written in lock-step, it would otherwise receive only new change-sets from that point on, leaving it without a base snapshot underneath them: not independently restorable, and only bakelite verify would catch it.

Each time the daemon starts and resumes an existing database (not on a fresh bootstrap, which already fans out a complete base to every destination), it compares the snapshot/segment objects each sync destination holds against the combined set across all of them. A destination missing any of those data objects is incomplete, and bakelite applies on_incomplete_destination:

Value	Behaviour
`backfill` (default)	Copy the existing chain onto the incomplete destination from a destination that holds a complete copy — the same verbatim list-diff-copy the async mirror uses (ciphertext as-is, no key needed) — so it's immediately a full, independently restorable replica. If no destination holds a complete copy (they've genuinely diverged), bakelite refuses and asks you to resolve it by hand rather than copy from a partial peer.
`refuse`	Stop the database and report which destination is incomplete, rather than ship onto a partial chain. Fail-safe: nothing is written until you resolve it.
`warn`	Log a warning and keep going. The destination stays unrestorable until the next full snapshot rebases the chain.

This only applies to sync destinations; async mirrors always catch up through their own background reconcile. The check runs at daemon start on resume, so a destination that goes incomplete mid-run isn't re-detected until the next daemon restart or the next full snapshot rebases the chain. Set it under [defaults] or per-database. To move a destination cleanly without a backfill, you can instead remove the old one, let a fresh snapshot rebase, then add the new location.
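The startup check and the three policies can be sketched as set arithmetic. The function names and the returned action strings are hypothetical; only the comparison against the combined set and the policy names come from the text above.

```python
from typing import Dict, Set


def incomplete_destinations(holdings: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each incomplete sync destination to the data objects it lacks."""
    combined: Set[str] = set().union(*holdings.values())
    return {
        name: combined - keys
        for name, keys in holdings.items()
        if combined - keys
    }


def decide(holdings: Dict[str, Set[str]], policy: str = "backfill") -> str:
    """Return the action taken under on_incomplete_destination."""
    if not incomplete_destinations(holdings):
        return "ok"
    if policy == "refuse":
        return "stop"
    if policy == "warn":
        return "warn"
    # backfill: needs at least one destination holding the complete set.
    combined: Set[str] = set().union(*holdings.values())
    complete = [name for name, keys in holdings.items() if keys == combined]
    if not complete:
        return "stop"  # genuinely diverged: never copy from a partial peer
    return f"backfill from {complete[0]}"
```

For example, a freshly added destination holding only the newest change-set is flagged as missing the base snapshot beneath it, and backfill proceeds from whichever peer holds everything.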

Limitations

The metrics endpoint and bakelite usage report once per database, against the primary destination's view. (Under strict in-sync the sync destinations agree; the limitation matters only after divergence.) bakelite status additionally shows a line per async mirror with its lag (objects/bytes behind) and last reconcile time.
The single replication cursor means a slow sync destination pulls the fast ones down with it; mark such a destination mode = "async" (above) to take it off the hot path. Per-destination sync cursors with independent backoff are not part of v1.