Backends
bakelite writes replicas to a pluggable storage backend. Five families exist today:
- Local filesystem — a directory (often a mounted volume). Simple and fully supported.
- S3 / S3-compatible — built on the `object_store` crate, so the same implementation serves AWS S3, Cloudflare R2, Backblaze B2, and MinIO. See Configuration → S3-compatible for per-provider config (endpoint, region, path-style) and the least-privilege IAM policy to attach. Supports immutable backups via Object Lock (WORM) — bakelite is append-only, so it runs against a locked bucket unchanged — and cheaper cold storage via storage classes (the cold bulk to an instant-retrieval tier, hot change-sets in the default).
- Google Cloud Storage — the same `object_store` layer. Credentials resolve from a service-account key or Application Default Credentials. See Configuration → Google Cloud Storage.
- Azure Blob Storage — likewise. Account + container in the config; the credential comes from the environment. See Configuration → Azure Blob Storage.
- SFTP (over SSH) — copy to any SSH host. The transport is pure Rust (`russh` + `russh-sftp`), so there's no system `ssh` binary and no OpenSSL/libssh2 C dependency — the single static binary keeps working. Authenticate by SSH key (recommended) or password; the server's host key is verified against `known_hosts`. It's the local-filesystem backend over a network (no object-store version inspection or multipart, which are S3-only concepts). See Configuration → SFTP.
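To make "pluggable" concrete, here is a minimal sketch of how one trait can sit in front of several storage families and how a single check can then run against any of them. It is not bakelite's actual `Backend` trait: `SketchBackend`, its three methods, `MemoryBackend`, and `check_round_trip` are all hypothetical names chosen for illustration, and the real trait's method set is not shown in this document.

```rust
use std::collections::BTreeMap;
use std::io;

/// Hypothetical stand-in for a storage backend trait (NOT bakelite's real `Backend`).
trait SketchBackend {
    fn put(&mut self, key: &str, bytes: &[u8]) -> io::Result<()>;
    fn get(&self, key: &str) -> io::Result<Vec<u8>>;
    fn list(&self, prefix: &str) -> io::Result<Vec<String>>;
}

/// In-memory implementation, useful as the fastest target for shared tests.
#[derive(Default)]
struct MemoryBackend {
    objects: BTreeMap<String, Vec<u8>>,
}

impl SketchBackend for MemoryBackend {
    fn put(&mut self, key: &str, bytes: &[u8]) -> io::Result<()> {
        self.objects.insert(key.to_string(), bytes.to_vec());
        Ok(())
    }

    fn get(&self, key: &str) -> io::Result<Vec<u8>> {
        self.objects
            .get(key)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, key.to_string()))
    }

    fn list(&self, prefix: &str) -> io::Result<Vec<String>> {
        // BTreeMap iterates in key order, so the listing is sorted.
        Ok(self
            .objects
            .keys()
            .filter(|k| k.starts_with(prefix))
            .cloned()
            .collect())
    }
}

/// One check written against the trait, so every implementation
/// is held to the same assertions.
fn check_round_trip<B: SketchBackend>(backend: &mut B) -> io::Result<()> {
    backend.put("replica/0001", b"change-set")?;
    assert_eq!(backend.get("replica/0001")?, b"change-set".to_vec());
    assert_eq!(backend.list("replica/")?, vec!["replica/0001".to_string()]);
    Ok(())
}
```

Calling `check_round_trip(&mut MemoryBackend::default())` exercises the in-memory target; a filesystem, SFTP, or object-store implementation of the same trait would be passed to the same function unchanged.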
GCS and Azure use the identical `ObjectStoreBackend` mapping that the S3 backend
uses, so they pass the same `Backend`-trait conformance tests that the in-memory
and S3 suites run. The version-overhead inspection (the noncurrent-version
reporting in `bakelite usage`) and multipart reclaim are S3-only: on GCS/Azure,
as on R2, `bakelite usage` reports that overhead as "not inspected". A dedicated
live-target CI leg (`gcs-azure-providers.yml`) runs the native GCS and Azure
backends against real accounts, mirroring `s3-providers.yml`.
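The "not inspected" outcome can be modelled as an ordinary value rather than an error. The sketch below shows one way to do that; the enum, the `classify` function, and the idea of driving it from an HTTP status code are assumptions made for illustration, not bakelite's internal types.

```rust
/// Outcome of asking a bucket about hidden (noncurrent) versions.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum VersionOverhead {
    /// The versioning APIs answered; bytes held by noncurrent versions.
    Inspected { noncurrent_bytes: u64 },
    /// The provider does not expose the versioning APIs.
    NotInspected,
}

/// Map the result of a (hypothetical) versioning probe to a report value.
fn classify(status: u16, noncurrent_bytes: u64) -> Result<VersionOverhead, String> {
    match status {
        200 => Ok(VersionOverhead::Inspected { noncurrent_bytes }),
        // A 403 on the versioning calls means "cannot look", not "backup failed".
        403 => Ok(VersionOverhead::NotInspected),
        other => Err(format!("unexpected status {other} from versioning probe")),
    }
}

/// Human-readable form of the overhead for a usage report.
fn render(overhead: &VersionOverhead) -> String {
    match overhead {
        VersionOverhead::Inspected { noncurrent_bytes } => format!("{noncurrent_bytes} bytes"),
        VersionOverhead::NotInspected => "not inspected".to_string(),
    }
}
```

Keeping `NotInspected` as a normal variant lets replication and restore proceed while the report stays honest about what it could not measure. One caveat of this simplified mapping: on a provider that does support versioning, a 403 may instead indicate a missing permission, which a real implementation would probably want to distinguish.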
Why a compatibility matrix
"S3-compatible" is a spectrum. Services agree on the core object API but diverge at
the edges — versioning semantics, multipart-upload listing, lifecycle behavior.
We learned this the hard way: MinIO's ListMultipartUploads silently ignores the
prefix parameter (real AWS honours it), which would have made multipart reclaim
miss orphans on MinIO had we relied on server-side filtering. Emulators don't
always match real-service behavior (the MinIO prefix handling above is one
example), so bakelite is tested against the real services too, not just emulators.
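The defensive pattern this implies is to list uploads bucket-wide and apply the prefix check on the client, so the result is the same whether or not the server honoured `prefix`. A minimal sketch follows, combined with the age gate used before aborting; `PendingUpload`, its fields, and `reclaim_candidates` are hypothetical names, not bakelite's actual types.

```rust
use std::time::Duration;

/// A pending multipart upload as returned by a bucket-wide listing.
/// Field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct PendingUpload {
    key: String,
    upload_id: String,
    age: Duration,
}

/// Select uploads that are candidates for abort: under `prefix` and at least
/// `min_age` old. The prefix is checked here, on the client, so the outcome
/// does not depend on server-side filtering.
fn reclaim_candidates<'a>(
    listing: &'a [PendingUpload],
    prefix: &str,
    min_age: Duration,
) -> Vec<&'a PendingUpload> {
    listing
        .iter()
        .filter(|u| u.key.starts_with(prefix))
        // Age gate: leave recent uploads alone, they may still be in flight.
        .filter(|u| u.age >= min_age)
        .collect()
}
```

Given a listing that mixes keys from several prefixes (which is what a server that ignores `prefix` returns), only the old uploads under the requested prefix are selected.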
How it's tested
One parameterized conformance suite (`s3_conformance` in
`crates/bakelite-core/tests/backend_conformance.rs`) runs against every target by
reading the `BAKELITE_TEST_S3_*` environment variables. It exercises four layers:
- the full `Backend` trait contract (incl. the streaming snapshot/multipart path);
- S3 inspection — versioning + version-overhead reporting;
- reclaim end-to-end — create a dangling multipart upload, list it, age-gate it, abort it, and check it's removed (the regression guard for the prefix divergence);
- a capability probe that records raw provider behavior (it never asserts) and emits `BAKELITE_CAPABILITY_JSON {…}`, the source for the table below.

Where the suite runs:
- Emulators run on every PR (MinIO + LocalStack) via the `s3-emulator` CI job.
- Real providers run opt-in (AWS, R2, B2) via the secret-gated `s3-providers.yml` workflow (`workflow_dispatch` + weekly), green-skipping when a provider's credentials aren't configured.
- Native GCS + Azure run opt-in via the secret-gated `gcs-azure-providers.yml` workflow (same triggers), through the `gcs_conformance` / `azure_conformance` entrypoints. These exercise layer 1 (the trait contract) against a live account; the S3-only inspection/reclaim/capability layers don't apply.
- Locally: `just minio-up && just test-s3`, `just localstack-up && just test-localstack`, or `just test-provider <label>` against a real bucket (export its `AWS_*` + `BAKELITE_TEST_S3_{ENDPOINT,BUCKET,REGION,PATH_STYLE}` first).
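Driving one suite from environment variables, and skipping cleanly when they are absent, can be sketched as below. The four variable names come from the text above; everything else is an assumption of this sketch: the `S3Target` struct, treating only the bucket as required, the `us-east-1` fallback, and the accepted spellings of `PATH_STYLE`.

```rust
/// Connection details for one conformance target (hypothetical struct).
#[derive(Debug, PartialEq)]
struct S3Target {
    endpoint: Option<String>,
    bucket: String,
    region: String,
    path_style: bool,
}

/// Build a target from an environment lookup, or return `None` so the caller
/// can skip the run instead of failing it.
fn target_from_env(get: impl Fn(&str) -> Option<String>) -> Option<S3Target> {
    // Assumption: the bucket is the one mandatory setting.
    let bucket = get("BAKELITE_TEST_S3_BUCKET")?;
    Some(S3Target {
        // No endpoint means "use the provider's default endpoint".
        endpoint: get("BAKELITE_TEST_S3_ENDPOINT"),
        bucket,
        // Assumption: fall back to a default region when none is given.
        region: get("BAKELITE_TEST_S3_REGION").unwrap_or_else(|| "us-east-1".to_string()),
        // Assumption: "1" or "true" switches path-style addressing on.
        path_style: matches!(
            get("BAKELITE_TEST_S3_PATH_STYLE").as_deref(),
            Some("1") | Some("true")
        ),
    })
}
```

In a test, `target_from_env(|k| std::env::var(k).ok())` reads the real environment; taking the lookup as a parameter keeps the function itself testable without mutating process-wide state.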
Compatibility matrix
| Provider | Trait conformance | Versioning inspection | Multipart | ListMultipartUploads prefix honoured | Noncurrent versions | Delete markers |
|---|---|---|---|---|---|---|
| Local filesystem | ✅ | — | — | — | — | — |
| SFTP | ✅⁴ | — | — | — | — | — |
| AWS S3 | ✅ | ✅ enabled | ✅ | ✅ | ✅ | ✅ |
| Cloudflare R2 | ✅ | ❌ (403)¹ | ✅ | ✅ | —¹ | —¹ |
| Backblaze B2 | ✅ | ✅ enabled | ✅ | ✅ | ✅ | ✅ |
| MinIO | ✅ | ✅ enabled | ✅ | ❌² | ✅ | ✅ |
| LocalStack (3.x) | ✅ | ✅ (off by default) | ✅ | ✅ | — | — |
| Google Cloud Storage | ✅³ | —³ | ✅ | — | —³ | —³ |
| Azure Blob | ✅³ | —³ | ✅ | — | —³ | —³ |
Measured by the suite's capability probe — emulator rows in PR CI, real-provider
rows from s3-providers.yml; the GCS/Azure trait-conformance column comes from
gcs-azure-providers.yml.
Notes:
1. Cloudflare R2 answers `GetBucketVersioning` / `ListObjectVersions` with 403 — it doesn't expose the object-versioning APIs. bakelite tolerates this: `bakelite usage` reports version overhead as "not inspected" on R2, and replication, restore, and multipart reclaim all work normally (R2 does support multipart, and honours the prefix). It only means R2's hidden-version overhead can't be reported.
2. MinIO silently ignores the `prefix` on `ListMultipartUploads` and returns every upload. bakelite never relies on server-side prefix filtering (it lists whole-bucket and filters client-side), so reclaim is correct here regardless — this column documents the divergence that drove that design. Every other tested service honours the prefix.
3. Google Cloud Storage / Azure Blob use the same `object_store`-backed `ObjectStoreBackend` as S3. Their trait conformance (✅³) is now exercised against a live account by the dedicated `gcs-azure-providers.yml` CI leg (`gcs_conformance` / `azure_conformance`), on top of the shared in-memory and S3 suites. The S3-only version-overhead inspection and multipart reclaim don't apply (—³); large snapshots still stream as multipart via `object_store` (GCS resumable uploads / Azure block blobs). Expire old data with the provider's own lifecycle/object-versioning controls. See the per-provider setup notes under Configuration → Google Cloud Storage and Azure Blob Storage.
4. SFTP runs the same `Backend`-trait conformance suite the other backends do, against a disposable `atmoz/sftp` server — locally via `just sftp-up && just test-sftp`. Being a plain remote filesystem, the S3-only columns (versioning, multipart) don't apply.
Noncurrent versions and delete markers are only observable on a bucket with versioning enabled. bakelite never deletes object versions: they are a separate disaster-recovery layer that it leaves alone. Expire noncurrent versions with a bucket lifecycle policy. See S3 storage overhead.
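Because the capability probe emits its findings on a `BAKELITE_CAPABILITY_JSON` marker line, captured test output can be scanned for those lines to refresh the matrix. The sketch below is one way to pull the payloads out; it assumes one payload per line, makes no claim about the JSON's fields, and `capability_payloads` is a hypothetical helper rather than part of bakelite.

```rust
/// Marker the capability probe prints before its JSON payload.
const MARKER: &str = "BAKELITE_CAPABILITY_JSON ";

/// Return the raw JSON payload of every marker line in captured test output.
/// The payload is handed on unparsed, so no schema is assumed here.
fn capability_payloads(output: &str) -> Vec<&str> {
    output
        .lines()
        // Keep only the text that follows the marker, wherever it sits on the line.
        .filter_map(|line| line.find(MARKER).map(|at| line[at + MARKER.len()..].trim()))
        // Ignore marker lines that do not carry an object payload.
        .filter(|payload| payload.starts_with('{'))
        .collect()
}
```

Each returned string can then be passed to a JSON parser of choice to fill one row of the table.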