Caching
The gateway caches each proxied response the first time it is downloaded; later requests from any device are served from the local cache. Entries are grouped into tiers by how mutable the content is, and each tier has its own TTL:
| Tier | TTL | What it covers |
|---|---|---|
| Immutable | 7 days | Docker blobs/manifests by sha256, manifests by git-SHA tag |
| Mutable | 1 hour | Model weights, RFDM packages, Docker manifests by named tag |
| Scripts | 5 min | /scripts/* |
| Volatile | 60 s | /v2/_catalog |
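As a rough illustration of how a request path could map onto these tiers, here is a minimal sketch. The TTLs come from the table; the URL shapes (Docker Registry v2 style), the rule that a git-SHA tag is a 7 to 40 character hex string, and the names `classify` and `ttl_seconds` are assumptions made for the example, not the gateway's actual routing code.

```python
import re

# TTLs in seconds, taken from the tier table.
TIER_TTL = {
    "immutable": 7 * 24 * 3600,
    "mutable": 3600,
    "scripts": 5 * 60,
    "volatile": 60,
}

# Assumed URL shapes (Docker Registry v2 style); the real routes may differ.
_BLOB = re.compile(r"^/v2/.+/blobs/sha256:[0-9a-f]{64}$")
_MANIFEST = re.compile(r"^/v2/.+/manifests/(?P<ref>[^/]+)$")
_DIGEST = re.compile(r"^sha256:[0-9a-f]{64}$")
# Assumption: a "git-SHA tag" is a 7 to 40 character lowercase hex string.
_GIT_SHA = re.compile(r"^[0-9a-f]{7,40}$")


def classify(path: str) -> str:
    """Return the cache tier name for a request path (hypothetical helper)."""
    if path == "/v2/_catalog":
        return "volatile"
    if path.startswith("/scripts/"):
        return "scripts"
    if _BLOB.match(path):
        return "immutable"
    manifest = _MANIFEST.match(path)
    if manifest:
        ref = manifest["ref"]
        if _DIGEST.match(ref) or _GIT_SHA.match(ref):
            return "immutable"
    # Everything else (model weights, RFDM packages, named tags) is mutable.
    return "mutable"


def ttl_seconds(path: str) -> int:
    """Look up the TTL of the tier a path falls into."""
    return TIER_TTL[classify(path)]
```

Under these assumptions a named tag that happens to be all hex (for example `deadbeef`) would be treated as immutable, so a real implementation would need a stricter rule for recognizing git-SHA tags.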
Backends
Two backends are available. Disk (the default) stores entries under CACHE_DIRECTORY and evicts by size once CACHE_MAX_SIZE_GB is reached. S3, enabled by setting CACHE_S3_BUCKET, lets multiple replicas share one cache and keeps pods stateless. Any S3-compatible store works: AWS S3, MinIO, Ceph RadosGW, Google Cloud Storage (interop), or Cloudflare R2. Encryption at rest is delegated to the storage system.
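The selection rule described above can be sketched as follows. The three environment variable names come from this section; everything else is an assumption for illustration: the `BackendConfig` type, the fallback directory and size, reading GB as 1024³ bytes, and evicting least recently accessed entries first (the section only says eviction is size-based).

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional, Tuple


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical container for the chosen backend's settings."""
    kind: str  # "disk" or "s3"
    directory: Optional[str] = None
    max_size_bytes: Optional[int] = None
    bucket: Optional[str] = None


def select_backend(env: Mapping[str, str]) -> BackendConfig:
    """Pick S3 when CACHE_S3_BUCKET is set, otherwise fall back to disk."""
    bucket = env.get("CACHE_S3_BUCKET", "").strip()
    if bucket:
        return BackendConfig(kind="s3", bucket=bucket)
    # Fallback values are placeholders, not the gateway's real defaults.
    directory = env.get("CACHE_DIRECTORY", "/var/cache/gateway")
    max_gb = float(env.get("CACHE_MAX_SIZE_GB", "10"))
    # Assumption: 1 GB is treated as 1024**3 bytes.
    return BackendConfig(
        kind="disk",
        directory=directory,
        max_size_bytes=int(max_gb * 1024**3),
    )


def plan_eviction(
    entries: List[Tuple[str, int, float]], max_size_bytes: int
) -> List[str]:
    """Return keys to evict so the total size fits under the limit.

    Each entry is (key, size_bytes, last_access_timestamp). The eviction
    order (oldest access first) is an assumption of this sketch.
    """
    total = sum(size for _, size, _ in entries)
    evicted = []
    for key, size, _ in sorted(entries, key=lambda entry: entry[2]):
        if total <= max_size_bytes:
            break
        evicted.append(key)
        total -= size
    return evicted
```

In a real deployment `select_backend(os.environ)` would be the natural call; passing a plain mapping keeps the sketch easy to test.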
Cache control headers
Clients can send X-Cache-Bypass: true to skip the cache, or Cache-Control: no-cache to force revalidation. Every response carries X-Cache: HIT|MISS|REVALIDATED|BYPASS.
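A client-side sketch of these headers, using only the Python standard library. The header names and the four X-Cache values are taken from the text above; the helper names (`cache_request_headers`, `cache_status`, `fetch`) are hypothetical, and the URL is whatever gateway address applies in your deployment.

```python
import urllib.request
from typing import Dict, Mapping, Tuple

# The four values the gateway is documented to return in X-Cache.
VALID_STATUSES = {"HIT", "MISS", "REVALIDATED", "BYPASS"}


def cache_request_headers(bypass: bool = False, revalidate: bool = False) -> Dict[str, str]:
    """Build the optional request headers that influence caching."""
    headers: Dict[str, str] = {}
    if bypass:
        headers["X-Cache-Bypass"] = "true"  # skip the cache entirely
    if revalidate:
        headers["Cache-Control"] = "no-cache"  # force revalidation
    return headers


def cache_status(response_headers: Mapping[str, str]) -> str:
    """Read and validate the X-Cache response header."""
    value = (response_headers.get("X-Cache") or "").strip().upper()
    if value not in VALID_STATUSES:
        raise ValueError(f"unexpected X-Cache value: {value!r}")
    return value


def fetch(url: str, bypass: bool = False, revalidate: bool = False) -> Tuple[bytes, str]:
    """Download a URL through the gateway and report how it was served."""
    request = urllib.request.Request(
        url, headers=cache_request_headers(bypass, revalidate)
    )
    with urllib.request.urlopen(request) as response:
        return response.read(), cache_status(response.headers)
```

Calling `fetch(url, bypass=True)` would be expected to come back with a `BYPASS` status, which makes the pair of helpers a quick way to check how a given response was served.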
Cache management API
The /_cache/* endpoints require Authorization: Bearer <CACHE_ADMIN_TOKEN>, unless the gateway is running with DEBUG=true:
GET /_cache/stats # cache statistics
GET /_cache/entries # list (filters: ?tier= ?prefix= ?since= ?before=)
DELETE /_cache/entries # purge matching entries
DELETE /_cache/entries/{key} # delete a single entry
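The endpoints above could be called from a script along these lines. This is a sketch under assumptions: the base URL is a placeholder, the helper names are hypothetical, the filter values are passed through unchanged because their format (for example the timestamp syntax of `since`) is not specified here, the purge endpoint is assumed to accept the same filters as the listing, and percent-encoding the entry key is a guess at how `{key}` should be escaped.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Filter names listed for GET /_cache/entries.
ALLOWED_FILTERS = ("tier", "prefix", "since", "before")


def cache_admin_request(
    base_url: str,
    token: str,
    method: str,
    path: str = "/_cache/entries",
    **filters: Optional[str],
) -> urllib.request.Request:
    """Build an authenticated request for a /_cache/* endpoint."""
    unknown = set(filters) - set(ALLOWED_FILTERS)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    query = urllib.parse.urlencode(
        {name: value for name, value in filters.items() if value is not None}
    )
    url = base_url.rstrip("/") + path + ("?" + query if query else "")
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


def entry_path(key: str) -> str:
    """Path for a single entry; percent-encoding the key is an assumption."""
    return "/_cache/entries/" + urllib.parse.quote(key, safe="")
```

For example, `cache_admin_request(base, token, "DELETE", tier="scripts")` builds a purge of the Scripts tier, and `cache_admin_request(base, token, "GET", path="/_cache/stats")` builds a statistics query; either request can then be sent with `urllib.request.urlopen`.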