Backlog: R2 garbage collection / orphan sweeper
TL;DR
Files uploaded to R2 are never deleted automatically, so the storage bill keeps growing without bound. Today mg deletes files manually. We need a scheduled sweeper that removes orphaned objects (DB record gone) and expired materials (expiration_date in the past).
Description
- Define orphan = R2 object whose key is not referenced by any t_files row.
- Define expired = t_files row whose expiration_date is earlier than now() minus a grace window (e.g. 7 days).
- Daily job (either separate from cleanup-expired or an extension of it) that:
  - Lists R2 objects with pagination.
  - Diffs the listed keys against the DB (t_files).
  - Reports orphans only at first (dry run for at least 1 week), then enables actual deletion.
- Track storage size over time so the client can be billed accurately on a pass-through basis.
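
The steps above could be sketched roughly as follows. This is a minimal illustration, not the agreed design: the names `R2Object`, `PageFetcher`, `list_all`, `is_expired`, `sweep` and `SweepReport` are all hypothetical, the 7-day grace window is the example value from the description, and both the page fetcher and the delete callback are injected so the sketch runs without any R2 client. In a real job the fetcher would presumably wrap a paginated list call on R2's S3-compatible API, and `referenced_keys` would come from a query on t_files.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterator, List, Optional, Set, Tuple

# Assumption: grace window taken from the "e.g. 7 days" in the description.
GRACE = timedelta(days=7)


@dataclass(frozen=True)
class R2Object:
    key: str
    size: int  # bytes


# Hypothetical fetcher contract: takes a continuation token (None for the
# first page) and returns (objects, next_token). next_token is None on the
# last page.
PageFetcher = Callable[[Optional[str]], Tuple[List[R2Object], Optional[str]]]


def list_all(fetch_page: PageFetcher) -> Iterator[R2Object]:
    """Walk every page of the bucket listing."""
    token: Optional[str] = None
    while True:
        objects, token = fetch_page(token)
        yield from objects
        if token is None:
            return


def is_expired(expiration_date: Optional[datetime], now: datetime,
               grace: timedelta = GRACE) -> bool:
    """Expired = expiration_date is set and is older than now minus grace."""
    return expiration_date is not None and expiration_date + grace < now


@dataclass
class SweepReport:
    orphans: List[str] = field(default_factory=list)
    total_bytes: int = 0   # storage size snapshot, for pass-through billing
    orphan_bytes: int = 0


def sweep(fetch_page: PageFetcher, referenced_keys: Set[str],
          delete: Callable[[str], None], dry_run: bool = True) -> SweepReport:
    """Diff the bucket listing against DB keys; delete only if not dry_run."""
    report = SweepReport()
    for obj in list_all(fetch_page):
        report.total_bytes += obj.size
        if obj.key not in referenced_keys:
            report.orphans.append(obj.key)
            report.orphan_bytes += obj.size
            if not dry_run:
                delete(obj.key)
    return report


if __name__ == "__main__":
    # Two fake pages standing in for a paginated R2 listing.
    pages = {
        None: ([R2Object("a.pdf", 10), R2Object("b.pdf", 20)], "page2"),
        "page2": ([R2Object("c.pdf", 30)], None),
    }
    report = sweep(lambda tok: pages[tok], {"a.pdf", "c.pdf"}, delete=print)
    print(report.orphans, report.total_bytes, report.orphan_bytes)
    print(is_expired(datetime(2024, 1, 1, tzinfo=timezone.utc),
                     datetime(2024, 1, 10, tzinfo=timezone.utc)))
```

`dry_run` defaults to `True` so that the report-only week is the safe default and deletion has to be switched on explicitly. Recording `total_bytes` from each daily run would give the storage-over-time series needed for billing.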
Acceptance Criteria
Priority
- Priority: p1
- Rationale: cost grows without bound; not user-facing yet.
Dependencies
- Related: p0 — cron reminders (same scheduler infra)
Links