Tideglass / release / sign-off

payments-api v3.2 cutover — go / no-go

Eleven calls the launch crew raised before flipping the new payments service live. Each is pre-set to the recommended answer. Flip the ones you disagree with, then export.

How this works: every card starts on Accept (the recommended call, in green). Flip a card to Change and type your override in the note. Hit Export sign-off for a plain-text digest of every decision you can paste into the release ticket. Choices save in this browser, so a reload keeps your edits.
A

Rollout shape

a1How do we ramp traffic onto v3.2?
Recommend: Canary at 5% for 30 minutes, then 25 / 50 / 100 with a 15-minute hold at each step. A held step lets a regression surface before it reaches everyone.
Alternative: cut 100% over at once during the low-traffic window.
a2What auto-rolls-back the cutover?
Recommend: Error rate above 0.5% or p99 latency above 800ms sustained for 3 minutes, evaluated per ramp step. Either trip reverts the canary weight to zero with no human in the loop.
a3How long does v3.1 stay warm after 100%?
Recommend: Keep v3.1 deployed and drainable for 72 hours so a revert is a weight change, not a redeploy.
B

Data & migrations

b1When does the ledger_entries column migration run?
Recommend: Run the additive migration the day before, with both old and new columns populated by a dual-write. The cutover reads the new column; nothing blocks on a live ALTER TABLE.
Alternative: migrate inline during the maintenance window.
b2How is the dual-write backfill verified before go?
Recommend: A reconciliation job compares old-column and new-column checksums per account; go is blocked until the mismatch count is zero across two consecutive runs.
b3Do in-flight payment requests carry their idempotency key across the cutover?
Recommend: Yes — both versions read the same idempotency store, so a retry that lands on v3.2 after starting on v3.1 is de-duplicated rather than double-charged.
C

Window & comms

c1When do we run the cutover?
Recommend: Tuesday 02:00–04:00 UTC — the weekly traffic trough, far from the Friday settlement run and with the on-call team rested.
c2Do customers get a maintenance notice?
Recommend: No banner — the ramp is zero-downtime, so a notice would imply an outage that is not expected. Status page stays green unless a rollback fires.
Alternative: post a low-key "scheduled maintenance" window anyway.
c3Where does the launch crew report progress?
Recommend: One dedicated incident channel with a step-by-step checklist pinned; the release lead posts at each ramp gate so a watching exec sees progress without asking.
D

Ownership

d1Who holds the final go / no-go call at each gate?
Recommend: The release lead, with payments-eng and finance each holding a hard veto. Any one veto stops the ramp.
d2Who is paged if a rollback fires mid-ramp, and what is the secondary path?
Needs your input — no safe default. Name the primary on-call and the secondary escalation for the cutover window. Pre-set to Change.
0 changed / 0 decisions