GitHub Merge Queue Expiry Extension Reapproval: Decision Flow for Prolonged Rollback Incidents (2026)
During severe rollback incidents, teams often set a short emergency bypass expiry, then hit a second decision point when the window is about to end. This is where process quality drops: some teams extend by habit, some restore too early, and others forget to record who approved what.
This guide gives a reapproval decision flow for expiry extension requests so you can protect auditability while still restoring service quickly under pressure.
1. Why extension decisions fail
Initial bypass approvals are usually documented. Extension approvals are where incidents drift. People assume the original approval still applies, even though the risk profile has changed.
| Failure mode | What it looks like | Risk |
|---|---|---|
| Silent extension | Team keeps bypass active without a fresh PR comment | No auditable approval trail |
| Chat-only approval | Decision made in Slack but not mirrored to PR timeline | Split source of truth |
| Open-ended extension | "Until stable" without a UTC cutoff | Long-lived policy drift |
| No restoration owner | Everyone assumes someone else will restore checks | Controls remain weakened after incident |
2. Extension decision flow (T-10 to T+5)
Use this timeline around the current expiry timestamp:
| Time window | Action | Owner |
|---|---|---|
| T-10 min | Post extension precheck: current impact, remaining rollback work, validation status | Requester |
| T-7 min | Decide path: restore baseline now, extend with reapproval, or escalate to incident lead | Incident commander (IC) + service owner |
| T-5 min | If extending, post full extension evidence block with new UTC expiry | Requester |
| T-3 min | Collect dual reapproval in PR thread | Two approvers |
| T0 | Old approval expires automatically; new approval must already exist | All responders |
| T+5 min | Confirm controls still bounded and restoration plan is unchanged | Restoration owner |
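The timeline above can be turned into concrete UTC checkpoints once the current expiry is known. The following is a minimal sketch, not a prescribed tool: the function names `parse_utc` and `checkpoint_schedule` are hypothetical, and the offsets are copied from the table.

```python
from datetime import datetime, timedelta, timezone

# Offsets in minutes relative to the current expiry (T0), copied from the table.
CHECKPOINTS = [
    (-10, "Post extension precheck", "Requester"),
    (-7, "Decide path: restore, extend, or escalate", "IC + service owner"),
    (-5, "Post extension evidence block with new UTC expiry", "Requester"),
    (-3, "Collect dual reapproval in PR thread", "Two approvers"),
    (0, "Old approval expires; new approval must already exist", "All responders"),
    (5, "Confirm controls bounded and restoration plan unchanged", "Restoration owner"),
]


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp like 2026-02-17T00:30:00Z into an aware UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def checkpoint_schedule(expiry_utc: str):
    """Return (when, action, owner) tuples for each checkpoint around the expiry."""
    t0 = parse_utc(expiry_utc)
    return [
        (t0 + timedelta(minutes=offset), action, owner)
        for offset, action, owner in CHECKPOINTS
    ]


if __name__ == "__main__":
    for when, action, owner in checkpoint_schedule("2026-02-17T00:30:00Z"):
        print(f"{when:%H:%M}Z  {owner:<20}  {action}")
```

Posting the computed times in the incident channel at the start of a bypass window gives every responder the same clock, which helps avoid the "silent extension" failure mode.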
3. Mandatory evidence for extension requests
Never post "need 30 more min" without context. Use this exact field set:
- Previous expiry (UTC): the active cutoff that is about to end.
- Why work is incomplete: concrete blocker (queue delay, validation lag, infra instability).
- Current production state: impact trend and risk if rollback remains partial.
- Updated validation evidence: latest run links and signal quality.
- Compensating controls active: canary percent, extra monitors, on-call coverage.
- New expiry (UTC): bounded, short, and explicit.
- Restoration owner: named person accountable for restoring baseline controls.
If any field is missing, reject the extension and either restore baseline policy or escalate for fresh incident command review.
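The completeness rule above is easy to check mechanically. Here is a minimal sketch, assuming the request has already been parsed into a dictionary; the key names and the `review_request` helper are hypothetical and chosen only to mirror the seven fields listed.

```python
# Hypothetical key names, one per mandatory field in the list above.
REQUIRED_FIELDS = (
    "previous_expiry_utc",
    "why_incomplete",
    "current_production_state",
    "updated_validation_evidence",
    "compensating_controls",
    "new_expiry_utc",
    "restoration_owner",
)


def _is_blank(value) -> bool:
    """Treat None, whitespace-only strings, and empty collections as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) == 0
    return False


def missing_fields(request: dict) -> list:
    """Return the required fields that are absent or blank, in checklist order."""
    return [name for name in REQUIRED_FIELDS if _is_blank(request.get(name))]


def review_request(request: dict):
    """Reject on any missing field; otherwise the request may go to reapproval."""
    missing = missing_fields(request)
    if missing:
        return "reject", missing
    return "eligible for reapproval", []
```

A passing check only means the request is complete. The approvers still judge whether the evidence justifies the extension.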
4. Copy-paste extension macros
Macro A: Extension request
Expiry extension request (rollback bypass)
Incident: INC-2026-02-17-019
Rollback PR: #8462
Previous expiry (UTC): 2026-02-17T00:30:00Z
Reason extension required: canary error budget check still running
Current impact: checkout latency elevated, partial recovery in progress
Updated evidence links:
- workflow run: [run-7]
- canary dashboard: [dash-3]
Compensating controls active:
- canary 5%
- on-call watch: @oncall-payments
- synthetic probe every 60s
Proposed new expiry (UTC): 2026-02-17T00:50:00Z
Restoration owner: @carol
Reapproval requested from IC + service owner.
Macro B: Reapproval confirmation
Reapproval recorded.
Role: Incident Commander
Decision: Approved extension until 2026-02-17T00:50:00Z
Conditions:
- No further extension without fresh evidence block
- Baseline policy restore posted by restoration owner before expiry
Macro C: Extension denied
Extension denied.
Reason: missing updated validation evidence and unclear restoration owner.
Action:
- restore baseline protections at current expiry
- continue rollback via queue-safe baseline path
- escalate to incident lead if customer impact worsens
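Before treating Macro B confirmations as sufficient, it can help to verify that two different people approved and that both approvals landed before the old expiry. The sketch below assumes approvals have been collected into simple records. The record keys, the `has_dual_reapproval` name, and the rule that "fresh" means "posted after the extension request" are assumptions for illustration.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp like 2026-02-17T00:30:00Z into an aware UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def has_dual_reapproval(approvals, request_posted_utc: str, previous_expiry_utc: str) -> bool:
    """True when two distinct approvers approved inside the valid window.

    The window starts when the extension request was posted (assumption) and
    ends at the previous expiry, so older approvals are never reused.
    """
    window_start = parse_utc(request_posted_utc)
    window_end = parse_utc(previous_expiry_utc)
    fresh_approvers = {
        a["approver"]
        for a in approvals
        if window_start <= parse_utc(a["approved_at_utc"]) < window_end
    }
    return len(fresh_approvers) >= 2


if __name__ == "__main__":
    approvals = [
        {"approver": "@ic-example", "approved_at_utc": "2026-02-17T00:26:00Z"},
        {"approver": "@owner-example", "approved_at_utc": "2026-02-17T00:28:00Z"},
    ]
    print(has_dual_reapproval(approvals, "2026-02-17T00:25:00Z", "2026-02-17T00:30:00Z"))
```

Counting distinct approvers rather than approval comments means one person approving twice does not satisfy the dual requirement.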
5. Guardrail metrics and anti-patterns
| Metric | Healthy | Escalate when |
|---|---|---|
| Extensions with missing dual approval | 0% | Any occurrence |
| Extensions with no restoration owner | 0% | Any occurrence |
| Average extension window length | 15-30 min | > 45 min |
| Extensions per incident | 0-2 | > 3 |
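The thresholds in the table can be applied to a per-incident record of extensions. This is a rough sketch under stated assumptions: the record keys and function names are hypothetical, values between the "Healthy" and "Escalate when" columns are reported as "watch" (the table does not name that band), and an average window below 15 minutes is treated as healthy.

```python
def band(value, healthy_max, escalate_above):
    """Classify a metric as healthy, watch, or escalate."""
    if value > escalate_above:
        return "escalate"
    if value <= healthy_max:
        return "healthy"
    return "watch"  # Assumption: the gap between the two table columns.


def evaluate_incident(extensions):
    """Evaluate one incident's extensions against the guardrail table.

    Each extension is a dict with window_minutes, approver_count, and
    restoration_owner (hypothetical keys).
    """
    count = len(extensions)
    missing_dual = sum(1 for e in extensions if e["approver_count"] < 2)
    missing_owner = sum(1 for e in extensions if not e.get("restoration_owner"))
    avg_window = sum(e["window_minutes"] for e in extensions) / count if count else 0.0
    return {
        # Zero tolerance: any occurrence escalates.
        "missing_dual_approval": band(missing_dual, 0, 0),
        "missing_restoration_owner": band(missing_owner, 0, 0),
        # Healthy up to 30 min, escalate above 45 min.
        "average_window_minutes": band(avg_window, 30, 45),
        # Healthy at 0-2 extensions, escalate above 3.
        "extensions_per_incident": band(count, 2, 3),
    }
```

Running this during the post-incident review, rather than mid-incident, keeps the metrics from adding load while responders are still working the rollback.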
6. FAQ
When should we request expiry extension instead of restoring baseline policy?
Request an extension only when customer impact is still active and rollback validation is still incomplete, and the request documents both with evidence. If service is stable, restore baseline controls immediately.
Does every extension require fresh dual approval?
Yes. Every extension is a new risk decision and needs fresh reapproval before the previous expiry cutoff.
How long should one extension window be?
Keep extensions short, usually 15 to 30 minutes, so risk is reassessed frequently.
What evidence is mandatory in an extension request?
Previous expiry, reason work is incomplete, current impact, updated validation links, active controls, new expiry, and restoration owner.
What is the most common failure in extension handling?
Approvals happen in chat only and never reach the PR timeline, which breaks auditability and causes expiry confusion.