Structured Problem Solving
The discipline: you don’t reach “solution” until you’ve found the root cause. Not the first plausible cause, but the systemic condition that, if fixed, prevents recurrence. The tools on this page enforce that discipline.
A3 Thinking
A one-page problem-solving document, named for the A3 sheet it fits on (roughly 11×17 in). The page constraint forces disciplined thinking before implementation.
Eight sections:
| Section | What it requires |
|---|---|
| 1. Theme / Problem Statement | Quantified. “Zone 3 mis-pick rate 1.8% vs ≤0.5% target, $2,400/week in re-pick cost” |
| 2. Background | Why it matters to the operation; client SLA; risk if trend continues |
| 3. Current Condition | Data and process map. Not anecdote — pick error counts by picker, SKU, shift, day |
| 4. Root Cause Analysis | Cannot proceed to countermeasures until this section is complete and agreed |
| 5. Countermeasures | Specific actions, not aspirations. Named, dated. “Separate SKUs A/B by ≥36 in by [date]” |
| 6. Implementation Plan | Named individuals, specific dates. No collective nouns |
| 7. Results Verification | Compare post-implementation data to baseline. Build a checkpoint into the plan |
| 8. Follow-Up Actions | Standardize what worked; identify next PDCA cycle for residual gap |
When to use: When the same problem appears in the Tier 2 huddle for the third time. The discipline of completing sections 1–4 before touching section 5 is what separates A3 thinking from a meeting with notes.
Half an A3 is not an A3 — it’s an email with a fancy template.
5 Whys — All 5 Levels Deep
The trap: stopping at the proximate cause (usually by Why 2 or 3), producing a local fix that doesn’t prevent the next occurrence.
Worked example — Zone 3 mis-picks:
| Why | Answer | Type of fix |
|---|---|---|
| 1. Why did picks go wrong? | Pickers selected wrong item from adjacent slot | Observable symptom; some teams stop here |
| 2. Why wrong item? | Slot labels for SKUs A/B are similar, 6 inches apart on same shelf | Proximate cause; most DC problem-solving stops here |
| 3. Why adjacent slots? | Similar-looking SKUs were slotted adjacent | Getting closer |
| 4. Why no separation? | Q1 slotting logic had no check for item similarity before finalizing assignments | Process failure |
| 5. Why no similarity check? | Slotting process designed 5 years ago for velocity/cube; look-alike risk not identified because SKU proliferation added 340 new items to Zone 3 in 18 months | Root cause |
Fix at Why 2: better labels, and separate these two SKUs. About 2 hours of work; helps Zone 3 for these two SKUs only.
Fix at Why 5: add a look-alike proximity check to the slotting process for every zone. About one day of work; prevents this class of error across the building for every future slotting event.
That is the difference between a local fix and a systemic fix.
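A look-alike proximity check of the kind described at Why 5 might look something like the sketch below. Everything in it is an assumption made for illustration: the `Slot` record, the `lookalike_group` tag, and the single `position_in` coordinate along a pick face are hypothetical, and a real slotting tool would need to account for shelves, levels, and aisles. The 36-inch default is borrowed from the A3 countermeasure example.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional

MIN_SEPARATION_IN = 36  # assumed default, taken from the A3 countermeasure example


@dataclass(frozen=True)
class Slot:
    sku: str
    zone: str
    position_in: float                # hypothetical: inches along the pick face
    lookalike_group: Optional[str]    # hypothetical tag for visually similar items


def lookalike_conflicts(slots, min_separation_in=MIN_SEPARATION_IN):
    """Return (sku, sku) pairs in the same zone and look-alike group
    that are slotted closer together than the minimum separation."""
    conflicts = []
    for a, b in combinations(slots, 2):
        same_group = (
            a.lookalike_group is not None
            and a.lookalike_group == b.lookalike_group
        )
        too_close = abs(a.position_in - b.position_in) < min_separation_in
        if same_group and a.zone == b.zone and too_close:
            conflicts.append((a.sku, b.sku))
    return conflicts
```

Run before slot assignments are finalized, a check like this would flag the two SKUs in the worked example (same group, 6 inches apart) while leaving untagged items alone.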
Fishbone (Ishikawa) — DC-Adapted Categories
A brainstorming tool for root cause generation. It forces consideration of multiple cause categories before selecting which to investigate, and prevents anchoring on the first plausible explanation.
DC-adapted 6 categories (vs. manufacturing 6M):
| Category | DC Examples |
|---|---|
| Manpower | Training gaps on seasonal temps; insufficient QC staffing during surge; supervisor vacation coverage with junior leads |
| Method | Scan-verify step not consistently followed during surge; wave release timing creating a rush; no standard work for look-alike zones |
| Material | SKU packaging changed — three items now look identical; item master dimensions not updated |
| Machine | RF scanner batteries dying mid-shift causing scan bypasses; print-and-apply applying labels at 3° skew → barcodes unscannable → manual override entries |
| Measurement | Returns classified by customer-stated reason (“wrong item”) rather than root cause (mis-pick vs. mis-ship vs. customer error) — masks whether this is a pick or pack problem |
| Environment | Poor lighting in Zone 4 creating read errors on look-alike packaging; 89°F in pack area during June affecting concentration |
The rule: The fishbone generates hypotheses; it does not solve problems. Build it in the meeting room; verify it on the floor. The branches with the most data support become the inputs to 5 Whys.
Pareto Analysis
Always run the Pareto. If you’re having a CI prioritization conversation and nobody has a Pareto on the table, stop the meeting and go build one.
The 80/20 principle is empirically reliable enough in DC operations to treat as a starting assumption.
DC applications:
| Pareto of | What it reveals |
|---|---|
| Mis-picks by SKU | Typically 15–20 SKUs generate 70–80% of all mis-picks → targeted intervention only on those |
| Customer complaints by reason | Prioritizes between process failures (wrong item vs. missing item are different root causes) |
| Emergency replenishment trips by zone | Zone generating 3× the trips has a slotting problem (face qty, min/max triggers, or velocity shift) |
| Overtime hours by function | Function generating 55% of total overtime has a capacity or process mismatch not reflected in the schedule |
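Building the Pareto is mostly counting and a running total. The following is a minimal sketch, assuming the input is simply a list with one entry per event (for example, the SKU of each mis-pick); the function name `pareto_vital_few` and the 80% default threshold are illustrative choices, not a standard.

```python
from collections import Counter


def pareto_vital_few(events, threshold=0.80):
    """Return (key, count, cumulative_share) rows, largest first,
    stopping once the cumulative share reaches the threshold."""
    counts = Counter(events)
    total = sum(counts.values())
    rows, running = [], 0
    for key, count in counts.most_common():
        running += count
        rows.append((key, count, running / total))
        if running / total >= threshold:
            break
    return rows
```

The rows returned are the candidates for targeted intervention; everything below the cutoff is the long tail that can wait for a later cycle.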
FMEA — Getting Ahead of Failures
Failure Mode and Effects Analysis is the proactive tool: use it before a new process goes live, not after the failure occurs.
Structure for each process step:
RPN = Occurrence (O) × Severity (S) × Detection (D), each rated 1–10
- O: How likely the failure is to occur
- S: How bad the downstream effect is
- D: How hard the failure is to catch before it causes damage (10 = nearly impossible to detect)
High-RPN DC examples:
| Failure Mode | O | S | D | RPN | Mitigation |
|---|---|---|---|---|---|
| WMS location mismatch during rack move | 6 | 8 | 5 | 240 | Cycle count validation before rack move is released |
| Barcode scan bypass during scanner fault | 7 | 7 | 4 | 196 | Scanner fault auto-triggers zone hold |
| Look-alike SKU slotted adjacent | 5 | 6 | 6 | 180 | Proximity check in slotting SOP |
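The RPN arithmetic in the table can be reproduced with a short sketch like the one below. The `FailureMode` class and `rank_by_rpn` helper are hypothetical names chosen for illustration, not part of any FMEA standard or tool; the 1–10 range check follows the rating scale described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    occurrence: int
    severity: int
    detection: int  # 10 = nearly impossible to detect

    def __post_init__(self):
        # Reject ratings outside the 1-10 scale.
        for field_name in ("occurrence", "severity", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be 1-10, got {value}")

    @property
    def rpn(self):
        return self.occurrence * self.severity * self.detection


def rank_by_rpn(modes):
    """Sort failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


modes = [
    FailureMode("Look-alike SKU slotted adjacent", 5, 6, 6),
    FailureMode("WMS location mismatch during rack move", 6, 8, 5),
    FailureMode("Barcode scan bypass during scanner fault", 7, 7, 4),
]
ranked = rank_by_rpn(modes)  # RPNs in order: 240, 196, 180
```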
Best FMEA applications in DC:
- New automation system go-live
- Slotting redesigns moving large numbers of SKUs simultaneously
- New carrier onboarding (label spec mismatch)
- Seasonal surge prep (temp workforce doubles in 3 weeks)
CAPA — Closing the Loop
Corrective Action and Preventive Action is the formal handoff from investigation to accountability.
Trigger: Root cause is agreed upon via 5 Whys or A3.
Two elements CAPA adds:
- Named owner — not “the team,” not “operations,” a specific person
- Verification step — someone independently confirms the action was taken and that it worked
“Did we do the action?” and “did the action fix the problem?” are different questions. The second one is the one that matters.
The CAPA tracker (maintained by CI engineer) lists: problem, root cause, action, owner, due date, verification method.
- Actions still open at the 30-day follow-up → remain on the CAPA tracker
- Actions open at 60 days → escalate to DC manager agenda
- Actions open at 90 days → structural escalation
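The escalation ladder above is simple enough to automate against the tracker. The sketch below is one possible reading, and it assumes that age is counted in calendar days from the date the action was opened; the text does not say whether the clock starts at the open date or the due date, so adjust accordingly. The function name and the returned labels are illustrative.

```python
from datetime import date


def escalation_level(opened, today, closed=False):
    """Map an action's age in days to the escalation tier.

    Assumption: age is measured from the date the action was opened.
    """
    if closed:
        return "closed"
    age_days = (today - opened).days
    if age_days >= 90:
        return "structural escalation"
    if age_days >= 60:
        return "DC manager agenda"
    if age_days >= 30:
        return "CAPA tracker follow-up"
    return "open"
```

A CI engineer could run this over every open row in the tracker before the weekly review, so that escalations are driven by the calendar rather than by memory.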
See Tier Huddle System for how the CAPA tracker connects to the daily management cadence.