6.6 Root cause analysis
In a mature
Breakdown analysis triggers
Not every minor equipment hiccup warrants an in-depth, multi-hour investigation. Because a thorough RCA consumes dedicated engineering hours, it must be reserved for major breakdowns that materially threaten the production schedule.
- Duration Trigger: Any unplanned downtime event lasting > 60 minutes.
- Frequency Trigger: Recurrence of the same error code or the same component failure within a 30-day window (this indicates a chronic failure).
- Cost Trigger: Any spare part replacement costing > $2,000 (e.g., a servo amplifier or a primary vacuum pump).
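
The three triggers can be checked mechanically against a downtime log. The following is a minimal sketch, not a reference implementation: the `Breakdown` record and its field names are hypothetical, only the error-code half of the frequency rule is modeled, and restricting the recurrence check to the same machine is an assumption the text does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the trigger list above.
DURATION_LIMIT = timedelta(minutes=60)
RECURRENCE_WINDOW = timedelta(days=30)
PART_COST_LIMIT_USD = 2000.0


@dataclass
class Breakdown:
    """Hypothetical breakdown record; the field names are illustrative."""
    machine_id: str
    error_code: str
    started_at: datetime
    downtime: timedelta
    part_cost_usd: float = 0.0


def rca_triggers(event: Breakdown, history: list[Breakdown]) -> list[str]:
    """Return the name of every trigger that this breakdown trips."""
    tripped = []
    if event.downtime > DURATION_LIMIT:
        tripped.append("duration")
    # Frequency: same error code on the same machine (assumption) in the
    # 30 days before this event.
    for past in history:
        same_failure = (past.machine_id == event.machine_id
                        and past.error_code == event.error_code)
        gap = event.started_at - past.started_at
        if same_failure and timedelta(0) < gap <= RECURRENCE_WINDOW:
            tripped.append("frequency")
            break
    if event.part_cost_usd > PART_COST_LIMIT_USD:
        tripped.append("cost")
    return tripped
```

An empty result means the event can be handled as a routine repair; any non-empty result means an RCA should be opened.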
The physics of failure (5 whys)
Describing what happened in broad terms is insufficient; the analysis must explain exactly why the physics of the specific component failed. The
- Standard: The analysis must always move linearly from the Phenomenon (Bearing Seized) -> Physical Cause (Lack of Lubrication) -> Systemic Cause (PM Schedule Missing or Ignored).
- Example:
  - Why did the machine stop? -> The Z-Axis Motor threw an Overload alarm.
  - Why overload? -> The vertical ball screw experienced extremely high friction.
  - Why high friction? -> The lubricating grease hardened and became heavily contaminated.
  - Why contaminated? -> The protective wiper seal was torn.
  - Root Cause: Wiper seals were erroneously omitted from the Annual PM Replacement List.
- Constraint: Terms like “Wear and Tear,” “Old Age,” or “Random Failure” are unacceptable root causes. The analysis must explain exactly why the part wore out prematurely or, crucially, why that specific wear was not detected by the existing systems before the failure occurred.
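
The standard and the constraint above can be enforced when an RCA report is submitted. The sketch below is one possible shape, under stated assumptions: the `WhyChain` record and `validate_chain` function are hypothetical names, and the forbidden-phrase check is a simple substring match rather than anything more robust.

```python
from dataclasses import dataclass

# Explanations the constraint above rules out as root causes.
FORBIDDEN_ROOT_CAUSES = ("wear and tear", "old age", "random failure")


@dataclass
class WhyChain:
    """Hypothetical record of one 5-Whys analysis."""
    phenomenon: str    # what was observed
    whys: list[str]    # successive physical causes, in order
    root_cause: str    # the systemic cause


def validate_chain(chain: WhyChain) -> list[str]:
    """Return a list of problems; an empty list means the chain is acceptable."""
    problems = []
    if not chain.phenomenon.strip():
        problems.append("phenomenon is missing")
    if not chain.whys:
        problems.append("no physical cause recorded")
    if not chain.root_cause.strip():
        problems.append("root cause is missing")
    lowered = chain.root_cause.lower()
    for phrase in FORBIDDEN_ROOT_CAUSES:
        if phrase in lowered:
            problems.append(f"forbidden root cause: {phrase!r}")
    return problems


# The Z-axis example from the text, expressed as a chain.
z_axis = WhyChain(
    phenomenon="Z-Axis Motor threw an Overload alarm",
    whys=[
        "Vertical ball screw experienced extremely high friction",
        "Lubricating grease hardened and became contaminated",
        "Protective wiper seal was torn",
    ],
    root_cause="Wiper seals were omitted from the Annual PM Replacement List",
)
```

Here `validate_chain(z_axis)` returns an empty list, whereas a report whose root cause reads "normal wear and tear" would be sent back to the author.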
Maintenance fishbone (4M analysis)
For complex breakdowns where the initial physical cause is ambiguous, the structured 4M framework should be used to investigate every maintenance variable.
- Machine: Was the failed component actually rated for this specific, heavy duty cycle? Was it improperly modified? Is there undiagnosed excessive vibration transferring from a neighboring subsystem?
- Man (Technician): Was the last repair performed to the exact factory torque specification? Was the technician currently certified for this procedure?
- Method (PM Procedure): Does the current Preventive Maintenance (PM) checklist explicitly and clearly cover this specific wear point? Is the current frequency (e.g., Monthly) sufficient for the actual machine run-hours?
- Material (Spare Parts): Was the installed replacement part a genuine OEM component or a cheaper generic substitute? Was the grease or chemical lubricant used still within its valid shelf-life?
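
One simple way to make sure no branch of the fishbone is skipped is to record findings per category and list the categories that are still empty. This is only an illustrative sketch; the `uninvestigated` helper and the dictionary layout are assumptions, and a category that was checked and found clean should be recorded explicitly (e.g., "no issue found") so that it counts as investigated.

```python
# The four branches of the maintenance fishbone described above.
FOUR_M = ("Machine", "Man", "Method", "Material")


def uninvestigated(findings: dict[str, list[str]]) -> list[str]:
    """Return the 4M categories that have no recorded finding yet."""
    return [category for category in FOUR_M if not findings.get(category)]


# Hypothetical findings from a partially completed investigation.
findings = {
    "Machine": ["Vibration measured on neighboring spindle housing"],
    "Method": ["Monthly PM interval not adjusted for actual run-hours"],
}
remaining = uninvestigated(findings)  # ["Man", "Material"]
```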
Corrective action: closing the loop
An RCA is only truly closed when the facility’s institutional memory is permanently updated. The required output must be a structural change, not merely a weak “reminder to be more careful next time.”
- Design Change (Maintenance Prevention): The machine hardware must be modified to completely eliminate the weak point (e.g., proactively install a protective steel cover directly over the vulnerable wiper seal).
- PM Optimization: The failed component must be formally added to the Preventive Maintenance checklist, or its inspection frequency must be increased based on the newly discovered wear rate.
- AM Standard Update: The line operator must be empowered to easily detect the early warning signs (e.g., “Add a quick visual check of the critical oil gauge to the Operator’s Daily Start-up Checklist”).
Pro-Tip: A
Final Checkout: Root cause analysis (RCA)
| Parameter | Metric / Rule | Critical State |
|---|---|---|
| RCA Trigger (Time) | Duration | > 60 Minutes |
| RCA Trigger (Repeat) | Frequency | 2x in 30 Days |
| RCA Trigger (Cost) | Spare Part Cost | > $2,000 |
| Methodology | Analysis Tool | 5 Whys / 4M Fishbone |
| Forbidden Causes | Invalid Explanations | “Wear and Tear” / “Old Age” / “Random Failure” |
| Closure Criteria | Action Required | Update Preventive Maintenance (PM) / Autonomous Maintenance (AM) Checklist |
| Design Change | Requirement | If Preventive Maintenance is Impossible |
| Validation | Success Metric | Zero Recurrence (90 Days) |
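
The closure and validation rows can be combined into a single status check. The sketch below assumes a hypothetical `RcaRecord` with illustrative action labels, and it assumes the 90-day window starts on the day the corrective action is implemented, which the table does not specify.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Validation window from the checkout table.
VALIDATION_WINDOW = timedelta(days=90)
# Illustrative labels for the three structural corrective actions.
STRUCTURAL_ACTIONS = {"design_change", "pm_optimization", "am_standard_update"}


@dataclass
class RcaRecord:
    """Hypothetical RCA record; the field names are illustrative."""
    implemented_on: date                # assumed start of the 90-day window
    actions: set[str]                   # corrective actions taken
    recurrences: list[date] = field(default_factory=list)


def closure_status(rca: RcaRecord, today: date) -> str:
    """Classify an RCA as open, failed, pending, or validated."""
    if not (rca.actions & STRUCTURAL_ACTIONS):
        return "open: no structural corrective action"
    deadline = rca.implemented_on + VALIDATION_WINDOW
    if any(rca.implemented_on < d <= deadline for d in rca.recurrences):
        return "failed: recurrence inside validation window"
    if today < deadline:
        return "pending: validation window still running"
    return "validated"
```

An RCA whose only recorded action is a verbal reminder stays open under this check, however much time has passed.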