
6.6 Root Cause Analysis (RCA)

In Total Productive Maintenance (TPM), fixing the machine is only the first step. The goal is not repair; it is Non-Recurrence. Root Cause Analysis (RCA) is the disciplined forensic engineering process that converts a failure into a knowledge asset. If a machine breaks twice for the same reason, the maintenance system, not the machine, has failed.

Breakdown Analysis Triggers

Not every jam requires a board meeting. RCA is expensive in terms of engineering hours; allocate it strictly to "Major Breakdowns" that threaten capacity.

  • Duration Trigger: Any unplanned downtime > 60 Minutes.
  • Frequency Trigger: Recurrence of the same error code or component failure within 30 Days (Chronic Failure).
  • Cost Trigger: Spare part replacement > $2,000 (e.g., Servo Amplifier, Vacuum Pump).
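
The three triggers can be expressed as a simple screening function. The following is a minimal sketch, not a prescribed implementation: the `Breakdown` record, its field names, and the `rca_triggers` helper are hypothetical, and it assumes that "within 30 Days" includes day 30 and that recurrence is matched on an identical fault code.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Thresholds taken from the trigger list above.
DURATION_LIMIT_MIN = 60
RECURRENCE_WINDOW = timedelta(days=30)  # assumption: day 30 itself still counts
COST_LIMIT_USD = 2000


@dataclass
class Breakdown:
    """Hypothetical breakdown record; the field names are illustrative."""
    day: date
    fault_code: str        # error code or failed component
    downtime_min: float    # unplanned downtime in minutes
    spare_cost_usd: float  # cost of spare parts replaced


def rca_triggers(event, history):
    """Return the names of the RCA triggers that `event` meets."""
    hit = []
    if event.downtime_min > DURATION_LIMIT_MIN:
        hit.append("duration")
    # Chronic failure: the same fault was already logged in the last 30 days.
    if any(
        past is not event
        and past.fault_code == event.fault_code
        and timedelta(0) <= event.day - past.day <= RECURRENCE_WINDOW
        for past in history
    ):
        hit.append("frequency")
    if event.spare_cost_usd > COST_LIMIT_USD:
        hit.append("cost")
    return hit


history = [Breakdown(date(2024, 3, 1), "E-417", 20, 0)]
short_repeat = Breakdown(date(2024, 3, 20), "E-417", 15, 0)
print(rca_triggers(short_repeat, history))
```

A 15-minute stop would not qualify on duration, but here it repeats a fault logged 19 days earlier, so the frequency trigger applies. An empty result means the event stays outside the RCA process.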

The "Physics of Failure" (5 Whys)

Do not describe what happened; explain the physical mechanism by which the component failed. The 5 Whys must then trace that mechanism back to a systemic lapse in the maintenance strategy.

  • Standard: Move from Phenomenon (Bearing Seized) -> Physical Cause (Lack of Lubrication) -> Systemic Cause (Missing from the Preventive Maintenance (PM) Schedule).
  • Example:
    1. Why did the machine stop? -> Z-Axis Motor Overload.
    2. Why overload? -> Ball screw experienced high friction.
    3. Why high friction? -> Grease hardened and contaminated.
    4. Why contaminated? -> Wiper seal was torn.
    5. Why was the torn seal not replaced? -> Root Cause: Wiper seals were not included in the Annual PM Replacement List.
  • Constraint: "Wear and Tear," "Old Age," or "Random Failure" are forbidden root causes. You must explain why it wore out prematurely or why the wear was not detected before failure.
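
The constraint on forbidden root causes is easy to enforce when RCA records are stored electronically. The sketch below is illustrative only: the `(question, answer)` chain layout and the `validate_five_whys` helper are assumptions, and a phrase match cannot judge whether a root cause is truly systemic. It only rejects the explicitly forbidden wording.

```python
# Phrases the section above forbids as root causes.
FORBIDDEN_ROOT_CAUSES = ("wear and tear", "old age", "random failure")


def validate_five_whys(chain):
    """Check a 5 Whys chain given as a list of (question, answer) pairs.

    The last answer is treated as the stated root cause. Returns a list of
    problems; an empty list means no forbidden wording was found.
    """
    if not chain:
        return ["chain is empty"]
    root_cause = chain[-1][1].lower()
    return [
        f'forbidden root cause: "{phrase}"'
        for phrase in FORBIDDEN_ROOT_CAUSES
        if phrase in root_cause
    ]


# The ball screw example from the list above.
chain = [
    ("Why did the machine stop?", "Z-Axis Motor Overload"),
    ("Why overload?", "Ball screw experienced high friction"),
    ("Why high friction?", "Grease hardened and contaminated"),
    ("Why contaminated?", "Wiper seal was torn"),
    ("Why was the torn seal not replaced?",
     "Wiper seals were not included in the Annual PM Replacement List"),
]
print(validate_five_whys(chain))
```

A chain that ends in "Bearing failed due to old age" would come back with one problem listed and should be returned to the investigator.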

Maintenance Fishbone (4M Analysis)

For complex breakdowns where the physical cause is ambiguous, use the 4M framework to investigate maintenance variables.

  • Machine: Was the component rated for this duty cycle? Was it modified? Is there excessive vibration?
  • Man (Technician): Was the last repair performed to torque spec? Was the technician certified?
  • Method (PM Procedure): Does the PM checklist explicitly cover this specific wear point? Is the frequency (e.g., Monthly) sufficient?
  • Material (Spare Parts): Was the replacement part OEM or a generic substitute? Was the lubricant within its shelf life?

Corrective Action: Closing the Loop

An RCA is only closed when the institutional memory is updated. The output must be a structural change, not a "reminder."

  1. Design Change (Maintenance Prevention): Modify the machine to eliminate the weak point (e.g., install a protective cover over the wiper seal).
  2. PM Optimization: Add the specific component to the Preventive Maintenance checklist or increase inspection frequency.
  3. AM Standard Update: Revise the Autonomous Maintenance (AM) standard so the operator can detect the early signs (e.g., "Add visual check of oil gauge to Daily Start-up Checklist").

Pro-Tip: A "Corrective Action" that says "Retrain Operator" is a failure of leadership. If the operator failed, the process was not robust enough to prevent human error.
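
The closure rule can be checked mechanically before an RCA is signed off. This is a rough sketch under stated assumptions: the `CorrectiveAction` record, the `ActionType` enum, and the `closure_blockers` helper are hypothetical names, and the keyword test for "retrain" or "remind" is a crude heuristic, not a substitute for review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActionType(Enum):
    """The three structural outputs listed above."""
    DESIGN_CHANGE = "Design Change (Maintenance Prevention)"
    PM_OPTIMIZATION = "PM Optimization"
    AM_STANDARD_UPDATE = "AM Standard Update"


# Assumption: these words mark an action as a reminder rather than a change.
NON_STRUCTURAL_WORDS = ("retrain", "remind")


@dataclass
class CorrectiveAction:
    description: str
    kind: Optional[ActionType]  # None = not one of the three structural outputs


def closure_blockers(actions):
    """Return the reasons the RCA cannot be closed; empty list means it can."""
    blockers = []
    if not actions:
        blockers.append("no corrective action recorded")
    for action in actions:
        text = action.description.lower()
        if action.kind is None:
            blockers.append(f"not a structural change: {action.description}")
        elif any(word in text for word in NON_STRUCTURAL_WORDS):
            blockers.append(f"reads as a reminder or retraining: {action.description}")
    return blockers


actions = [
    CorrectiveAction("Add wiper seals to the Annual PM Replacement List",
                     ActionType.PM_OPTIMIZATION),
]
print(closure_blockers(actions))
```

An action recorded as "Retrain Operator" with no structural type would block closure, which matches the Pro-Tip: the fix must change the process, not the person.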

Final Checklist

Parameter            | Metric / Rule        | Critical State
RCA Trigger (Time)   | Duration             | > 60 Minutes
RCA Trigger (Repeat) | Frequency            | 2x in 30 Days
Methodology          | Logic Tool           | 5 Whys (Physics of Failure)
Forbidden Causes     | Invalid Explanations | "Wear and Tear" / "Old Age"
Closure Criteria     | Action Required      | Update PM/AM Checklist
Design Change        | Requirement          | If PM is Impossible
Validation           | Success Metric       | Zero Recurrence (90 Days)
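
The Validation entry of the checklist (zero recurrence for 90 days) can be tracked with a small status function. This is a sketch only: `validation_status` is a hypothetical helper, and it assumes the 90-day window starts on the day the corrective action is implemented, which the checklist does not specify.

```python
from datetime import date, timedelta

VALIDATION_WINDOW = timedelta(days=90)  # from the checklist's success metric


def validation_status(action_done_on, recurrence_dates, today):
    """Classify a corrective action as 'failed', 'validated', or 'pending'.

    Assumption: the window opens on the day the action was implemented and
    only recurrences of the same failure are passed in `recurrence_dates`.
    """
    window_end = action_done_on + VALIDATION_WINDOW
    # Any recurrence inside the window means the root cause was not removed.
    if any(action_done_on < d <= window_end for d in recurrence_dates):
        return "failed"
    # The window must have fully elapsed before success can be declared.
    return "validated" if today > window_end else "pending"


done = date(2024, 1, 10)
print(validation_status(done, [], done + timedelta(days=30)))
```

A "failed" status should reopen the RCA rather than trigger another repair, since a repeat failure indicates the stated root cause was wrong or the corrective action was not structural.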