5.4 Root Cause Analysis (RCA)

In Total Productive Maintenance (TPM), Root Cause Analysis is the engineering discipline used to eliminate recurring equipment breakdowns. "Fixing" a machine restores it to operation; RCA determines why it failed so that the failure does not recur. This chapter defines the mandatory investigation protocols for unplanned downtime, with the aim of extending Mean Time Between Failures (MTBF) and eliminating chronic stops.

Breakdown Analysis Triggers

Not every machine stop requires a full engineering investigation. RCA is mandated strictly for "Major Breakdowns" that disrupt the facility's capacity.

  • Duration Trigger: Any unplanned equipment downtime exceeding 60 minutes.
  • Frequency Trigger: Any recurrence of the same error code or component failure on the same asset within 30 days (Chronic Failure).
  • Cost Trigger: Any failure requiring a spare part replacement exceeding $2,000 (e.g., Servo Amplifier, Vacuum Pump).
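The three triggers above can be sketched as a small screening function. This is a minimal illustration, not a prescribed implementation: the `Breakdown` record layout and the function name are hypothetical, and treating "within 30 days" as an inclusive window measured between start times is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the trigger list above.
DURATION_LIMIT_MIN = 60
RECURRENCE_WINDOW = timedelta(days=30)  # assumed inclusive
PART_COST_LIMIT_USD = 2000.0


@dataclass
class Breakdown:
    # Hypothetical record layout, for illustration only.
    asset_id: str
    failure_code: str      # error code or failed component
    started_at: datetime
    downtime_min: float    # unplanned downtime in minutes
    part_cost_usd: float   # cost of the replaced spare part


def rca_triggers(event: Breakdown, history: list[Breakdown]) -> list[str]:
    """Return the names of the RCA triggers that the event meets."""
    triggers = []
    if event.downtime_min > DURATION_LIMIT_MIN:
        triggers.append("duration")
    # Chronic failure: same code on the same asset in the prior 30 days.
    for past in history:
        gap = event.started_at - past.started_at
        if (past.asset_id == event.asset_id
                and past.failure_code == event.failure_code
                and timedelta(0) < gap <= RECURRENCE_WINDOW):
            triggers.append("frequency")
            break
    if event.part_cost_usd > PART_COST_LIMIT_USD:
        triggers.append("cost")
    return triggers
```

An empty result means the stop does not qualify as a Major Breakdown; any non-empty result means a full RCA is required.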

The "Physics of Failure" (5 Whys)

TPM requires analyzing the physical mechanism of the failure, not just the symptom. The 5 Whys must trace the defect back to a lapse in the maintenance system.

  • Standard: The analysis must move from the Phenomenon (Bearing seized) to the Physical Cause (Lack of lubrication) to the Systemic Cause (Preventive Maintenance (PM) schedule missing, or Autonomous Maintenance (AM) operator check not performed).
  • Example:
    1. Why did the machine stop? → Z-axis motor overload.
    2. Why overload? → Ball screw experienced high friction.
    3. Why high friction? → Grease had hardened and become contaminated.
    4. Why contaminated? → Wiper seal was torn.
    5. Root Cause: Wiper seals were not included on the annual PM replacement list.
  • Constraint: "Part Failure" or "Wear and Tear" are not acceptable root causes. You must explain why it wore out prematurely or why it was not caught before failure.
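The constraint above lends itself to a simple automated screen before a report reaches review. The sketch below is hypothetical (the function name and the list-of-answers representation are assumptions) and only catches bare labels such as "Wear and Tear"; judging whether the chain really reaches a systemic cause still needs an engineer.

```python
# Bare labels that the constraint above rejects as root causes.
UNACCEPTABLE_ROOT_CAUSES = ("part failure", "wear and tear")


def check_five_whys(chain: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the chain passes.

    `chain` holds the answer to each "why", in order. The last entry
    is treated as the stated root cause.
    """
    if not chain:
        return ["chain is empty"]
    problems = []
    for i, answer in enumerate(chain, start=1):
        if not answer.strip():
            problems.append(f"step {i} has no answer")
    # Normalize the root cause: trim whitespace and trailing punctuation.
    root = chain[-1].strip().strip(".!").lower()
    if root in UNACCEPTABLE_ROOT_CAUSES:
        problems.append(f'root cause is a bare label: "{root}"')
    return problems


example = [
    "Z-axis motor overload.",
    "Ball screw experienced high friction.",
    "Grease had hardened and become contaminated.",
    "Wiper seal was torn.",
    "Wiper seals were not included on the annual PM replacement list.",
]
print(check_five_whys(example))  # the chain from the example above
```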

Maintenance Fishbone (4M Analysis)

For complex breakdowns where the physical cause is not obvious, use a modified Ishikawa diagram focusing strictly on maintenance factors.

  1. Machine: Is the component rated for this load? Was it modified? Is there vibration?
  2. Man (Technician): Was the last repair performed correctly? Was the torque spec followed?
  3. Method (PM Procedure): Does the PM checklist cover this specific component? Is the frequency correct?
  4. Material (Spare Parts): Was the replacement part OEM or a generic substitute? Was the lubricant shelf-life valid?

Corrective Action: Closing the Loop

A TPM RCA is only closed when the maintenance strategy is updated. The solution must fall into one of three categories:

  1. Design Change (MP - Maintenance Prevention): Modifying the machine to eliminate the weak point (e.g., replacing a belt drive with a direct drive, installing a protective cover).
  2. PM Optimization: Adding the specific component to the Preventive Maintenance checklist or increasing the inspection frequency.
  3. AM Standard Update: Empowering the operator to detect the early signs (e.g., "Add visual check of oil gauge to Daily Start-up Checklist").

The Breakdown Report

All qualifying events must be documented in a "Breakdown Report" (not a generic QMS form).

  • Required Fields:
    • Time Down: Total minutes from stop to full speed validation.
    • MTTR: Actual repair time, recorded separately from diagnosis time.
    • Phenomenon: Photo of the damaged part.
    • Countermeasure: Specific link to the updated PM/AM document ID.
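One way to enforce the required fields, together with the rule that an RCA closes only once the maintenance strategy is updated, is a record that refuses to close while anything is missing. This is a sketch under stated assumptions: the class, field names, and the example document ID are hypothetical and do not reflect any particular CMMS.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Countermeasure(Enum):
    # The three corrective-action categories from this chapter.
    DESIGN_CHANGE = "Design Change (MP)"
    PM_OPTIMIZATION = "PM Optimization"
    AM_STANDARD_UPDATE = "AM Standard Update"


@dataclass
class BreakdownReport:
    # Hypothetical field names mapped to the required fields above.
    time_down_min: Optional[float] = None      # stop to full-speed validation
    diagnosis_min: Optional[float] = None      # time spent diagnosing
    repair_min: Optional[float] = None         # actual repair time
    phenomenon_photo: Optional[str] = None     # path or link to the photo
    countermeasure: Optional[Countermeasure] = None
    countermeasure_doc_id: Optional[str] = None  # updated PM/AM document ID

    def missing_fields(self) -> list[str]:
        """Names of required fields that are still empty."""
        return [name for name, value in vars(self).items()
                if value is None or value == ""]

    def can_close(self) -> bool:
        """The RCA closes only when every required field is filled."""
        return not self.missing_fields()
```

For example, a report with times and a photo but no countermeasure returns `["countermeasure", "countermeasure_doc_id"]` from `missing_fields()` and cannot be closed until both are supplied.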

Final Checklist

Protocol   | Parameter          | Mandate
-----------|--------------------|--------------------------------
Trigger    | Downtime Duration  | > 60 Minutes
Trigger    | Chronic Failure    | 2x in 30 Days
Analysis   | Method             | 5 Whys (Physics of Failure)
Constraint | Unacceptable Cause | "Wear and Tear" / "Old Age"
Output     | Document Update    | Must update PM or AM Checklist
Output     | Design Change      | Required if PM is impossible
Metric     | Success Criteria   | Zero recurrence for 90 days
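
The 90-day success criterion can be tracked with a small status check. This is a minimal sketch: the function name and status labels are hypothetical, and counting the 90 days from the date the countermeasure was implemented is an assumption, since the chapter does not state the starting point.

```python
from datetime import date, timedelta

MONITORING_PERIOD = timedelta(days=90)


def verification_status(implemented_on: date,
                        recurrence_dates: list[date],
                        today: date) -> str:
    """Classify an RCA as 'failed', 'monitoring', or 'verified'.

    Assumes the 90 days start when the countermeasure is implemented.
    """
    window_end = implemented_on + MONITORING_PERIOD
    # Any recurrence of the same failure inside the window fails the RCA.
    if any(implemented_on <= d <= window_end for d in recurrence_dates):
        return "failed"
    if today < window_end:
        return "monitoring"
    return "verified"
```

A "failed" status would normally mean the stated root cause was wrong or the countermeasure was ineffective, so the analysis is reopened.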