6.6 Root Cause Analysis (RCA)

In the context of Total Productive Maintenance (TPM), "fixing" the machine is merely the first step. The goal is not repair; it is Non-Recurrence. Root Cause Analysis (RCA) is the forensic process used to convert a failure into an asset of knowledge. If a machine breaks twice for the same reason, the maintenance system, not the machine, has failed. This chapter mandates the investigation protocols for unplanned downtime, aiming to extend Mean Time Between Failures (MTBF) and eliminate chronic stops.

Breakdown Analysis Triggers

Not every machine jam requires a board meeting. RCA is expensive in terms of engineering hours; allocate it strictly to "Major Breakdowns" that threaten capacity.

  • Duration Trigger: Any unplanned equipment downtime exceeding 60 minutes.
  • Frequency Trigger: Any recurrence of the same error code or component failure on the same asset within 30 days (Chronic Failure).
  • Cost Trigger: Any failure requiring a spare part replacement exceeding $2,000 (e.g., Servo Amplifier, Vacuum Pump).
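
The three triggers above can be sketched as a simple screening function. This is a minimal illustration, not a prescribed implementation: the `Breakdown` record and its field names are hypothetical, `history` is assumed to hold only earlier events (not the one being screened), and "within 30 days" is read as inclusive.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Thresholds taken from the trigger list above.
DURATION_LIMIT_MIN = 60
RECURRENCE_WINDOW = timedelta(days=30)
COST_LIMIT_USD = 2000


@dataclass
class Breakdown:
    """Hypothetical breakdown record; field names are assumptions."""
    asset_id: str
    failure_code: str
    occurred_on: date
    downtime_min: float
    spare_cost_usd: float


def rca_triggers(event, history):
    """Return the name of every trigger the event trips (empty list = no RCA)."""
    tripped = []
    if event.downtime_min > DURATION_LIMIT_MIN:
        tripped.append("duration")
    for past in history:
        same_failure = (past.asset_id == event.asset_id
                        and past.failure_code == event.failure_code)
        gap = event.occurred_on - past.occurred_on
        if same_failure and timedelta(0) <= gap <= RECURRENCE_WINDOW:
            tripped.append("frequency")
            break
    if event.spare_cost_usd > COST_LIMIT_USD:
        tripped.append("cost")
    return tripped
```

An event that trips none of the three is handled as a routine repair; any non-empty result means a full RCA is owed.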

The "Physics of Failure" (5 Whys)

Do not describe what happened; explain why the physics of the component failed. The 5 Whys must trace the defect back to a systemic lapse in the maintenance strategy.

  • Standard: Move from the Phenomenon (Bearing seized) -> Physical Cause (Lack of lubrication) -> Systemic Cause (PM schedule missing or AM operator failed to check).
  • Example:
    1. Why did the machine stop? -> Z-axis motor overload.
    2. Why overload? -> Ball screw experienced high friction.
    3. Why high friction? -> Grease had hardened and contaminated.
    4. Why contaminated? -> Wiper seal was torn.
    5. Root Cause: Wiper seals were not included in the annual PM replacement list.
  • Constraint: "Part Failure," "Wear and Tear," "Old Age," and "Random Failure" are forbidden root causes. You must explain why it wore out prematurely or why the wear was not detected before failure.
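
The constraint above lends itself to an automated check before an RCA is accepted. The sketch below is one possible way to do it, assuming root causes are entered as free text; the function name and the substring matching are illustrative choices, not part of the protocol.

```python
# Phrases the chapter forbids as root causes (matched case-insensitively).
FORBIDDEN_CAUSES = ("part failure", "wear and tear", "old age", "random failure")


def validate_five_whys(answers, root_cause):
    """Return a list of problems; an empty list means the chain passes."""
    problems = []
    if not answers:
        problems.append("no 'why' answers recorded")
    lowered = root_cause.lower()
    for phrase in FORBIDDEN_CAUSES:
        if phrase in lowered:
            problems.append(f"forbidden root cause: {phrase}")
    return problems


# The worked example from this section.
chain = [
    "Z-axis motor overload",
    "Ball screw experienced high friction",
    "Grease had hardened and contaminated",
    "Wiper seal was torn",
]
root = "Wiper seals were not included in the annual PM replacement list"
```

A text match only catches the forbidden wording; whether the stated cause is truly systemic still needs an engineer's review.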

Maintenance Fishbone (4M Analysis)

For complex breakdowns where the physical cause is ambiguous, use the 4M framework (a modified Ishikawa diagram) to investigate maintenance variables.

    • Machine: Was the component rated for this duty cycle? Was it modified? Is there excessive vibration?
    • Man (Technician): Was the last repair performed to torque spec? Was the technician certified?
    • Method (PM Procedure): Does the PM checklist explicitly cover this specific wear point? Is the frequency (e.g., Monthly) sufficient?
    • Material (Spare Parts): Was the replacement part OEM or a generic substitute? Was the lubricant shelf-life valid?

Corrective Action: Closing the Loop

An RCA is only closed when the institutional memory is updated. The output must be a structural change, not a "reminder," and must fall into one of three categories:

    1. Design Change (MP - Maintenance Prevention): Modify the machine to eliminate the weak point (e.g., replace a belt drive with a direct drive, or install a protective cover over the wiper seal).
    2. PM Optimization: Add the specific component to the Preventive Maintenance checklist or increase inspection frequency.
    3. AM Standard Update: Empower the operator to detect the early signs (e.g., "Add visual check of oil gauge to Daily Start-up Checklist").

Pro-Tip: A "Corrective Action" that says "Retrain Operator" is a failure of leadership. If the operator failed, the process was not robust enough to prevent human error.

The Breakdown Report

All qualifying events must be documented in a "Breakdown Report" (not a generic QMS form).

  • Required Fields:
    • Time Down: Total minutes from stop to full speed validation.
    • MTTR: Actual repair time vs. diagnosis time.
    • Phenomenon: Photo of the damaged part.
    • Countermeasure: Specific link to the updated PM/AM document ID.
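
As a rough illustration of how the Breakdown Report fields could be captured and checked for completeness, consider the sketch below. The class and field names are hypothetical, and splitting MTTR into separate diagnosis and repair minutes is an assumption about how "repair time vs. diagnosis time" would be recorded.

```python
from dataclasses import dataclass


@dataclass
class BreakdownReport:
    """Hypothetical report record; all field names are assumptions."""
    time_down_min: float        # stop to full-speed validation
    diagnosis_min: float        # portion of MTTR spent finding the fault
    repair_min: float           # portion of MTTR spent on the repair itself
    phenomenon_photo: str       # path or ID of the damaged-part photo
    countermeasure_doc_id: str  # ID of the updated PM/AM document

    def missing_fields(self):
        """List required fields that are empty or non-positive."""
        missing = []
        if self.time_down_min <= 0:
            missing.append("time_down_min")
        if not self.phenomenon_photo.strip():
            missing.append("phenomenon_photo")
        if not self.countermeasure_doc_id.strip():
            missing.append("countermeasure_doc_id")
        return missing

    def diagnosis_share(self):
        """Fraction of MTTR spent on diagnosis (0.0 if no time recorded)."""
        total = self.diagnosis_min + self.repair_min
        return self.diagnosis_min / total if total else 0.0
```

A report with an empty countermeasure link would stay open, which mirrors the rule that an RCA closes only when a PM/AM document has actually been updated.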

Final Checklist

| Parameter | Metric / Rule | Critical State |
|---|---|---|
| RCA Trigger (Time) | Downtime Duration | > 60 Minutes |
| RCA Trigger (Repeat) | Frequency | 2x in 30 Days |
| Methodology | Logic Tool | 5 Whys (Physics of Failure) |
| Forbidden Causes | Invalid Explanations | "Wear and Tear" / "Old Age" |
| Closure Criteria | Action Required | Update PM/AM Checklist |
| Design Change | Requirement | If PM is Impossible |
| Validation | Metric | Zero Recurrence for 90 Days |
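
The validation metric (zero recurrence for 90 days) could be checked along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the window is counted from the date the countermeasure was implemented, and `recurrence_dates` holds only repeats of the same failure on the same asset.

```python
from datetime import date, timedelta

VALIDATION_WINDOW = timedelta(days=90)  # from the checklist above


def rca_validated(fix_date, recurrence_dates, today):
    """True only once 90 days have passed since the fix with no repeat failure."""
    window_end = fix_date + VALIDATION_WINDOW
    # Any repeat inside the window fails the validation outright.
    if any(fix_date < d <= window_end for d in recurrence_dates):
        return False
    # No repeat so far; the result is only final once the window has elapsed.
    return today >= window_end
```

Until the window has elapsed the function returns `False` even with no recurrence, so an RCA cannot be marked validated early.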