6.1 Maintenance governance: KPIs, roles & escalation
Total Productive Maintenance (TPM) is not a glorified cleaning schedule; it is the strict discipline of Asset Utilization. In high-volume electronics, a machine sitting idle due to unplanned downtime is simply wasting capital. We must deliberately shift the operational model from “Repair when Broken” to “Monitor to Prevent.” The goal is not merely to fix machines, but to continuously stabilize our process capability (Cₚₖ) so that manufacturing yield remains a constant, highly predictable metric.
Overall equipment effectiveness (OEE)
OEE is the primary performance metric on the factory floor. It exposes the “hidden factory” of slow cycles, minor jams, and micro-stops that operators often ignore. The facility target for critical SMT assets is an OEE above 85%.
OEE = Availability × Performance × Quality
1. Availability (a)
- Definition: The ratio of actual Run Time to Planned Production Time.
- Changeover Limit: If a changeover (SMED) takes > 15 minutes, the process needs immediate engineering review. You must train the team to pre-stage all necessary feeders, carts, and stencils off-line while the machine is actively running the previous batch.
- Material Shortages: If a material shortage unexpectedly stops the line, log it accurately as a “Logistics Loss,” not a “Maintenance Loss.” Faulty categorization actively prevents us from solving the real root problem.
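The availability rules above can be sketched as a small stop-log summary. This is a minimal illustration, not an MES implementation: the stop log, the category names other than “Logistics Loss” and “Maintenance Loss”, and the 480-minute planned time are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical stop log for one shift: (loss category, minutes stopped).
# The entries and the "Changeover" category name are illustrative only.
stops = [
    ("Changeover", 22),
    ("Logistics Loss", 18),
    ("Maintenance Loss", 10),
    ("Logistics Loss", 7),
]

CHANGEOVER_LIMIT_MIN = 15  # longer changeovers go to engineering review


def summarize_stops(stops, planned_minutes):
    """Return availability, minutes lost per category, and changeovers to review."""
    lost_by_category = defaultdict(int)
    review = []
    for category, minutes in stops:
        lost_by_category[category] += minutes
        if category == "Changeover" and minutes > CHANGEOVER_LIMIT_MIN:
            review.append(minutes)
    run_time = planned_minutes - sum(lost_by_category.values())
    availability = run_time / planned_minutes
    return availability, dict(lost_by_category), review


availability, losses, review = summarize_stops(stops, planned_minutes=480)
print(f"Availability: {availability:.1%}")
print("Minutes lost by category:", losses)
print("Changeovers needing review (min):", review)
```

Keeping the category on every stop record is what makes the split between logistics and maintenance losses visible; a single undifferentiated downtime total would hide it.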
2. Performance (p)
- Definition: Net Speed achieved versus the Designed Cycle Time (the machine’s nameplate capacity).
- Speed Derating: If a machine is purposefully throttled to < 95% of its rated speed, formal engineering justification is required. Simply running a chip shooter at 80% to arbitrarily “save the nozzles” only masks the root cause of the nozzle failure (which is often vacuum leaks or filter contamination). Fix the vacuum system and restore the rated speed.
3. Quality (q)
- Definition: First Pass Yield (FPY) of good, salable units.
- Yield Drops: If FPY at the AOI station drops below 98.5%, immediately stop the line. Producing bad boards faster technically improves OEE Performance, but it degrades OEE Quality. Every defective board consumes material and machine time without adding usable output, so running on only accelerates expensive scrap.
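To make the composite concrete, here is a minimal worked sketch of the OEE product. The Performance term uses the common formulation (ideal cycle time × total units) ÷ run time, which is an assumption about how “Net Speed versus Designed Cycle Time” is measured; the shift figures are invented for illustration.

```python
def oee(planned_min, run_min, ideal_cycle_s, total_units, good_units):
    """Return (availability, performance, quality, oee) as fractions."""
    availability = run_min / planned_min
    # Assumed formulation: time the units *should* have taken / actual run time.
    performance = (ideal_cycle_s * total_units) / (run_min * 60)
    quality = good_units / total_units
    return availability, performance, quality, availability * performance * quality


# Illustrative shift: 480 planned minutes, 432 minutes actually running,
# 30 s ideal cycle per board, 820 boards built, 808 passed first time.
a, p, q, score = oee(
    planned_min=480, run_min=432, ideal_cycle_s=30, total_units=820, good_units=808
)
print(f"A={a:.1%}  P={p:.1%}  Q={q:.1%}  OEE={score:.1%}")
print("Meets 85% target:", score > 0.85)
```

With these made-up numbers each factor looks acceptable on its own (about 90%, 95%, and 98.5%), yet the product comes to roughly 84.2%, just under the 85% target. That is the practical reason to track the composite rather than any single factor.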
Pro-Tip: Do not accept a generic “Idle” as a status code. Configure the MES to force the operator to select a specific, actionable root cause code (e.g. “Waiting for Parts”, “Nozzle Jam”) before the machine is allowed to restart.
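The restart interlock described in the tip could look something like the sketch below. It is not tied to any real MES API: the `can_restart` function and the code list are hypothetical, and only “Waiting for Parts” and “Nozzle Jam” come from the text.

```python
# Hypothetical list of specific, actionable reason codes.
# Only the first two appear in the text; the rest are placeholders.
ALLOWED_REASON_CODES = {
    "Waiting for Parts",
    "Nozzle Jam",
    "Changeover",
    "Feeder Error",
}


def can_restart(reason_code):
    """Allow a restart only if the operator selected a known, specific code.

    Generic entries such as "Idle", empty strings, or a missing value are
    rejected because they are not in the allowed set.
    """
    code = (reason_code or "").strip()
    return code in ALLOWED_REASON_CODES


for attempt in ["Idle", "", None, "Nozzle Jam"]:
    print(repr(attempt), "->", "restart allowed" if can_restart(attempt) else "blocked")
```

Using an allow-list rather than a block-list means any new vague code someone invents is rejected by default.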
The pillars of TPM
Maintenance is a tiered, shared responsibility model, not just a siloed department you frantically call when things eventually break.
Pillar 1: autonomous maintenance (AM)
- Primary Owner: The Machine Operator.
- Logic: The person standing closest to the machine must detect the initial baseline drift long before it causes a hard failure.
- Shift Start Mandate: Clean all optical sensors and transport rails.
- Weekly Mandate: Visually inspect linear guides and verify bearing grease levels.
- Tagging Protocol: Apply a physical abnormality tag to any audible air leak, abnormal bearing noise, or loose component for prompt technician review.
Pillar 2: planned maintenance (PM)
- Primary Owner: The Skilled Technician.
- Logic: Systematically restore physical assets to optimal operating condition based on active usage intensity, not calendar days.
- Trigger Mechanics: PMs must be triggered by Run-Hours or Cycle Counts (e.g. 1,000 operational hours), never simply by “Months.” A machine running 24/7 accumulates run-hours roughly three times faster than one running a single 8-hour shift each day, so the same calendar interval cannot suit both.
- Parts Replacement: Preemptively replace consumable pneumatic filters and mechanical belts based on their MTBF (Mean Time Between Failures) specifications, so they are changed before they fail and halt the line mid-production.
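A usage-based trigger of this kind can be sketched as follows. The `PmSchedule` class, its field names, and the meter readings are hypothetical; the 1,000-hour interval is the example figure from the text, and the single-shift case assumes 8 hours a day, 7 days a week.

```python
from dataclasses import dataclass


@dataclass
class PmSchedule:
    """Usage-based PM schedule for one asset (all fields are illustrative)."""
    asset_id: str
    interval_hours: float    # run-hours allowed between PMs
    hours_at_last_pm: float  # run-hour meter reading when the last PM closed

    def is_due(self, current_hours: float) -> bool:
        """True once the asset has run a full interval since its last PM."""
        return current_hours - self.hours_at_last_pm >= self.interval_hours


def weeks_until_due(interval_hours: float, run_hours_per_week: float) -> float:
    """Calendar weeks needed to accumulate one PM interval of run-hours."""
    return interval_hours / run_hours_per_week


around_the_clock = weeks_until_due(1000, 24 * 7)  # 168 run-hours per week
single_shift = weeks_until_due(1000, 8 * 7)       # 56 run-hours per week
print(f"24/7 line: PM due about every {around_the_clock:.1f} weeks")
print(f"Single-shift line: PM due about every {single_shift:.1f} weeks")

schedule = PmSchedule("SMT-01", interval_hours=1000, hours_at_last_pm=5000)
print("PM due at 6,000 run-hours:", schedule.is_due(6000))
```

The same 1,000-hour interval lands about three times as often on the calendar for the 24/7 line, which is why a fixed monthly schedule would be wrong for at least one of the two machines.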
Pillar 3: focused improvement (kobetsu kaizen)
- Primary Owner: A Cross-Functional Team (Process Engineering + Maintenance).
- Logic: Aggressively eradicate chronic, repeating losses.
- RCA Trigger: Any incident of Unplanned Downtime > 60 minutes should trigger a mandatory Root Cause Analysis (RCA).
- Corrective Output: The required output is a permanent hardware change or a software interlock (Poka-Yoke) to prevent recurrence. “Retraining the operator” is rarely a valid, long-term corrective action.
Downtime escalation matrix
Escalation is primarily about rapidly unlocking resources to shorten the Mean Time To Recovery (MTTR). The local technician owns the physical repair, but management owns the removal of organizational barriers.
- Level 1: Tactical Support (15 Minutes)
  - Trigger: Machine Down > 15 Minutes.
  - Owner: Notify the Maintenance Lead.
  - Action: The Lead assesses if the floor technician immediately needs additional assistance, advanced diagnostic tools, or schematic support.
- Level 2: Resource Allocation (60 Minutes)
  - Trigger: Machine Down > 60 Minutes.
  - Owner: Notify the Operations Manager.
  - Action: The Manager formally authorizes expedited shipping for spare parts, approves emergency overtime, or triggers the critical decision to completely re-route production to alternative lines.
- Level 3: Strategic Response (4 Hours)
  - Trigger: Machine Down > 4 Hours.
  - Owner: Notify the Plant Director.
  - Action: Activate the Business Continuity Plan (BCP). The Director assumes direct responsibility for critical client communication regarding schedule and delivery impacts.
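The matrix maps directly onto a threshold lookup. The sketch below is one possible encoding, with hypothetical function names; the thresholds and roles are the ones listed above, and the RCA check reuses the 60-minute unplanned-downtime rule from Pillar 3.

```python
# (threshold in minutes, role to notify), in ascending order.
# Thresholds are strict: the level fires once downtime EXCEEDS the value.
ESCALATION_STEPS = [
    (15, "Maintenance Lead"),
    (60, "Operations Manager"),
    (240, "Plant Director"),
]

RCA_THRESHOLD_MIN = 60  # unplanned downtime beyond this requires an RCA


def escalation_for(downtime_min):
    """Return (level, roles_to_notify) for the current downtime in minutes."""
    roles = [role for threshold, role in ESCALATION_STEPS if downtime_min > threshold]
    return len(roles), roles


def rca_required(downtime_min):
    """True when the stop is long enough to require a Root Cause Analysis."""
    return downtime_min > RCA_THRESHOLD_MIN


for minutes in (10, 20, 75, 300):
    level, roles = escalation_for(minutes)
    print(f"{minutes:>3} min -> level {level}, notify {roles}, RCA: {rca_required(minutes)}")
```

Returning the cumulative list of roles, rather than only the highest one, reflects that a higher level adds people to the response without releasing the lower levels.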
Digital tracking
Manual logbooks are data graveyards. If the machine state is not recorded directly in the MES's structured database, it cannot be tracked or analyzed.
- Connectivity: All critical SMT and Reflow assets must digitally push live state codes, speeds, and error logs directly to the central server.
- Visualization: OEE composite scores should be displayed prominently on line-side Andon boards in real time.
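As a rough illustration of what a structured state record might contain, the sketch below builds one event and serializes it to JSON. The `MachineStateEvent` schema, its field names, and the asset ID are assumptions for the example; a real MES defines its own message format and transport.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class MachineStateEvent:
    """One state change pushed by an asset (hypothetical schema)."""
    asset_id: str
    state_code: str            # e.g. "RUNNING" or a specific stop reason
    speed_pct: float           # current speed as % of rated speed
    error_code: Optional[str]  # machine error code, if any
    timestamp: str             # ISO 8601, UTC


def make_event(asset_id, state_code, speed_pct, error_code=None, now=None):
    """Build an event stamped with the current UTC time unless `now` is given."""
    now = now or datetime.now(timezone.utc)
    return MachineStateEvent(asset_id, state_code, speed_pct, error_code, now.isoformat())


event = make_event("SMT-01", "Nozzle Jam", 0.0, error_code="E-VAC")
payload = json.dumps(asdict(event))
print(payload)
```

Stamping events in UTC at the source keeps downtime durations comparable across assets, and the speed field gives the server what it needs to spot derating below 95% of rated speed.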
Final Checkout: Maintenance governance: KPIs, roles & escalation
| Parameter | Metric / Rule | Target / Trigger |
|---|---|---|
| OEE Target | Composite Score | > 85% |
| Changeover (SMED) | Duration | ≤ 15 Minutes |
| Performance | Speed Derating | Needs engineering justification (< 95% of rated speed) |
| Quality | FPY Target | ≥ 98.5% |
| Escalation L1 | Notify Lead | > 15 Minutes |
| Escalation L2 | Notify Manager | > 60 Minutes |
| Escalation L3 | Notify Director | > 4 Hours |
| RCA Trigger | Downtime Duration | > 60 Minutes |
| Data Logging | Method | Auto-MES (No Paper) |