3.6 Data retention, legal hold, and audit export pack
Data has mass. Accumulating terabytes of high-frequency sensor data without a disciplined purge strategy often leads to system paralysis. Conversely, deleting compliance records prematurely creates significant legal liability. It is highly recommended to implement a Data Lifecycle Management (DLM) strategy that balances performance, cost, and legal risk.
Retention policy by data class
Not all bytes are created equal. Data must be partitioned into classes with distinct expiration dates.
Class A: compliance & genealogy (the “forever” data)
- Content: Serial Numbers, Pass/Fail Results, Operator IDs, RMA History.
- Retention: Aim for Warranty Period + 1 Year (Consumer) or up to 15 Years (Automotive/Medical).
- Action: Ensure these records are never deleted until the defined policy expires.
Class B: high-frequency telemetry (the “engineering” data)
- Content: Waveforms, Torque curves, Temperature profiles (at 100 ms resolution).
- Retention: 12 Months is a typical standard.
- Logic: If an engineering team hasn’t analyzed a temperature spike within a year, the data is likely just noise. After 12 months, design the system to down-sample to “Min/Max/Avg” summaries and safely purge the raw, high-frequency rows.
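The down-sampling step described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `downsample`, the one-hour bucket size, and the `(timestamp_seconds, value)` sample layout are all assumptions made for the example.

```python
from statistics import mean


def downsample(samples, bucket_seconds=3600):
    """Collapse (timestamp_seconds, value) samples into per-bucket summaries.

    Hypothetical helper: the bucket size and the sample layout are
    illustrative assumptions, not part of any specific historian product.
    """
    buckets = {}
    for ts, value in samples:
        # Align each sample to the start of its time bucket.
        bucket_start = int(ts // bucket_seconds) * bucket_seconds
        buckets.setdefault(bucket_start, []).append(value)

    # Keep only Min/Max/Avg (plus a sample count) per bucket.
    return [
        {
            "bucket_start": start,
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    ]


# Example: 100 ms temperature samples that fall into two one-hour buckets.
raw = [(0.0, 20.0), (0.1, 22.0), (0.2, 21.0), (3600.0, 30.0), (3600.1, 34.0)]
summary = downsample(raw)
```

Only after the summary rows are written and verified would the raw rows be eligible for purging.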
Class C: IT & security logs (the “forensic” data)
- Content: Login attempts, API calls, Firewall logs, User Access changes.
- Retention: 1 Year (a common baseline in security standards; PCI DSS, for example, requires at least 12 months of audit log history).
- Action: Rolling purge.
Class D: large binary objects (BLOBs)
- Content: AOI Images, X-Ray TIFFs, PCB Schematics.
- Retention: 6 Months (unless specifically linked to a confirmed Defect or failure analysis).
- Logic: Storing terabytes of images for purely “Good” results is an inefficient use of capital. Prioritize keeping the “Fail” images for the same duration as Class A data.
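One way to make the four classes enforceable is to encode them as a lookup table that the purge job consults. The sketch below is illustrative only: the names `RETENTION_DAYS` and `is_purge_eligible` are hypothetical, the Class A value assumes a 2-year warranty plus 1 year purely for the example, and 182 days approximates the 6-month BLOB window.

```python
from datetime import date, timedelta

# Retention per data class, in days. The Class A figure is an assumed
# example (2-year warranty + 1 year); substitute the real policy value.
RETENTION_DAYS = {
    "A": 3 * 365,  # compliance & genealogy
    "B": 365,      # raw high-frequency telemetry
    "C": 365,      # IT & security logs
    "D": 182,      # BLOBs, roughly 6 months
}


def is_purge_eligible(data_class, created, today, linked_to_defect=False):
    """Return True when a record has outlived its class retention period."""
    # BLOBs tied to a confirmed defect are kept as long as Class A data.
    if data_class == "D" and linked_to_defect:
        data_class = "A"
    return today - created > timedelta(days=RETENTION_DAYS[data_class])
```

For instance, an AOI image created a year ago is eligible for purging when it shows a “Good” result, but not when it is linked to a confirmed defect.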
Storage strategy: hot, warm, cold
Avoid keeping 10 years of detailed history in the primary production SQL database. Tables of that size almost always degrade query performance.
- Hot (Tier 1): Production DB (NVMe SSD).
  - Age: 0 – 12 Months.
  - Purpose: Instant operations, dashboards, active reporting.
- Warm (Tier 2): Data Warehouse / Read-Replica (HDD).
  - Age: 1 – 3 Years.
  - Purpose: Monthly/Yearly analytics.
- Cold (Tier 3): Archive (S3 Glacier / Tape).
  - Age: 3+ Years.
  - Purpose: “In case of lawsuit.” Low cost, high latency (hours to retrieve).
- The Purge Script: Automate the data movement process. When a record exceeds 12 months in age, the script should copy it to the warm tier, confirm the copy, and then delete it from the hot tier to preserve speed. Records older than 3 years move on to cold storage in the same way.
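The age-based tier assignment can be sketched as below, using the age bands listed for each tier above. The function names, the record layout (`id`, `created`, `tier`), and the day-count thresholds are assumptions for illustration; the actual copy and delete steps depend on the storage systems in use and are not shown.

```python
from datetime import date

HOT_MAX_DAYS = 365       # 0 - 12 months stay in the production DB
WARM_MAX_DAYS = 3 * 365  # 1 - 3 years live in the warehouse tier


def target_tier(created, today):
    """Map a record's age to the tier it should live in."""
    age_days = (today - created).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"
    if age_days <= WARM_MAX_DAYS:
        return "warm"
    return "cold"


def plan_moves(records, today):
    """Return (record_id, from_tier, to_tier) for every misplaced record.

    Planning is kept separate from execution so the plan can be reviewed,
    logged, and filtered before anything is deleted from the source tier.
    """
    moves = []
    for rec in records:
        wanted = target_tier(rec["created"], today)
        if rec["tier"] != wanted:
            moves.append((rec["id"], rec["tier"], wanted))
    return moves
```

A reasonable design choice is to delete from the source tier only after the copy in the destination tier has been verified, so that an interrupted run never loses data.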
The legal hold (the “stop shredding” button)
When litigation or a major recall begins, automated purging becomes evidence destruction. A “Kill Switch” is required.
- Trigger: The Legal Department officially notifies IT regarding a specific Lot, Product Line, or Date Range.
- Mechanism:
  - The administrator creates a “Legal_Hold” flag in the database.
  - Logic: When the “Legal_Hold” flag is active, the system should automatically bypass the automated purge script, protecting the records.
- Scope: The hold must apply to all associated data (Emails, Logs, Genealogy, Telemetry).
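A minimal sketch of how a purge job might honor the hold, assuming each hold can be scoped by Lot, Product Line, or Date Range as described in the trigger above. The `LegalHold` class, its field names, and the record keys (`lot`, `product_line`, `produced`) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class LegalHold:
    """One hold issued by Legal. Unset criteria match every record."""
    active: bool
    lot: Optional[str] = None
    product_line: Optional[str] = None
    start: Optional[date] = None
    end: Optional[date] = None

    def covers(self, record):
        """Return True if this hold protects the given record."""
        if not self.active:
            return False
        if self.lot is not None and record["lot"] != self.lot:
            return False
        if self.product_line is not None and record["product_line"] != self.product_line:
            return False
        if self.start is not None and record["produced"] < self.start:
            return False
        if self.end is not None and record["produced"] > self.end:
            return False
        return True


def purgeable(records, holds):
    """Filter purge candidates down to those not covered by any hold."""
    return [r for r in records if not any(h.covers(r) for h in holds)]
```

Note that in this sketch an active hold with no criteria set protects everything, which errs on the side of preserving evidence. The same check would need to run in every purge path (telemetry, logs, genealogy, BLOBs) for the hold to cover all associated data.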
The audit export pack
When an auditor (such as the FDA, ISO, UL, or a Customer representative) requests data, they are effectively testing your retrieval capability. They generally expect a coherent, readable story, rather than a raw SQL data dump.
The SLA: A robust operation should generate this comprehensive pack in < 30 Minutes.
The structure
The “Export Pack” is a zipped folder containing four distinct artifacts for a specific Serial Number (SN):
1. The Genealogy Report (PDF)
   - Tree view of all child components (Lots/SNs).
   - List of all equipment used (Asset IDs).
   - List of all operators involved (User IDs).
2. The Process History (CSV)
   - Parametric data for every step.
   - Columns: Step, Machine, Timestamp, Setpoint, Actual_Value, Result, Upper_Limit, Lower_Limit.
   - Why CSV? So the auditor can load it into Minitab/Excel for their own analysis.
3. The Master Data Snapshot (PDF)
   - Evidence of the configuration at that moment in time.
   - BOM Revision and Process Routing Revision active on the production date.
4. The Compliance Certificates (PDF)
   - Links to Calibration Certs for the tools used.
   - Links to Training Records for the operators involved.
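The four-artifact structure above could be assembled roughly as follows. This sketch assumes the three PDFs have already been rendered elsewhere and are passed in as bytes; the function name `build_export_pack`, the artifact keys, and the file names inside the archive are illustrative choices, not a prescribed layout.

```python
import csv
import io
import zipfile

# Column order taken from the Process History description above.
PROCESS_COLUMNS = [
    "Step", "Machine", "Timestamp", "Setpoint",
    "Actual_Value", "Result", "Upper_Limit", "Lower_Limit",
]

# Hypothetical keys for the three pre-rendered PDF artifacts.
REQUIRED_PDFS = {"genealogy_report", "master_data_snapshot", "compliance_certificates"}


def build_export_pack(serial_number, process_rows, pdf_artifacts):
    """Return the export pack for one serial number as zip bytes.

    pdf_artifacts maps an artifact key to already-rendered PDF bytes.
    """
    missing = REQUIRED_PDFS - set(pdf_artifacts)
    if missing:
        # Refuse to ship an incomplete pack to an auditor.
        raise ValueError(f"missing artifacts: {sorted(missing)}")

    # Write the process history as CSV with a fixed header row.
    csv_buf = io.StringIO()
    writer = csv.DictWriter(csv_buf, fieldnames=PROCESS_COLUMNS)
    writer.writeheader()
    writer.writerows(process_rows)

    # Bundle all four artifacts under a folder named after the SN.
    zip_buf = io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(REQUIRED_PDFS):
            zf.writestr(f"{serial_number}/{name}.pdf", pdf_artifacts[name])
        zf.writestr(f"{serial_number}/process_history.csv", csv_buf.getvalue())
    return zip_buf.getvalue()
```

Failing loudly on a missing artifact is deliberate here: an incomplete pack handed to an auditor is arguably worse than a short delay.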
Validation: the “restore” drill
Backups are theoretical until proven; restores represent reality.
- Frequency: Conduct these drills Quarterly.
- Drill: The IT team selects a random, archived Lot from several years ago.
- Objective: Successfully retrieve the data from the Cold Storage environment and generate the complete Audit Export Pack.
- Fail Condition: Consider the drill a failure if the retrieval process takes > 24 hours or if the supplied data is corrupt.
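The pass/fail rule for the drill can be expressed as a small check. The sketch below assumes a SHA-256 checksum was recorded when the data was archived, which is an assumption about the archive process rather than something stated above; the function names are hypothetical.

```python
import hashlib
from datetime import timedelta

MAX_RETRIEVAL = timedelta(hours=24)  # fail threshold from the drill rules


def sha256_hex(data: bytes) -> str:
    """Checksum used to detect corruption between archive and restore."""
    return hashlib.sha256(data).hexdigest()


def evaluate_drill(restored, expected_sha256, elapsed):
    """Return (passed, reasons) for one restore drill.

    restored:        bytes retrieved from cold storage
    expected_sha256: checksum recorded at archive time (assumed to exist)
    elapsed:         timedelta from request to completed retrieval
    """
    reasons = []
    if elapsed > MAX_RETRIEVAL:
        reasons.append("retrieval exceeded 24 hours")
    if sha256_hex(restored) != expected_sha256:
        reasons.append("checksum mismatch (data corrupt)")
    return (not reasons, reasons)
```

Recording the reasons alongside the result gives each quarterly drill an auditable outcome rather than a bare pass or fail.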
Final Checkout: Data retention, legal hold, and audit export pack
| Category | Metric / Control | Threshold / Rule |
|---|---|---|
| Class A | Compliance Retention | Keep Genealogy/Quality data for Warranty + 1 Year (Min). |
| Class B | Telemetry Purge | Purge raw high-freq sensor data after 12 Months. |
| Performance | Tiering | Move data > 1 year old out of the Production Database (Warm at 1 year, Cold at 3+ years) to protect query speed. |
| Legal | Hold Mechanism | A “Legal Hold” flag comprehensively overrides all automated purge scripts. |
| Audit | Speed | Maintain an Audit Export Pack generation time of < 30 Minutes. |
| Format | Legibility | The final Export Pack includes both human-readable (PDF) and machine-readable (CSV) files. |
| Validation | Restore Test | Verify Cold Storage retrieval capability every 90 days. |