    2.4 Deployment architecture

    Cloud-first architectures fail on the factory floor because the internet is not a real-time control network. Latency, jitter, and outages are physical realities. The operational architecture must follow the “Submarine Principle”: the factory must be able to operate autonomously, retaining all data integrity, even when cut off from the outside world.

    Do not connect high-frequency machine telemetry directly to a central database. Doing so wastes bandwidth, and the round-trip latency can disrupt real-time operations. Instead, deploy Edge Collectors (such as Industrial PCs or specialized Gateways) directly at the machine level (Level 1/2).

    Responsibilities of the Edge:

    1. Poll: Query the PLC at high frequency (every 10ms to 100ms).
    2. Normalize: Map raw addresses to named tags, e.g. register 40001 becomes Oven_Temp_Zone1.
    3. Filter: Report “Change by Exception” (Deadband), so that only meaningful changes are sent and noise is reduced.
    4. Buffer: Store data locally if the uplink fails.

    Gateway allocation:

    • Complex Machines (e.g. SMT, CNC): Allocate 1 Edge Gateway per Machine.
    • Simple Assets (e.g. Conveyors, Scales): Allocate 1 Edge Gateway per Line (aggregating data from multiple IO blocks).
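
The Normalize and Filter responsibilities listed above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the tag map, the 0.1 scale factor, the 0.5 deadband, and the names `TAG_MAP`, `DeadbandFilter`, `normalize`, and `process` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical tag map: raw register address -> (tag name, scale factor).
TAG_MAP: Dict[int, Tuple[str, float]] = {40001: ("Oven_Temp_Zone1", 0.1)}


@dataclass
class DeadbandFilter:
    """Accepts a value only if it moved more than `deadband` since the last accepted value."""
    deadband: float
    _last: Dict[str, float] = field(default_factory=dict)

    def accept(self, tag: str, value: float) -> bool:
        last = self._last.get(tag)
        if last is None or abs(value - last) > self.deadband:
            self._last[tag] = value  # remember the last value that was reported
            return True
        return False


def normalize(register: int, raw: int) -> Tuple[str, float]:
    """Translate a raw register reading into a named, scaled value."""
    name, scale = TAG_MAP[register]
    return name, raw * scale


def process(samples: List[Tuple[int, int]], flt: DeadbandFilter) -> List[Tuple[str, float]]:
    """Normalize each polled sample and keep only changes outside the deadband."""
    reported = []
    for register, raw in samples:
        tag, value = normalize(register, raw)
        if flt.accept(tag, value):
            reported.append((tag, value))
    return reported
```

With a deadband of 0.5, three polled raw values of 2000, 2003, and 2010 on register 40001 produce two reports: the first reading, and the last one, which moved by a full degree. The middle reading is suppressed as noise.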

    Genealogy relies on chronology. If Machine A thinks it is 12:00:00 and Machine B thinks it is 11:59:50, you cannot prove which process happened first. The default Windows Time service is insufficient for industrial precision.

    • Protocol: Use NTPv4 (Network Time Protocol).
    • Source: Rely on a Local Stratum-1 Server (one disciplined directly by a GPS receiver or an Atomic Clock). Do not depend on public internet pools (like pool.ntp.org) for the isolated OT network.
    • Drift Tolerance: Maintain a maximum deviation of ±500ms.
      • When the time offset exceeds 500ms, the system should flag the Data Quality as “Suspect”.
      • When the offset exceeds 2000ms, the system should trigger a Maintenance Alert. This significant drift typically indicates a hardware issue, such as a failing CMOS battery on the IPC.
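
The drift rules above reduce to a small classification function. The sketch below assumes the offset has already been measured against the local time server (how it is obtained is out of scope here). The function name `classify_offset` and the "Good" quality label are assumptions; only "Suspect" and the two thresholds come from the text.

```python
from typing import Tuple

SUSPECT_THRESHOLD_MS = 500       # beyond this, data quality is flagged "Suspect"
MAINTENANCE_THRESHOLD_MS = 2000  # beyond this, a Maintenance Alert is also raised


def classify_offset(offset_ms: float) -> Tuple[str, bool]:
    """Return (data_quality, maintenance_alert) for a measured clock offset.

    The sign of the offset is ignored: a clock that runs fast is as
    problematic as one that runs slow.
    """
    magnitude = abs(offset_ms)
    quality = "Suspect" if magnitude > SUSPECT_THRESHOLD_MS else "Good"
    maintenance_alert = magnitude > MAINTENANCE_THRESHOLD_MS
    return quality, maintenance_alert
```

An offset of exactly 500ms is still within tolerance in this sketch, because the rule is stated as "exceeds".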

    The network will fail. When it does, data must flow into a local reservoir, not onto the floor.

    • Target: Design for a minimum of 72 Hours of local retention. (This is generally sufficient to survive a weekend network outage).
    • Storage Medium: Utilize an Industrial SSD (with a High TBW rating), for example hosting a local SQLite database as the buffer store.
    • Reconnection Logic:
      1. LIFO (Last In, First Out) for Status: The dashboard requires the most current state information immediately upon reconnection.
      2. FIFO (First In, First Out) for History: The system should then backfill the historical data gaps in strict chronological order.
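
One way to picture the reconnection logic is a local SQLite buffer that is replayed in two phases. This is a hedged sketch: the `readings` table layout and the functions `open_buffer`, `store`, and `replay_order` are hypothetical, and "status" is simplified here to mean the newest buffered row for each tag.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[int, float, str, float]  # (id, ts, tag, value)


def open_buffer(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local buffer. Use a file path on the edge device."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "id INTEGER PRIMARY KEY, ts REAL NOT NULL, tag TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def store(conn: sqlite3.Connection, ts: float, tag: str, value: float) -> None:
    """Append one reading to the buffer while the uplink is down."""
    conn.execute("INSERT INTO readings (ts, tag, value) VALUES (?, ?, ?)", (ts, tag, value))
    conn.commit()


def replay_order(conn: sqlite3.Connection) -> List[Row]:
    """Return buffered rows in the order they should be sent after reconnection."""
    # Phase 1 (status): the newest row per tag, most recent first.
    status = conn.execute(
        "SELECT r.id, r.ts, r.tag, r.value FROM readings r "
        "WHERE r.ts = (SELECT MAX(i.ts) FROM readings i WHERE i.tag = r.tag) "
        "ORDER BY r.ts DESC"
    ).fetchall()
    already_sent = {row[0] for row in status}
    # Phase 2 (history): everything else, in strict chronological order.
    history = conn.execute(
        "SELECT id, ts, tag, value FROM readings ORDER BY ts ASC, id ASC"
    ).fetchall()
    return status + [row for row in history if row[0] not in already_sent]
```

In a real collector each row would be deleted only after the central system acknowledges it, so that a second outage during backfill does not lose data.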

    Data loss strategy (the “full disk” scenario)

    • When the local buffer exceeds 90% capacity, the system should trigger a Critical IT Ops Alert.
    • When the buffer reaches 100% capacity:
      • Traceability Data (Serial #s, Pass/Fail): The system should bring the line to a safe stop immediately. Compliance data is critical and must never be silently discarded.
      • Telemetry Data (Amps, Volts, Temps): The system should overwrite the oldest telemetry data first, following a Ring Buffer protocol.
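
The full-disk policy above can be expressed as a single decision made on every write. The following is an illustrative sketch only: the `Action` names, the `on_write` function, and the `"traceability"` / `"telemetry"` labels are assumptions, and the `deque` at the end merely demonstrates ring-buffer behaviour in memory.

```python
from collections import deque
from enum import Enum


class Action(Enum):
    STORE = "store"
    STORE_AND_ALERT = "store_and_alert"    # raise the Critical IT Ops Alert
    OVERWRITE_OLDEST = "overwrite_oldest"  # ring-buffer behaviour for telemetry
    STOP_LINE = "stop_line"                # never drop traceability data


def on_write(used_bytes: int, capacity_bytes: int, kind: str) -> Action:
    """Decide what to do with an incoming record given current buffer usage."""
    if used_bytes >= capacity_bytes:
        # Buffer is full: the outcome depends on the class of data.
        return Action.STOP_LINE if kind == "traceability" else Action.OVERWRITE_OLDEST
    if used_bytes / capacity_bytes > 0.90:
        return Action.STORE_AND_ALERT
    return Action.STORE


# A bounded deque drops its oldest entry when a new one arrives at capacity,
# which is the overwrite-oldest rule applied to telemetry.
telemetry_ring = deque(maxlen=3)
for sample in [1, 2, 3, 4]:
    telemetry_ring.append(sample)
```

Keeping the two data classes in separate buffers is one possible design choice that follows from this policy: telemetry can then wrap around freely without ever evicting a traceability record.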

    An Edge Collector that has silently crashed is worse than no collector at all. A “Heartbeat” mechanism is required.

    • Heartbeat: The Edge device sends a “Keep-Alive” pulse regularly (e.g. every 60 seconds).
    • Latency Check: The centralized system compares Time_Sent against Time_Received to measure uplink latency. This comparison is only meaningful when both clocks are synchronized.
    • Resource Thresholds:
      • CPU: An alert must be triggered if utilization is > 80% for more than 15 minutes.
      • RAM: An alert must be triggered if utilization is > 90%.
      • Disk: An alert must be triggered if Free Space drops below 20%.
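
The heartbeat and resource rules above might be evaluated on the central side roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the grace period of two missed heartbeats is an assumption (the text only fixes the 60-second interval), and "more than 15 minutes" is approximated as 15 consecutive one-minute CPU samples above the limit.

```python
from typing import List

HEARTBEAT_INTERVAL_S = 60
MISSED_BEATS_ALLOWED = 2  # assumption: tolerate brief hiccups before alerting


def collector_is_alive(last_seen_s: float, now_s: float) -> bool:
    """True while the last heartbeat is recent enough."""
    return (now_s - last_seen_s) <= HEARTBEAT_INTERVAL_S * MISSED_BEATS_ALLOWED


def uplink_latency_s(time_sent_s: float, time_received_s: float) -> float:
    """Latency check: only meaningful if both clocks are synchronized."""
    return time_received_s - time_sent_s


def resource_alerts(cpu_pct_samples: List[float], ram_pct: float,
                    disk_free_pct: float, sample_period_s: int = 60) -> List[str]:
    """Return the names of the resources that violate their thresholds."""
    alerts = []
    # CPU: every sample in the most recent 15-minute window must exceed 80%.
    needed = (15 * 60) // sample_period_s
    recent = cpu_pct_samples[-needed:]
    if len(recent) == needed and all(sample > 80 for sample in recent):
        alerts.append("CPU")
    if ram_pct > 90:
        alerts.append("RAM")
    if disk_free_pct < 20:
        alerts.append("Disk")
    return alerts
```

A short CPU spike does not raise an alert in this sketch, because a single sample at or below 80% inside the window resets the condition.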

    Recap: Edge Infrastructure Deployment Parameters

    | Component | Parameter | Requirement | Action on Violation |
    |---|---|---|---|
    | Edge Collector | Polling Frequency | 10ms - 100ms | |
    | Time Synchronization (NTP) | Clock Drift | ≤ ±500ms | >500ms: Flag data as “Suspect”. >2000ms: Trigger Maintenance Alert. |
    | Local Buffer | Retention Capacity | ≥ 72 hours | >90% capacity: Trigger Critical IT Ops Alert. 100% capacity: Halt line for traceability data; overwrite oldest telemetry. |
    | Edge Health | Heartbeat Interval | 60 seconds | Missing pulse: Alert for collector failure. |
    | Edge Resources | CPU / RAM / Disk | CPU ≤80% (15-min avg); RAM ≤90%; Disk Free ≥20% | Trigger alert for threshold violation. |
