
1.2 Interoperability and governance

A system architecture without governance is not truly an architecture; rather, it often devolves into a fragile topology of point-to-point connections. In a high-volume manufacturing environment, interoperability requires the discipline of defining clear boundaries and contracts. When System A writes directly into the database of System B, a boundary has been violated. When System A changes a message format and consequently crashes System B, a contract has been broken.

This chapter establishes the governing principles for how systems in your landscape (ERP, MES, SCADA) coexist. These rules serve as essential architectural constraints to ensure stability and scalability.

Do not build fragile bridges. Enforce decoupling.

  • Prohibition: Do not allow an external system to execute INSERT, UPDATE, or DELETE directly on another system’s SQL database.
  • Why: This practice bypasses crucial business logic validation (for instance, verifying if a Part Number exists before attempting to create a Work Order) and can lead to orphaned or inaccurate records.
  • Guideline: All integration should occur via an Abstraction Layer, such as an API, Message Broker, or Enterprise Service Bus.
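
As a minimal sketch of the guideline above, the snippet below puts a small service layer in front of the data store so that business rules run before anything is written. The class `MesService`, its fields, and the identifiers `PN-1001` and `WO-1` are hypothetical names chosen for illustration; a real abstraction layer would sit behind an API, broker, or service bus rather than an in-memory object.

```python
from dataclasses import dataclass, field


class ValidationError(Exception):
    """Raised when a request violates a business rule."""


@dataclass
class MesService:
    """Hypothetical service layer that owns the MES data store."""
    known_parts: set = field(default_factory=set)
    work_orders: dict = field(default_factory=dict)

    def create_work_order(self, order_id: str, part_number: str, quantity: int) -> dict:
        # Business rules run here, before anything is persisted.
        if part_number not in self.known_parts:
            raise ValidationError(f"unknown part number: {part_number}")
        if quantity <= 0:
            raise ValidationError("quantity must be a positive integer")
        if order_id in self.work_orders:
            raise ValidationError(f"duplicate work order: {order_id}")
        record = {"order_id": order_id, "part_number": part_number, "quantity": quantity}
        self.work_orders[order_id] = record
        return record


mes = MesService(known_parts={"PN-1001"})
mes.create_work_order("WO-1", "PN-1001", 50)      # accepted

try:
    mes.create_work_order("WO-2", "PN-9999", 10)  # part number does not exist
except ValidationError as exc:
    print(f"rejected: {exc}")
```

A direct SQL INSERT from the external system would have skipped the part-number check and stored the second work order as an orphaned record.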

Rule 2: the “hub” vs. “mesh” decision.

  • Constraint: Avoid direct mesh connections (System A ↔ System B, A ↔ C, B ↔ C). The number of point-to-point links grows quadratically: N(N-1)/2 links for N systems.
  • Guideline: Implement a Hub-and-Spoke or Unified Namespace (UNS) pattern. Systems should publish events to a central broker (such as an MQTT broker or Kafka) or call a central API Gateway.
  • Benefit: When replacing a core system like the ERP, you only need to update a single connector at the Hub, rather than managing numerous distinct point-to-point scripts.
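
To make the scaling argument concrete, the short sketch below compares the number of integrations to maintain under each topology. It assumes every pair of systems is connected in the mesh case and that each system needs exactly one connector in the hub case; the function names are illustrative.

```python
def mesh_links(n: int) -> int:
    """Point-to-point links when every pair of n systems is connected."""
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Connectors when each of n systems talks only to a central hub."""
    return n


for n in (3, 5, 10, 20):
    print(f"{n} systems: mesh={mesh_links(n)} links, hub={hub_links(n)} connectors")
```

With 10 systems the mesh needs 45 links against 10 hub connectors; at 20 systems the gap widens to 190 against 20.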

The Interface Control Document (ICD) serves as the formal agreement between two architectural blocks. No integration code should be written until the ICD has been reviewed and approved.

  1. Transport Protocol: (e.g. HTTPS REST, MQTT, TCP Socket).
  2. Directionality: Who initiates? (Push vs. Pull).
  3. Authentication: API Key, OAuth, or Certificate.
  4. Schema Definition: The exact payload structure (JSON/XML).
    • Strict Typing: Quantity must be defined as Integer, not String.
    • Unit of Measure: Temperature must carry an explicit unit (e.g. 240 °C), not a bare 240.
  5. Error States: How does the system signal failure? (HTTP 400 vs 500).
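
The schema rules in item 4 can be enforced at the boundary with a small validator. The following is a sketch under stated assumptions: the payload fields (`work_order`, `quantity`, `temperature`, `temperature_unit`) and the unit code `"C"` are hypothetical and would be whatever the actual ICD specifies.

```python
from dataclasses import dataclass

ALLOWED_TEMPERATURE_UNITS = {"C"}  # assumption: this interface agrees on Celsius


@dataclass(frozen=True)
class ProcessReading:
    """Hypothetical payload for one interface described by an ICD."""
    work_order: str
    quantity: int
    temperature: float
    temperature_unit: str


def parse_reading(payload: dict) -> ProcessReading:
    """Validate a decoded JSON payload against the agreed schema."""
    work_order = payload.get("work_order")
    if not isinstance(work_order, str) or not work_order:
        raise ValueError("work_order must be a non-empty string")

    quantity = payload.get("quantity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        raise ValueError(f"quantity must be an integer, got {type(quantity).__name__}")

    temperature = payload.get("temperature")
    if not isinstance(temperature, (int, float)) or isinstance(temperature, bool):
        raise ValueError("temperature must be a number")

    unit = payload.get("temperature_unit")
    if not isinstance(unit, str) or unit not in ALLOWED_TEMPERATURE_UNITS:
        raise ValueError("temperature_unit is missing or not agreed in the ICD")

    return ProcessReading(work_order, quantity, float(temperature), unit)


good = {"work_order": "WO-1", "quantity": 50, "temperature": 240, "temperature_unit": "C"}
print(parse_reading(good))

for bad in ({**good, "quantity": "50"}, {k: v for k, v in good.items() if k != "temperature_unit"}):
    try:
        parse_reading(bad)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Rejecting the string `"50"` and the unit-less `240` at the interface maps naturally onto the error states in item 5: a payload that fails validation is the sender's fault and would be reported as a client error.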

Pro-Tip: ICDs must be stored in a Git repository alongside the code. They are living documents, not PDFs buried in SharePoint.

Without the ability to uniquely identify an object, effective control becomes impossible.

Do not invent names. Use the physical hierarchy to create logical namespaces.

  • Format: Site/Area/Line/Cell/Device
  • Example: MEX01/SMT/Line04/Pick & Place02/Feeder12
  • Why: This allows you to aggregate data logically. A query for MEX01/SMT/* returns all SMT performance for the site.
  • The Problem: Vendor Serial Numbers are not unique globally. A resistor reel from Vendor A might have the same ID as a capacitor reel from Vendor B.
  • The Mandate: Generate an Internal Unique Identifier (UID) at the point of entry (Receiving).
  • Implementation: Use a UUID (e.g. 550e8400-e29b…) or a prefixed integer (UID-999999), and use this Internal UID as the Primary Key in all database relations.
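
The naming and identification rules above can be sketched in a few lines. The helper names (`asset_path`, `select`, `new_internal_uid`) and the vendor data are hypothetical; the wildcard query uses Python's standard `fnmatch` module purely as a stand-in for whatever query mechanism the historian or broker provides.

```python
import uuid
from fnmatch import fnmatchcase

LEVELS = ("site", "area", "line", "cell", "device")


def asset_path(site: str, area: str, line: str, cell: str, device: str) -> str:
    """Build a Site/Area/Line/Cell/Device path from the physical hierarchy."""
    parts = (site, area, line, cell, device)
    for level, part in zip(LEVELS, parts):
        if not part or "/" in part:
            raise ValueError(f"invalid {level}: {part!r}")
    return "/".join(parts)


def select(paths, pattern: str) -> list:
    """Return the paths matching a wildcard pattern such as 'MEX01/SMT/*'."""
    return [p for p in paths if fnmatchcase(p, pattern)]


def new_internal_uid() -> str:
    """Generate an internal UID at Receiving (random UUID, version 4)."""
    return str(uuid.uuid4())


paths = [
    asset_path("MEX01", "SMT", "Line04", "Pick & Place02", "Feeder12"),
    asset_path("MEX01", "SMT", "Line05", "Printer01", "Squeegee01"),   # made-up asset
    asset_path("MEX01", "ASSY", "Line01", "Press01", "Ram01"),         # made-up asset
]
print(select(paths, "MEX01/SMT/*"))  # the two SMT assets

# Two reels arrive with the same vendor serial number from different vendors.
inventory = {}
for vendor, serial in (("Vendor A", "12345"), ("Vendor B", "12345")):
    inventory[new_internal_uid()] = {"vendor": vendor, "vendor_serial": serial}
print(len(inventory))  # both reels are stored under distinct internal UIDs
```

The vendor serial number is kept as an attribute for traceability, but it never acts as the key.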

Distributed systems rely heavily on accurate timing. When clocks drift across different systems, the logic governing cause-and-effect can break down.

  • Master: Deploy a local Stratum 1/2 NTP Server in the OT Network.
  • Drift Tolerance: ±500ms max.
  • UTC Standardization:
    • Storage: All timestamps in databases and logs should be recorded in UTC (ISO 8601).
    • Display: Convert timestamps to Local Time only at the presentation layer (e.g. the Operator Screen).
    • Risk: Storing records in Local Time risks duplicated or lost data at Daylight Saving Time transitions: when clocks fall back, one local hour occurs twice, and when they spring forward, one local hour never occurs.
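
The following sketch illustrates the storage, display, and drift rules using only the standard `datetime` module. The function names are hypothetical, and the two fixed offsets (UTC-5 and UTC-6) are an assumed example of a site whose local clock falls back by one hour; a real system would use a proper time zone database for display.

```python
from datetime import datetime, timedelta, timezone

MAX_DRIFT = timedelta(milliseconds=500)  # tolerance stated above


def within_tolerance(local_clock: datetime, reference_clock: datetime) -> bool:
    """True if a system clock is within the drift tolerance of the NTP reference."""
    return abs(local_clock - reference_clock) <= MAX_DRIFT


def to_storage(ts: datetime) -> str:
    """Serialize a timezone-aware timestamp as UTC in ISO 8601."""
    if ts.tzinfo is None:
        raise ValueError("refusing to store a naive timestamp")
    return ts.astimezone(timezone.utc).isoformat()


def to_display(stored: str, local_zone: timezone) -> str:
    """Convert a stored UTC timestamp to local wall-clock time for an operator screen."""
    return datetime.fromisoformat(stored).astimezone(local_zone).strftime("%Y-%m-%d %H:%M:%S")


# Two events one hour apart, straddling a fall-back from UTC-5 to UTC-6.
before_change = timezone(timedelta(hours=-5))
after_change = timezone(timedelta(hours=-6))
first = datetime(2023, 11, 5, 6, 30, tzinfo=timezone.utc)
second = first + timedelta(hours=1)

stored = [to_storage(first), to_storage(second)]
print(stored)                                  # distinct and correctly ordered in UTC
print(to_display(stored[0], before_change))    # 2023-11-05 01:30:00
print(to_display(stored[1], after_change))     # 2023-11-05 01:30:00 (same wall-clock time)
```

Both events show the same local wall-clock time, so a table keyed on local time could not tell them apart or order them, while the UTC values remain unambiguous.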

Assume the network will fail. Assume the API will change.

  • Rule: Never break the contract.
  • Implementation: Expose the major version explicitly in the endpoint path.
    • “POST /api/v1/work-order” (Legacy)
    • “POST /api/v2/work-order” (New Feature)
  • Deprecation: Support for the previous version (N-1) must be maintained for a minimum of 6 months.
  • Scenario: The MES transmits a “Consumption” message to the ERP. The ERP successfully receives and processes it, but the Acknowledgement response is lost in transit. The MES, assuming a failure occurred, retries the transmission.
  • Risk: The ERP might deduct the materials twice, creating inaccurate shortages in the system.
  • Requirement: The receiving system should be Idempotent. It needs to inspect the “Message-ID”. When it recognizes that it has already successfully processed a specific message (like “Msg-101”), it should simply return a “Success” acknowledgment without executing the transaction a second time.
  • Constraint: Network partitions and momentary drops are inevitable.
  • Mandate: All Edge Gateways and MES interfaces are required to buffer messages locally (via Disk or Queue) if the upstream connection is lost.
  • Recovery: Upon connection restoration, the buffer must be flushed in strict FIFO (First-In, First-Out) order to preserve the original sequence of events.
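
The lost-acknowledgement scenario and the buffering mandate above can be simulated together. This is a minimal in-memory sketch, not a production design: `ErpReceiver`, `StoreAndForward`, and the message fields `id` and `qty` are hypothetical names, and a real gateway would persist the buffer to disk and the receiver would record processed Message-IDs in the same transaction as the stock update.

```python
from collections import deque


class ErpReceiver:
    """Hypothetical idempotent receiver that deduplicates on Message-ID."""

    def __init__(self, stock: int):
        self.stock = stock
        self.processed_ids = set()

    def consume(self, message_id: str, quantity: int) -> str:
        if message_id in self.processed_ids:
            return "Success"          # already applied: acknowledge, do not deduct again
        self.stock -= quantity
        self.processed_ids.add(message_id)
        return "Success"


class StoreAndForward:
    """Buffers messages locally and flushes them in strict FIFO order."""

    def __init__(self, send):
        self._send = send             # callable(message); raises ConnectionError on failure
        self._buffer = deque()

    def publish(self, message: dict) -> None:
        self._buffer.append(message)
        self.flush()

    def flush(self) -> bool:
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return False          # keep the message at the head and retry later
            self._buffer.popleft()    # remove only after a confirmed delivery
        return True

    def pending(self) -> int:
        return len(self._buffer)


erp = ErpReceiver(stock=100)
link = {"drop_next_ack": True}


def send(message: dict) -> None:
    """Simulated link: the ERP processes the message, but the first ack is lost."""
    erp.consume(message["id"], message["qty"])
    if link["drop_next_ack"]:
        link["drop_next_ack"] = False
        raise ConnectionError("acknowledgement lost in transit")


mes = StoreAndForward(send)
mes.publish({"id": "Msg-101", "qty": 10})  # processed by the ERP, ack lost, stays buffered
mes.publish({"id": "Msg-102", "qty": 5})   # Msg-101 is retried first, then Msg-102 is sent
print(erp.stock)                           # 85: Msg-101 was deducted only once
```

Because the sender retries the head of the queue before sending anything newer, ordering is preserved, and because the receiver checks the Message-ID, the retry is harmless.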

Final Checkout: Interoperability and governance

| Governance Pillar | Control Point | Mandatory Standard | Engineering Consequence |
|---|---|---|---|
| Contract | ICD | Signed & Versioned | Prevents “Tribal Knowledge” integrations that are unmaintainable. |
| Topology | Decoupling | No Direct SQL Access | Protects data integrity and allows independent system upgrades. |
| Time | NTP Sync | UTC + Local NTP | Guarantees accurate sequence of events for genealogy. |
| Naming | Namespace | ISA-95 Structured | Enables scalable analytics and clear asset management. |
| Resilience | Retry Logic | Idempotent Receiver | Prevents double-counting inventory during network jitters. |
| Versioning | API Lifecycle | Explicit (v1, v2) | Prevents “Big Bang” deployments; enables safe rollouts. |