OT Network Reference Architecture
Purdue-informed OT network reference: zones and conduits, industrial DMZ, unidirectional gateway options, secure remote access, and sensor placement.
Design summary
An OT network reference architecture is a zoned design, informed by the Purdue model and IEC 62443, that separates industrial control systems from enterprise IT while still supporting the data flows plants actually need. This design covers level-by-level zoning, an industrial DMZ as the only IT/OT crossing, where unidirectional gateways earn their cost, a brokered remote access path for vendors and engineers, and where to place passive monitoring sensors so you can see OT traffic without touching it. It reflects patterns I ran in a petrochemical environment and have assessed since.
Component stack
- OT firewall pair at the IT/OT boundary (IEC 62443 conduit enforcement)
- Industrial DMZ hosts: historian mirror, patch/AV staging, jump servers
- Unidirectional gateway (data diode) for high-consequence zones (optional)
- Purdue L2 cell/area zone switches with private VLANs
- Secure remote access broker (PAM/jump architecture, MFA, session recording)
- Passive OT monitoring sensors (Claroty/Nozomi/Dragos-class) on SPAN/TAP
- OT asset inventory fed from passive discovery
- Dedicated OT time source (GPS/NTP) and OT Active Directory or local auth
Purpose and requirements
This is a reference design for the network layer of an industrial site — a plant, a utility substation cluster, a manufacturing floor — where the control systems must keep running even when enterprise IT is having a bad day, and especially when enterprise IT has been compromised. I administered both the office and process-control networks at a petrochemical plant for four years; the design below is the shape I would build again, adjusted for what the tooling now makes practical.
In a plant, the network is not an IT service. It is part of the machine.
Requirements:
- The process survives IT compromise. No control-network dependency on enterprise services; ransomware in the office does not stop production.
- Business still gets its data. Historian values, OEE metrics, and batch records flow out without opening inbound paths.
- Vendors get in through one path only. Brokered, recorded, time-boxed remote access instead of the modem-and-TeamViewer archaeology most plants run.
- Visibility without touching. You can see what talks to what on the control network without installing agents on 15-year-old controllers.
Topology
    ENTERPRISE (Purdue L4/L5)
    ERP, email, corporate AD, internet
               |
    +---------------------+
    |     IT firewall     |
    +---------------------+
               |
    ==== INDUSTRIAL DMZ (L3.5) ==========================
    | historian MIRROR   patch/AV staging     remote    |
    | (replica, read)    (files inspected)    access    |
    |                                         broker    |
    =====================================================
               |
    +---------------------+    optional for high-consequence zones:
    |     OT firewall     |    [ data diode ] --> outbound-only
    +---------------------+                       historian replication
               |
    OT OPERATIONS (L3): plant historian, eng. workstations,
    OT domain/auth, backup server, [monitoring collector]
               |
    +----------------------------+
    |  Cell/area zone firewalls  |    (or L3 switch ACLs per cell)
    +------+--------------+------+
           |              |
    CELL A (L2)      CELL B (L2)      SIS / SAFETY
    HMI, SCADA       HMI, PLCs ...    (own zone, max
    servers, PLCs    drives, I/O      isolation)
    [SPAN]->sensor   [SPAN]->sensor   [TAP]->sensor

    sensors ----> monitoring collector (L3) ----> SOC (outbound only)
Two firewall layers and a DMZ between enterprise and any controller. Nothing in L4/L5 talks to anything below L3.5 directly, ever.
Component roles
Industrial DMZ (L3.5). The only crossing between IT and OT, and the design’s center of gravity. Every service that both sides need lives here as a broker: a mirrored historian that enterprise users query (never the plant historian directly), a staging server where patches and AV signatures land and get checked before OT pulls them, and the remote access broker. The rule that keeps the DMZ honest: no connection traverses it end-to-end. Sessions terminate in the DMZ and new ones originate from it.
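The "no connection traverses it end-to-end" rule can be audited against observed flow records. The following is a minimal sketch, assuming flows have already been labelled with zone names; the zone labels, the `Flow` type, and the sample services are hypothetical and not taken from any particular product.

```python
from dataclasses import dataclass

# Hypothetical zone labels for this sketch; real sites will use their own names.
IT_ZONES = {"enterprise"}
OT_ZONES = {"ot_ops", "cell_a", "cell_b", "sis"}


@dataclass(frozen=True)
class Flow:
    src_zone: str
    dst_zone: str
    service: str


def traverses_dmz(flow: Flow) -> bool:
    """True if the flow joins IT and OT directly instead of terminating in the DMZ."""
    zones = {flow.src_zone, flow.dst_zone}
    return bool(zones & IT_ZONES) and bool(zones & OT_ZONES)


def find_violations(flows):
    """Return the flows that break the 'sessions terminate in the DMZ' rule."""
    return [f for f in flows if traverses_dmz(f)]


observed = [
    Flow("enterprise", "dmz", "https"),       # user queries the historian mirror
    Flow("ot_ops", "dmz", "historian-repl"),  # plant historian feeds the mirror
    Flow("enterprise", "ot_ops", "sql"),      # direct query to the plant historian
]
violations = find_violations(observed)
```

Only the third flow is reported, because it has one endpoint in IT and the other in OT with no DMZ termination in between.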
Paired firewalls. The IT-side firewall and OT-side firewall should ideally be run by different administrative domains (and in high-consequence sites, come from different vendors), so one misconfiguration or one exploited platform does not open both doors. The OT firewall enforces conduit policy in IEC 62443 terms: named zone pairs, explicit protocols, everything else denied and logged.
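The conduit policy described here (named zone pairs, explicit protocols, everything else denied and logged) can be sketched as a lookup table with a default-deny check. This is an illustration only: the zone names, protocol labels, and the `is_allowed` helper are assumptions for the example, not a firewall vendor's rule syntax.

```python
import logging

logger = logging.getLogger("conduit")

# Hypothetical conduit matrix: (source zone, destination zone) -> allowed protocols.
CONDUITS = {
    ("ot_ops", "dmz"): {"historian-repl", "https"},
    ("ot_ops", "cell_a"): {"ethernet-ip", "opc-ua"},
    ("ot_ops", "cell_b"): {"modbus-tcp"},
}


def is_allowed(src_zone: str, dst_zone: str, protocol: str) -> bool:
    """Allow only flows named in the matrix; deny and log everything else."""
    allowed = protocol in CONDUITS.get((src_zone, dst_zone), set())
    if not allowed:
        logger.warning("DENY %s -> %s (%s)", src_zone, dst_zone, protocol)
    return allowed
```

Zone pairs are directional in this sketch, so a conduit from `ot_ops` to `cell_a` does not imply one in the reverse direction.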
Cell/area zones (L2/L1). Each production cell or functional area is its own zone behind a small firewall or strict L3 ACLs. Controllers and I/O within a cell can chatter freely — that traffic is deterministic and latency-sensitive — but cell-to-cell flows are explicit. This is what contains an incident to one line instead of one plant, the same containment logic as my SCADA network design piece but applied at plant scale.
Safety instrumented systems. SIS gets its own zone with the tightest conduit on site — engineering access only, alarm/status flows outbound only. If any zone justifies a unidirectional gateway, it is this one.
Unidirectional gateways. A data diode replicates historian or alarm data outbound with physically no return path. Use it where consequence, not probability, drives the decision. Everywhere else, a well-run firewall conduit is the pragmatic choice — diodes make every new data flow an engineering project, and that friction is a real cost.
Remote access broker. The VPN or ZTNA session lands on a broker in the DMZ, the broker enforces MFA, and the user then gets a recorded jump session (RDP/SSH proxied, not tunneled) to the target engineering workstation or asset. Vendor accounts are disabled by default and enabled per work order with an expiry. The full pattern is in the OT remote access architecture piece.
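The "disabled by default, enabled per work order with an expiry" behaviour can be sketched as a small state model. The `VendorAccount` class, its fields, and the work order number below are hypothetical; a real PAM product will have its own data model and API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class VendorAccount:
    username: str
    work_order: Optional[str] = None       # None means no active grant
    expires_at: Optional[datetime] = None

    def enable(self, work_order: str, hours: int, now: datetime) -> None:
        """Grant access for one work order, with a hard expiry."""
        self.work_order = work_order
        self.expires_at = now + timedelta(hours=hours)

    def disable(self) -> None:
        """Return the account to its default, disabled state."""
        self.work_order = None
        self.expires_at = None

    def may_connect(self, now: datetime) -> bool:
        """Usable only inside an unexpired grant; otherwise disabled."""
        return self.expires_at is not None and now < self.expires_at


start = datetime(2024, 1, 10, 8, 0, tzinfo=timezone.utc)
account = VendorAccount("vendor-drives")
account.enable("WO-1234", hours=4, now=start)
```

Passing `now` explicitly keeps the expiry logic easy to test; the broker would supply the current time when it evaluates a connection request.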
Passive monitoring sensors. Deploy where the conversations are:
- SPAN off each cell/area switch — sees controller-to-HMI and controller-to-controller traffic, which is where protocol-aware detection pays off.
- TAP or SPAN on the L3/L3.5 conduit — sees everything crossing the IT/OT boundary.
- Engineering workstation VLAN — the highest-value, highest-risk segment: it is where legitimate PLC programming and malicious PLC programming look most alike.
Sensor management interfaces live in a monitoring zone at L3; the collector pushes outbound to the SOC. Sensors never get a leg in the control VLANs.
OT-local services. DNS, NTP (GPS-backed), authentication, and backups all exist inside OT. Shared corporate AD across the boundary is the most common convergence shortcut and the most reliable way to let an IT compromise walk into the plant.
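One way to check for the shared-services shortcut is to compare the configured DNS, NTP, and authentication servers against the OT address space. This is a sketch under stated assumptions: the `10.20.0.0/16` range and the sample addresses are invented for the example.

```python
import ipaddress

# Hypothetical OT address space for this sketch.
OT_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]


def is_ot_local(address: str) -> bool:
    """True if the address falls inside one of the OT networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in OT_NETWORKS)


def external_dependencies(services: dict) -> dict:
    """Return the services whose configured server sits outside OT."""
    return {name: addr for name, addr in services.items() if not is_ot_local(addr)}


configured = {
    "dns": "10.20.1.10",
    "ntp": "10.20.1.11",
    "auth": "172.16.5.20",  # corporate AD reached across the boundary
}
offenders = external_dependencies(configured)
```

Here `offenders` contains only the `auth` entry, which is the dependency that would let an enterprise outage or compromise reach the plant.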
Security model
| Boundary | Control | Posture |
|---|---|---|
| Enterprise → DMZ | IT firewall | Explicit services, TLS, authenticated |
| DMZ → OT | OT firewall | OT-initiated pulls preferred; deny-by-default |
| Cell ↔ cell | Zone firewall/ACL | Explicit conduits, alarm on deny |
| Anything → SIS | Dedicated conduit or diode | Engineering-only, logged, MFA |
| Remote user → asset | Broker + JIT accounts | Recorded, time-boxed, per work order |
Detection leans on the passive layer: a new device on a cell VLAN, a controller write from an unexpected source, a firmware download outside a maintenance window. In OT, change itself is the alert: these networks are deterministic enough that a learned baseline stays stable, so deviations from it are rare and worth investigating.
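The three detections named above can be sketched as checks against a learned baseline. This is not how any specific monitoring product works; the `Event` type, the baseline structures, and the device names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    kind: str       # "seen", "write", or "firmware"
    src: str        # source device identifier
    dst: str        # target controller, or the VLAN name for "seen" events
    when: datetime


def alerts_for(event, known_devices, known_writers, windows):
    """Return alert strings for anything outside the learned baseline."""
    alerts = []
    if event.kind == "seen" and event.src not in known_devices.get(event.dst, set()):
        alerts.append(f"new device {event.src} on {event.dst}")
    if event.kind == "write" and (event.src, event.dst) not in known_writers:
        alerts.append(f"unexpected write {event.src} -> {event.dst}")
    if event.kind == "firmware" and not any(s <= event.when < e for s, e in windows):
        alerts.append(f"firmware download to {event.dst} outside window")
    return alerts


# Hypothetical baseline: devices per VLAN, allowed writers, maintenance windows.
known_devices = {"cell_a": {"hmi-1", "plc-1"}}
known_writers = {("eng-ws-1", "plc-1")}
windows = [(datetime(2024, 3, 2, 22, 0), datetime(2024, 3, 3, 2, 0))]

event = Event("write", "hmi-9", "plc-1", datetime(2024, 3, 1, 14, 0))
found = alerts_for(event, known_devices, known_writers, windows)
```

The baseline is deliberately static in this sketch; how it is learned and updated is left out.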
Tradeoffs
| Decision | What you gain | What it costs |
|---|---|---|
| Full DMZ with brokered services | No end-to-end IT/OT sessions | Duplicate infrastructure (mirror historian, staging) |
| Data diode on high-consequence zones | Inbound physically impossible | Per-protocol replication engineering; cost |
| Per-cell zone firewalls | Incident contained to one line | More devices; conduit matrix to maintain |
| OT-local AD/DNS/NTP | Plant runs when IT is down or hostile | Second identity stack to patch and audit |
| Passive-only monitoring | Zero risk to the process | Blind to host-level activity; no response actions |
| JIT vendor accounts via broker | Auditable, revocable access | Slower vendor turnaround; needs a process that works at 3 a.m. |
Scaling and variations
Small site / single line: collapse cell firewalls into one OT distribution switch with ACLs, but keep the DMZ and both firewall layers even if they are modest boxes. The DMZ pattern is the part that does not shrink well — resist merging it into “a VLAN on the IT firewall.”
Multi-site enterprise: replicate the site pattern; interconnect sites over a WAN that lands in each site’s L3, never cell-to-cell across sites. Centralize the SOC and the sensor analytics, but keep enforcement local so that a WAN outage degrades visibility, not control.
Cloud analytics / IIoT: treat cloud-bound telemetry as another DMZ broker — an edge gateway in L3.5 that OT pushes to and that alone talks outbound. Direct-to-cloud connections from L2 devices re-create the flat network with extra steps; my OT/IT convergence strategy article covers where to hold that line.
Operations notes
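- Asset inventory comes from the sensors. Passive discovery is the only inventory method that keeps itself current in OT. Reconcile it against the maintenance system quarterly; devices seen on the wire but missing from the records are your unknown devices, and records with no matching traffic are candidates for retired or silent assets.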
- Change windows are architecture. Conduit changes, firmware updates, and rule pushes happen in declared windows with rollback plans — the process side of the plant already thinks this way; the network should too.
- Drill the islanding scenario. Once a year, sever the IT/OT boundary deliberately and confirm the plant runs: local auth works, historians buffer, operators can do their jobs. The first time you test this must not be during a ransomware event.
- Backups live in OT and get restore-tested on spare hardware — engineering workstation images, PLC project files, historian data. PLC project files especially: they are the source code of the plant and are routinely found only on one laptop.
- Keep the conduit matrix as a living document — every zone pair, allowed protocols, and the business reason. It is simultaneously your firewall spec, your audit evidence, and your incident-response map.
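The quarterly reconciliation in the first note above reduces to a pair of set differences. A minimal sketch, assuming both sources can be exported as sets of a shared identifier; the MAC addresses below are made up.

```python
def reconcile(discovered: set, maintained: set) -> dict:
    """Compare passively discovered assets with the maintenance system records."""
    return {
        # On the wire but not in the records: the unknown devices.
        "unknown_on_network": sorted(discovered - maintained),
        # In the records but never seen: retired, powered off, or silent.
        "missing_from_network": sorted(maintained - discovered),
    }


# Hypothetical identifiers exported from the sensors and the maintenance system.
discovered = {"00:1d:9c:aa:01:01", "00:1d:9c:aa:01:02", "00:80:f4:10:20:30"}
maintained = {"00:1d:9c:aa:01:01", "00:1d:9c:aa:01:02", "00:1d:9c:aa:01:03"}

report = reconcile(discovered, maintained)
```

In practice the hard part is agreeing on the shared identifier, since maintenance records often track tag numbers rather than network addresses.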
The principle underneath all the zones and conduits is older than any of the standards that now describe it: keep the process able to run alone. Controllers on the floor will outlive two or three generations of the enterprise network above them, and every convergence decision should be made knowing that. IT architectures are rebuilt every few years. Plants are not — which is exactly why the boundary between them deserves this much engineering.
Frequently asked questions
- Is the Purdue model still relevant for modern OT?
  - As a literal network diagram, it is dated: cloud analytics and IIoT punch holes in a strict hierarchy. As a zoning discipline, it remains the best starting point: it forces you to name your zones, define the conduits between them, and defend each crossing. Use it as a policy model, not a cabling diagram.
- When is a data diode worth it over a firewall?
  - When the consequence of inbound traffic is unacceptable regardless of probability, as with safety systems, high-hazard processes, or regulatory drivers. A diode makes inbound flow physically impossible, which is a stronger claim than any rule set. The cost is that every protocol needs a replication scheme, so reserve diodes for the zones that justify the engineering.
- How should vendors remotely access OT equipment?
  - Through one brokered path: VPN or ZTNA to a DMZ access broker, MFA, then a recorded jump session to the specific asset, enabled just-in-time and disabled after the work order closes. No persistent vendor tunnels, no TeamViewer-class tools on HMIs, no exceptions that bypass the broker; the second path becomes the incident.
- Will passive monitoring sensors disturb my control network?
  - No, that is the point of passive placement. Sensors receive copies of traffic from SPAN ports or TAPs and transmit nothing into the control network. The risks to manage are SPAN oversubscription on busy switches and making sure the sensor's management interface lives in the monitoring zone, not the control zone it watches.
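The SPAN oversubscription risk mentioned in the last answer can be estimated with simple arithmetic: a SPAN session that mirrors both directions of several ports offers the sum of their receive and transmit rates to a single destination port. The sketch below uses invented traffic figures and a hypothetical 1 Gbit/s destination port.

```python
def span_utilization(mirrored_ports_mbps, span_port_mbps):
    """Fraction of SPAN destination capacity the mirrored traffic would use.

    mirrored_ports_mbps: list of (rx_mbps, tx_mbps) pairs, one per mirrored port.
    A result above 1.0 means the destination port would drop mirrored frames.
    """
    offered = sum(rx + tx for rx, tx in mirrored_ports_mbps)
    return offered / span_port_mbps


# Hypothetical peak rates for three mirrored ports on a busy cell switch.
ports = [(300, 250), (200, 150), (120, 80)]
utilization = span_utilization(ports, span_port_mbps=1000)
oversubscribed = utilization > 1.0
```

With these figures the mirrored traffic totals 1100 Mbit/s against a 1000 Mbit/s destination, so the sensor would miss frames at peak even though no single source port is saturated.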