How I Run a Security Architecture Review (With Checklist)

A working security architecture review method: data-flow-first analysis, trust boundary mapping, the questions that expose real risk, and a usable checklist.

By Pavel Glukhikh · May 18, 2026 · 7 min read

Executive summary

A security architecture review is a structured examination of a system's design — its data flows, trust boundaries, identity model, and failure behavior — performed before or during build, when changes are still cheap. Done well, it finds the flaws no scanner will ever report: the management plane that bypasses every control, the trust boundary nobody drew, the recovery path that depends on the system being recovered. This article documents the method I use after years of running these reviews: why data flows come before everything else, how to map trust boundaries, which questions separate real findings from checkbox noise, and a checklist you can run this week.

Why most design reviews find nothing

I have sat through architecture reviews that were forty minutes of slideware followed by “any security concerns?” and a silence that got recorded as approval. And I have watched systems sail through formal review gates and then fail in production for reasons that were visible in the design the whole time — because the review checked for the presence of controls instead of tracing what actually happens to the data.

The failure mode is reviewing the inventory (“is there a WAF? is data encrypted?”) instead of the structure (“who can talk to what, as whom, and what happens when this component lies?”). Scanners and pen tests will find implementation bugs later. The architecture review is the only gate that can catch design flaws — and design flaws are the expensive ones, because the fix is a re-platform instead of a patch.

Security is architecture, not a department. The review is the point in the delivery process where that principle is either enforced or quietly waived.

Data flows first, always

Every review I run starts the same way: draw the data flows before discussing a single control. Not the deployment diagram — the flows. Where does data enter, what transforms it, where does it rest, who reads it, and how does it leave?

Controls are answers. Flows are the questions.

The order matters because controls only make sense relative to flows. Encryption “at rest and in transit” is a checkbox until you see that the transform service logs full payloads to a log platform in another trust domain — at which point the sensitive data has a fourth resting place nobody encrypted, retained for 90 days with different access control. I find a variant of that finding in most reviews, and no control-inventory question would ever surface it.

Practical rules for the flow-mapping session:

Make the system’s engineers draw it live. The hesitations are data: where the marker pauses, documentation has diverged from reality.
Chase every flow to its true endpoints. “It goes to the API” is not an endpoint; the API’s database, its log pipeline, and its backup target are.
Include the flows people forget: backups, log shipping, monitoring, patching, CI/CD, break-glass access. The management flows are where the worst findings live, in my experience — production interfaces get all the design attention while the management plane grows organically.
Note the classification of what moves on each flow. A diagram with flows but no data sensitivity cannot support any risk conclusion.
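
The flow-chasing rule above can be sketched as a small graph walk. This is a minimal illustration, not a tool: the component names, the `Flow` type, and the `encrypted` set are all hypothetical, and a real review would fill them in from the diagram the engineers drew.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    classification: str  # e.g. "public", "internal", "sensitive"


def resting_places(flows, stores, entry, classification="sensitive"):
    """Return every store that data of the given classification reaches from entry."""
    graph = defaultdict(list)
    for flow in flows:
        if flow.classification == classification:
            graph[flow.source].append(flow.target)

    # Breadth-first walk so each flow is chased to its true endpoints.
    seen, queue = {entry}, deque([entry])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen & set(stores))


# Hypothetical system; every name below is illustrative only.
flows = [
    Flow("client", "api", "sensitive"),
    Flow("api", "transform", "sensitive"),
    Flow("transform", "orders_db", "sensitive"),
    Flow("transform", "log_platform", "sensitive"),  # full payloads logged
    Flow("orders_db", "backup_target", "sensitive"),
    Flow("api", "metrics", "internal"),
]
stores = {"orders_db", "log_platform", "backup_target", "metrics"}
encrypted = {"orders_db", "backup_target"}  # assumed control inventory

reached = resting_places(flows, stores, "client")
unencrypted = [store for store in reached if store not in encrypted]
print(reached)      # ['backup_target', 'log_platform', 'orders_db']
print(unencrypted)  # ['log_platform']
```

The control inventory alone would say "encrypted at rest"; the walk shows the log platform as a resting place the inventory never listed.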

Trust boundaries: the lines that decide everything

With flows on the wall, draw the trust boundaries: every line where the level of trust changes — internet to DMZ, application to database, one identity domain to another, human to system, tenant to tenant, IT to OT. Every crossing needs a defensible answer to three questions:

Authentication — how does each side know who it is talking to?
Authorization — what is the crossing allowed to do, and is that the minimum it needs?
Validation — what happens when the other side sends something malformed or malicious?
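
One way to keep those three questions honest is to record an answer per crossing and list the blanks. A minimal sketch, assuming a hypothetical `Crossing` record; the crossings and answers shown are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Crossing:
    name: str
    authentication: Optional[str] = None
    authorization: Optional[str] = None
    validation: Optional[str] = None


def unanswered(crossing):
    """Return which of the three boundary questions still lack an answer."""
    questions = ("authentication", "authorization", "validation")
    return [q for q in questions if not getattr(crossing, q)]


# Hypothetical crossings; the answers are placeholders, not recommendations.
crossings = [
    Crossing("internet -> dmz", "mTLS at the edge proxy",
             "route allow-list", "schema validation"),
    Crossing("app -> database", "shared service account password"),
]

gaps = {c.name: unanswered(c) for c in crossings if unanswered(c)}
print(gaps)  # {'app -> database': ['authorization', 'validation']}
```

An answer being present does not make it defensible; the sketch only guarantees that no crossing leaves the session without all three being discussed.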

Then the question that produces the best findings of the whole review: “what happens if this component is compromised — what does it reach?” Asked at each boundary, this is blast-radius analysis, and it reliably exposes the transitive trust nobody designed on purpose: the app server that holds credentials to three other systems, the shared service account that collapses two boundaries into one, the monitoring agent with root on both sides of a segmentation line that the segmentation design says is impermeable.
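
The blast-radius question is a reachability query over whatever each component can authenticate to or otherwise trusts. A minimal sketch follows; the `reach` map and all component names in it are hypothetical and would come from the credential and trust inventory gathered during the session.

```python
from collections import deque

# Hypothetical "can reach using held credentials or implicit trust" edges.
reach = {
    "app_server": ["orders_db", "billing_api", "message_queue"],
    "billing_api": ["payments_db"],
    "monitoring_agent": ["app_server", "mgmt_zone_hosts"],
    "mgmt_zone_hosts": ["backup_console"],
}


def blast_radius(graph, compromised):
    """Return everything transitively reachable from a compromised component."""
    seen, queue = set(), deque([compromised])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen and nxt != compromised:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(blast_radius(reach, "app_server"))
# ['billing_api', 'message_queue', 'orders_db', 'payments_db']
print(blast_radius(reach, "monitoring_agent"))
# ['app_server', 'backup_console', 'billing_api', 'message_queue',
#  'mgmt_zone_hosts', 'orders_db', 'payments_db']
```

In this invented example the monitoring agent reaches more than the app server it watches, which is the kind of transitive trust the question is meant to surface.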

For systematic coverage at each boundary I run STRIDE quietly in my head — spoofing, tampering, repudiation, information disclosure, denial of service, elevation of privilege — as the OWASP threat modeling guidance describes. The framework matters less than the discipline of asking each category at each crossing rather than free-associating threats.
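
For reviewers who prefer the grid on paper rather than in their head, the same discipline can be sketched as a worksheet with one row per crossing and category. The crossing names are hypothetical, and the row layout is an assumption rather than any standard format.

```python
from itertools import product

STRIDE = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of service",
    "Elevation of privilege",
)


def stride_worksheet(crossings):
    """Build one unanswered row for every (crossing, STRIDE category) pair."""
    return [
        {"crossing": crossing, "category": category, "answer": None}
        for crossing, category in product(crossings, STRIDE)
    ]


# Hypothetical crossings taken from a flow diagram.
worksheet = stride_worksheet(["internet -> dmz", "app -> database"])
worksheet[0]["answer"] = "Client certificates verified at the edge proxy"

open_rows = [row for row in worksheet if row["answer"] is None]
print(len(worksheet), len(open_rows))  # 12 11
```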

The questions that matter

Beyond the boundary mechanics, a handful of questions do most of the work in my reviews. These are the ones that change designs:

Where does identity come from, and what happens when it’s compromised or unavailable? Most designs assume the IdP is honest and up. The identity-first view says that assumption is exactly what the attacker will violate.
How does an administrator actually get in? Trace the real path — jump host, VPN, SaaS console, vendor remote access. The management path is an attack path with better privileges.
What are the secrets and where do they live? Every credential, key, and token: storage, rotation, and who can read it. “In the config file” is a finding.
How does this system get rebuilt? If the recovery path depends on the system being recovered (the backup console authenticating against the directory being restored, runbooks stored inside the wiki that’s down), that circular dependency is a severity-one finding, consistent with everything ransomware-resilient architecture requires.
What does this look like to operate? A control the on-call team will bypass under pressure is not a control. NIST SP 800-160’s systems security engineering view is blunt about this: trustworthiness is a property of the system in operation, not the design document.
What is intentionally out of scope, and who accepted that? The undocumented risk acceptance is the one that resurfaces in the post-incident review.
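
The rebuild question above reduces to cycle detection over "needs this to be working" dependencies. A minimal sketch using depth-first search; the `recovery` map is a hypothetical example, not a model of any particular product.

```python
def find_cycle(depends_on, start):
    """Return a dependency cycle reachable from start, or None if there is none."""
    path, on_path, done = [], set(), set()

    def visit(node):
        if node in on_path:
            # Reached a node already on the current path: that loop is the cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in depends_on.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


# Hypothetical recovery dependencies: key needs each listed item to be working.
recovery = {
    "directory": ["backup_console"],  # restored from backups
    "backup_console": ["directory"],  # operators log in via the directory
    "wiki": ["directory"],
    "runbooks": ["wiki"],
}

print(find_cycle(recovery, "runbooks"))
# ['directory', 'backup_console', 'directory']
```

Starting the search from the runbooks shows that even the documentation for the restore sits behind the circular dependency.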

Findings that matter versus checkbox noise

The report is where reviews go to die.

A 60-finding report gets triaged into a spreadsheet and forgotten; a 6-finding report gets fixed. My filter for what makes the report: does this change an attacker’s cost, or the organization’s blast radius, or the recovery time? If none of the three, it is a note, not a finding. A review that buries the flat management plane under forty TLS-version observations has increased friction without reducing any uncertainty — the definition of bad security work.

| Real finding | Checkbox noise |
|---|---|
| Backup platform authenticates against the domain it protects | TLS 1.1 enabled on an internal endpoint with no sensitive flows |
| Management VLAN flat across all tiers | Missing security header on a static marketing page |
| Service account with Domain Admin used by three apps | Password policy 12 chars where standard says 14 |
| No tested restore path for the identity tier | Log retention 11 months against a 12-month standard |

The right-hand column isn’t wrong; it’s just not architecture. Route it to the hardening backlog and keep the review report short and severity-ordered, with each finding stating the structural fix, not just the flaw.
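
The three-part filter can be sketched as a triage function. The `Observation` fields, the severity scale, and the flags assigned to each example are assumptions made for illustration; in practice the reviewer's judgment sets them.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    title: str
    changes_attacker_cost: bool = False
    changes_blast_radius: bool = False
    changes_recovery_time: bool = False
    severity: int = 3  # 1 = highest; this scale is an assumption


def is_finding(obs):
    """An observation is a finding only if it passes one of the three tests."""
    return (obs.changes_attacker_cost
            or obs.changes_blast_radius
            or obs.changes_recovery_time)


def triage(observations):
    """Split observations into severity-ordered findings and backlog notes."""
    findings = sorted((o for o in observations if is_finding(o)),
                      key=lambda o: o.severity)
    notes = [o for o in observations if not is_finding(o)]
    return findings, notes


observations = [
    Observation("TLS 1.1 enabled on an internal endpoint with no sensitive flows"),
    Observation("No tested restore path for the identity tier",
                changes_recovery_time=True, severity=2),
    Observation("Management VLAN flat across all tiers",
                changes_attacker_cost=True, changes_blast_radius=True, severity=1),
    Observation("Log retention 11 months against a 12-month standard"),
]

findings, notes = triage(observations)
print([f.title for f in findings])
print([n.title for n in notes])
```

The notes list is what goes to the hardening backlog; only the findings list belongs in the report.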

The checklist

Run this as the skeleton of a review; the sessions above give the questions their teeth.

PREPARATION
[ ] Current architecture and data flow diagrams obtained (or drawn live)
[ ] Data classifications identified for each store and flow
[ ] Compliance/regulatory drivers listed
[ ] Previous review findings and their status pulled

DATA FLOWS
[ ] Every ingress and egress path traced to true endpoints
[ ] Data at rest locations enumerated — incl. logs, caches, backups, exports
[ ] Management/operational flows mapped (patching, monitoring, CI/CD, backup)
[ ] Third-party and vendor access paths identified

TRUST BOUNDARIES
[ ] Boundaries drawn on the flow diagram; each crossing enumerated
[ ] AuthN, AuthZ, and input validation defined per crossing
[ ] Blast radius asked per component: "if this is compromised, what does it reach?"
[ ] Transitive trust and shared credentials identified

IDENTITY & ACCESS
[ ] Identity provider dependencies mapped, incl. failure and compromise cases
[ ] Privileged/admin paths traced end to end; MFA placement verified
[ ] Service accounts inventoried: privilege, rotation, interactive logon
[ ] Break-glass access defined and independent of primary identity

NETWORK & PLATFORM
[ ] Segmentation zones consistent with the enterprise model
[ ] Management plane isolated from user/data planes
[ ] Internet-exposed services justified, inventoried, patch-owned
[ ] Underlying platform (hypervisor, orchestrator, cloud account) in scope

DATA PROTECTION
[ ] Encryption in transit and at rest, with key management ownership
[ ] Secrets storage and rotation defined; no credentials in code/config
[ ] Data retention and destruction defined, incl. logs and backups

RESILIENCE & RECOVERY
[ ] Backup isolation: separate identity, immutable copy, tested restore
[ ] No circular dependencies in the recovery path
[ ] RTO/RPO stated and validated by drill, not assertion
[ ] Logging survives compromise of the logged system (off-host, retained)

OPERATIONS & CLOSURE
[ ] Alerting defined for the abuse cases found above
[ ] Operational cost of each control assessed with the operating team
[ ] Out-of-scope items and risk acceptances documented with owners
[ ] Findings severity-ordered; each has a structural recommendation
[ ] Review decisions recorded as ADRs; re-review trigger defined

Make it a habit, not an event

A review is a snapshot; architectures drift. The systems that stay defensible are the ones where review triggers are wired into the delivery process — new trust boundary, new data classification, new external dependency — and where decisions land in architecture decision records the next reviewer can read. One well-run review per significant system per year, plus triggered re-reviews, beats any amount of annual-audit theater.
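
Wiring triggers into delivery can be as small as a check on each change request. A minimal sketch, assuming a hypothetical `ChangeRequest` record whose three list fields are filled in by whoever proposes the change; how that record is populated in a real pipeline is outside this example.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    title: str
    new_trust_boundaries: list = field(default_factory=list)
    new_data_classifications: list = field(default_factory=list)
    new_external_dependencies: list = field(default_factory=list)


def review_triggers(change):
    """Return the reasons a change should trigger a re-review (empty if none)."""
    checks = (
        ("new trust boundary", change.new_trust_boundaries),
        ("new data classification", change.new_data_classifications),
        ("new external dependency", change.new_external_dependencies),
    )
    return [f"{label}: {item}" for label, items in checks for item in items]


# Hypothetical change; names are illustrative only.
change = ChangeRequest(
    "Add payment provider webhook",
    new_trust_boundaries=["provider -> webhook receiver"],
    new_external_dependencies=["payment provider API"],
)
print(review_triggers(change))
# ['new trust boundary: provider -> webhook receiver',
#  'new external dependency: payment provider API']
print(review_triggers(ChangeRequest("Rename a button")))  # []
```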

The checklist is the easy part. The actual method is the discipline of asking “what does this reach when it falls” at every boundary, every time — the same question a postmortem asks after the incident, moved to the one point in the lifecycle where the answer is still cheap to change. Technologies under review will keep changing. That question won’t.

Frequently asked questions

What is a security architecture review?

A structured evaluation of a system's design against security objectives, conducted by walking its data flows, trust boundaries, identity model, and failure modes. Unlike penetration testing, which finds implementation flaws in a built system, an architecture review finds design flaws, often before the system exists, when fixing them costs a diagram change instead of a re-platform.

How is an architecture review different from threat modeling?

They overlap heavily, and I treat threat modeling as the core engine of a review. Threat modeling systematically enumerates what can go wrong per element and boundary; STRIDE is the common framework. The architecture review wraps that in broader questions: identity and access design, operational reality, recovery paths, and whether the controls will survive contact with the team that operates them.

How long should a security architecture review take?

For a typical enterprise application or infrastructure design, plan two to four working sessions of about two hours, plus preparation and a written report: roughly a week of elapsed time. Reviews compressed into a single one-hour meeting produce checkbox findings. Reviews that run for a month produce reports nobody reads. The review is a forcing function, not a research project.

What makes a finding worth reporting?

A finding matters if it changes an attacker's cost or the organization's blast radius: an unauthenticated trust relationship, a flat management plane, a single credential controlling data and its backups. If a finding would not change what an intelligent attacker does or what the business loses, it is noise. Record it elsewhere and keep the report short enough to be acted on.