<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Luigi Iacuaniello on Medium]]></title>
        <description><![CDATA[Stories by Luigi Iacuaniello on Medium]]></description>
        <link>https://medium.com/@a3thinker?source=rss-348f3f89f3f4------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*QrKWsiU-M5f6x5Up7ymlIw.png</url>
            <title>Stories by Luigi Iacuaniello on Medium</title>
            <link>https://medium.com/@a3thinker?source=rss-348f3f89f3f4------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sun, 24 May 2026 05:47:17 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@a3thinker/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Continuous Threat Modeling for Internet-Scale Backend Platforms]]></title>
            <link>https://medium.com/@a3thinker/continuous-threat-modeling-for-internet-scale-backend-platforms-b256acfbcf6d?source=rss-348f3f89f3f4------2</link>
            <guid isPermaLink="false">https://medium.com/p/b256acfbcf6d</guid>
            <category><![CDATA[security-engineering]]></category>
            <category><![CDATA[cyber-security-awareness]]></category>
            <category><![CDATA[cybersecurity]]></category>
            <category><![CDATA[data-security]]></category>
            <category><![CDATA[software-engineering]]></category>
            <dc:creator><![CDATA[Luigi Iacuaniello]]></dc:creator>
            <pubDate>Sun, 11 Jan 2026 20:57:09 GMT</pubDate>
            <atom:updated>2026-01-11T20:57:09.448Z</atom:updated>
            <content:encoded><![CDATA[<p><em>If you work on internet-scale backends, you know that threat modeling, when it exists at all, is often just a deliverable: accurate for a few days and wrong for the rest of the lifecycle.</em></p><p><em>The thesis is simple: within an organization, threat modeling must become an engineering practice that is repeatable, connected to the backlog and to Architecture Decision Records, and that produces verifiable evidence such as tests, alerts and policies.</em></p><h3>Why internet-scale is different</h3><p>In modern systems (microservices, event-driven architectures, caching, cloud, third-party dependencies) the real topology changes faster than our ability to describe it.<br>The classic threat model fails not because it is wrong but because it is slow.<br>Services are born, evolve and die quickly; feature flags alter the expected flows; autoscaling reacts to backpressure and queue depth; and new dependencies come into play immediately.<br>Asynchrony matters in these scenarios, and it is important to ask: who authorized what? 
For how long is this operation valid?<br>The risk is no longer the single vulnerability but trust: who can perform which operations, and what happens when there is a breach or abuse at scale.<br>The dominant risks are systemic errors in trust boundaries and multi-tenancy, economic abuse/DoS, and compromise of the supply chain or control plane.</p><p>Therefore, the threat model must live where the change happens: in delivery.</p><p>Otherwise it is just literature for its own sake.</p><h3>The real rule: start modeling trust boundaries, not components</h3><p>The attack surface is almost always in the transition between services: delegated authentication, data transformations, retries, idempotency, or tenant propagation.</p><p>Each boundary is a point where assumptions change. Every change to a boundary (a new shared cache, a new provider, a new authentication mode) should trigger a new, smaller threat modeling cycle.</p><pre>External client -&gt; API Gateway / WAF<br>Gateway -&gt; service-to-service / mesh<br>Service -&gt; storage (DB, cache, object store)<br>Service -&gt; event bus / queue<br>Runtime -&gt; control plane (Kubernetes, IAM, CI/CD)<br>App -&gt; external dependencies (payment, email, KYC, LLM provider)</pre><h4>Scope</h4><p>The first mistake is wanting to model everything; another classic mistake is producing no output (alignment meetings have to end at some point, right?).<br>The scopes that actually work are:</p><ul><li>Single service: for targeted hardening or a risky refactor;</li><li>Domain: to establish invariants (authZ, multi-tenant, logging);</li><li>End-to-end flow: to uncover trust boundaries and dependencies.</li></ul><p>Define what is in scope and what is out of scope without ambiguity.</p><p>The minimum acceptable outputs are not discussions. 
They are:</p><ul><li>Backlog of controls (technical stories with acceptance criteria);</li><li>Test cases (positive and negative, including regressions);</li><li>Logging/audit and alerting requirements;</li><li>Incident response runbook with clear ownership.</li></ul><p>It is important to know what to deliver but, above all, what to protect.</p><h3>Asset inventory</h3><p>This is where asset inventory comes into play: what are we protecting?</p><p>The critical asset is no longer just the data but, above all, the individual capabilities: creating a new resource, moving money, deploying code, changing a configuration.</p><p>This immediately brings the following into play:</p><ul><li>Identities and tokens (users, services, sessions, JWT/opaque, refresh);</li><li>Secrets and keys (KMS/Vault, signing key, mTLS, webhook secret);</li><li>Sensitive data (PII, finance, metadata, logs);</li><li>High-impact business functions (payments, provisioning, admin, export);</li><li>Control plane (CI/CD, Kubernetes, IAM, registry, DNS, secrets manager).</li></ul><p>Many threat models fail on the last point: the infrastructure is not a trusted environment, so a pipeline compromise makes application-only protection irrelevant.</p><h3>Security-oriented diagrams as a design review cycle</h3><p>Diagrams are not just aesthetics: they carry semantics, and they help identify the points where the level of trust or the security context changes.</p><p>It is important to highlight:</p><ul><li>Ingress/egress;</li><li>Real authN/authZ points;</li><li>Storage and caching (especially shared);</li><li>Asynchronous channels;</li><li>Data transformations.</li></ul><p>You generally do not need many diagrams:</p><ul><li><strong>DFD</strong> for boundaries and data handling;</li><li><strong>Sequence diagrams</strong> for authZ, replay, idempotency, race conditions;</li><li>A revised <strong>C4 model</strong>, with trust boundaries, egress and privileges 
highlighted.</li></ul><h3>STRIDE is only the beginning…</h3><p><strong>STRIDE</strong> is useful, but not enough. It does not account for operations, scale, and business. How should we apply threat modeling, and which models should we use?</p><p>In my experience, it is useful to apply threat modeling per trust boundary rather than per component, and to pick at most five top threats per boundary. The goal is not to reason about every possible threat but about the priority ones that have:</p><ul><li>High impact if they happen;</li><li>A real probability of occurring, considering the system in its current state.</li></ul><p>A quick example, an end-to-end multi-tenant flow:</p><pre>Client -&gt; Gateway -&gt; Service -&gt; DB/Redis -&gt; Kafka -&gt; Billing -&gt; Payment Provider</pre><p>Applying classic STRIDE, we would have:</p><ul><li><strong>Spoofing:</strong> JWT reused cross-tenant, service-to-service identity based on <em>trusted</em> headers, forged webhooks;</li><li><strong>Tampering:</strong> Kafka events produced by unauthorized parties, tenantId altered, out-of-sequence updates;</li><li><strong>Repudiation:</strong> disputes without an end-to-end correlatable audit trail;</li><li><strong>Information Disclosure:</strong> cache not tenant-aware, enumeration, logs containing tokens or PII;</li><li><strong>Denial of Service (DoS):</strong> retry storms, Kafka backlog, expensive endpoints that can be abused;</li><li><strong>Elevation of Privilege:</strong> jump from user to admin, from app to control plane.</li></ul><p>Other models need to be integrated:</p><ul><li><strong>LINDDUN:</strong> great when dealing with PII, tracking, correlations and inference;</li><li><strong>PASTA (risk-centric process):</strong> links threats to business impact and to tests/countermeasures, useful for critical platforms and domains;</li><li><strong>Attack trees / abuse cases:</strong> particularly effective for business logic abuse and <em>how the attacker monetizes</em> scenarios;</li><li><strong>MITRE 
ATT&amp;CK:</strong> helps model intrusions, lateral movement and persistence (especially in the control plane);</li><li><strong>Kill chain / intrusion lifecycle:</strong> to reason about prevention, detection and response.</li></ul><h3>Multi-tenancy: the invariant is serious</h3><p>In B2B multi-tenant models, no operation may read or write outside the correct tenant.<br>That sentence is a testable property that can be modeled explicitly. Start with tenant resolution (from token/claims, path, mapping) and its strong binding to identity, then choose the isolation model (DB-per-tenant, schema-per-tenant, row-level, or application enforcement) together with its failure modes.<br>You must include caches and queues (where leakage is frequent) and analytics/reporting (where joins and aggregations can accidentally mix tenants).<br>A flaw here is not a bug. It is a systemic breach.</p><h3>Abuse-driven modeling</h3><p>In systems at this scale, threat modeling must include the economics of the attack.</p><p>Typical cases are brute force/credential stuffing, enumeration, scraping, business logic abuse, and DoS (saturation is expensive).</p><p>Controls are often resilience-oriented:</p><ul><li><strong>Multi-level rate limiting</strong> (IP, account, tenant, endpoint class);</li><li><strong>Quotas and budgets</strong> per tenant with hard enforcement;</li><li><strong>Aggressive timeouts</strong>, <strong>backpressure</strong> and <strong>circuit breakers</strong>;</li><li><strong>Async jobs</strong> for expensive operations;</li><li><strong>Anti-automation protections</strong> (device fingerprinting, step-up auth, proof-of-work where it makes sense).</li></ul><p>And what about the control plane and the supply chain?</p><p>Modern attacks often bypass the application entirely; a serious threat model is expected to cover:</p><ul><li><strong>CI/CD:</strong> secrets, permissions, approvals;</li><li><strong>Kubernetes:</strong> least-privilege RBAC, admission policy, network policy;</li><li><strong>Registries and 
dependencies:</strong> base images, updates, provenance.</li></ul><p>Here the <strong>MITRE ATT&amp;CK</strong> framework mentioned above becomes concrete: it describes realistic techniques and helps define detection and incident response.</p><p>Not modeling this means assuming you can prevent everything. In cybersecurity that assumption is false.</p><h3>Formalize everything, especially key decisions</h3><p>All architectural decisions that impact security must be written in a Security Architecture Decision Record.<br>Otherwise, in a context like this, the system drifts as soon as a new team “optimizes” something. Typical decisions to record:</p><ul><li><strong>JWT vs opaque token</strong> (and revocation strategy, more complex for JWTs);</li><li><strong>Tenant isolation</strong> (app-only, DB RLS, double enforcement);</li><li><strong>Cache strategy</strong> (tenant-aware keying, invalidation, TTL, stampede protection);</li><li><strong>Egress policy and allow-listing</strong>;</li><li>Retry, idempotency and event deduplication.</li></ul><p>The development team owns the threat models; the security team provides the framework and the review.</p><h3>Do not incentivize bureaucracy…</h3><p>Whenever a new boundary is introduced, a lightweight loop inside the delivery process scales well: micro-sessions of even a few minutes with a small number of people, who update the diagrams and produce immediate outputs.</p><p>Each top threat identified should produce at least:</p><ul><li>An implementable control (story);</li><li>A test/verification (automatic if possible);</li><li>Storable evidence (ADR, config, dashboard, runbook).</li></ul><p>And how do we know whether we are going in the right direction? 
With a set of <strong>Key Performance Indicators</strong> (KPIs):</p><ul><li>percentage of changes that modify trust boundaries and are accompanied by a security ADR;</li><li>number of authZ/multi-tenant tests introduced;</li><li>time between the introduction of a new dependency and its associated policy;</li><li>recurring incidents with the same root cause;</li><li>control plane drift.</li></ul><h3>Conclusion</h3><p>In this context, performance and security are the same thing: a cache or a retry can introduce new attack surfaces.<br>Today’s threat modeling is driven by trust boundaries and produces code, tests and, above all, decisions.<br><strong>Documentation files age very quickly.</strong></p>]]></content:encoded>
        </item>
    </channel>
</rss>