More Than One Way to Test Cloud Security. Theory vs Practice.
NIST SP 800-115, the federal technical guide that has structured security testing methodology since 2008, treats vulnerability scanning and penetration testing as separate disciplines: one enumerates exposure, the other validates what can be exploited from it. Different shapes of evidence, produced by different methods, answering different questions. Treating them as interchangeable is how blind spots persist.
The same principle applies, almost word-for-word, to the cloud-IAM versions of those disciplines that have emerged in the last five years. AWS Audit Manager's own CIS framework documentation says it bluntly: "The controls in these frameworks aren't intended to verify if your systems are compliant with CIS AWS Benchmark best practices. Moreover, they can't guarantee that you'll pass a CIS audit." (AWS Audit Manager docs) The tools that map your configuration to a control catalogue are not, themselves, the assessment of whether your environment is secure.
Cloud security testing decomposes into five disciplines. Three of them reason about your environment in theory: a rule engine, a control mapping, a graph of what should be possible. Two of them validate by acting: a human pentester who actually exploits the chain, an autonomous agent that actually walks the IAM graph from a starting foothold. The five categories, what they answer, and what they explicitly don't, are the subject of the rest of this post.
The five categories

The five aren't a maturity ladder. They're parallel disciplines, each answering a different question. A complete security program runs all five at appropriate cadences.
1. Cloud Security Posture Management (CSPM)
The question it answers: Where am I misconfigured against vendor and community best practices?
Gartner originated the term and defines CSPM as the discipline of continuously identifying and remediating misconfigurations and compliance risks across cloud infrastructure. The discipline anchors to two reference bodies: the Cloud Security Alliance's Cloud Controls Matrix v4, which enumerates 197 control objectives across 17 domains, and the AWS Well-Architected Framework Security Pillar, which frames the customer-side baseline through identity and access management, detection, infrastructure protection, data protection, incident response, and application security.
Execution is read-only API enumeration. The open-source Prowler project, a reference implementation of the discipline, runs over 600 AWS checks across 84 services via boto3 calls, evaluating each against declarative rules and mapping results to 44 compliance frameworks. No traffic touches workloads. No exploitation is attempted. The output is a per-resource pass/fail list with severity, framework mapping, and remediation text.
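A minimal sketch of what one declarative posture check looks like, assuming the key metadata has already been fetched with a read-only call such as IAM's ListAccessKeys. The rule id, the 365-day threshold, the `Finding` shape, and the sample data are illustrative assumptions, not Prowler's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Finding:
    resource: str
    check: str
    status: str  # "PASS" or "FAIL"


def check_access_key_age(keys, now, max_age_days=365):
    """Evaluate one rule against already-fetched access key metadata.

    `keys` is a list of dicts carrying the fields IAM's ListAccessKeys
    returns (UserName, AccessKeyId, CreateDate). No AWS call happens here.
    """
    limit = timedelta(days=max_age_days)
    findings = []
    for key in keys:
        too_old = now - key["CreateDate"] > limit
        findings.append(
            Finding(
                resource=f'{key["UserName"]}/{key["AccessKeyId"]}',
                check="iam_access_key_older_than_max_age",  # hypothetical rule id
                status="FAIL" if too_old else "PASS",
            )
        )
    return findings


# Hypothetical sample data standing in for a read-only API response.
NOW = datetime(2025, 6, 1, tzinfo=timezone.utc)
SAMPLE_KEYS = [
    {"UserName": "ci-bot", "AccessKeyId": "AKIAEXAMPLEOLD",
     "CreateDate": datetime(2023, 1, 15, tzinfo=timezone.utc)},
    {"UserName": "alice", "AccessKeyId": "AKIAEXAMPLENEW",
     "CreateDate": datetime(2025, 3, 1, tzinfo=timezone.utc)},
]

for finding in check_access_key_age(SAMPLE_KEYS, NOW):
    print(finding.status, finding.resource, finding.check)
```

The check reads configuration and applies a rule. Nothing in it asks whether the old key is reachable or what it could do if stolen.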
What it finds in real environments is enormous: per Datadog's 2025 State of Cloud Security, 59% of AWS IAM users with an access key have one older than a year, 19.4% of EC2 instances are overprivileged, 17.9% have excessive S3 access, and only 49% of EC2 instances enforce IMDSv2. CSPM surfaces this by reading config and applying rules.
What CSPM cannot tell you. Whether any of those findings are reachable, exploitable, or chained together into something an attacker would walk. Industry analysis treats attack-path analysis as a separate capability layered above CSPM inside CNAPP, specifically because posture findings alone don't prioritise by reach. Microsoft makes the architectural split visible in product: attack-path analysis is a distinct capability built on top of the posture and security-graph data in Defender for Cloud (Microsoft docs). Mitigant published a piece titled simply "CSPM scans are not cloud penetration tests". The title is the argument.
CSPM is a posture readout. It does not know whether your posture failures are the breach.
2. CIS Compliance Audit
The question it answers: Do I satisfy a specific written control framework?
The Center for Internet Security describes its benchmarks as "best practices for the secure configuration of a target system" and as "the only consensus-based, best-practice security configuration guides both developed and accepted by government, business, industry, and academia." The key word is configuration. A benchmark is a configuration document, not a security-outcome assessment.
For AWS, the relevant document is the CIS AWS Foundations Benchmark, currently at v7.0.0 on the CIS landing page. AWS Security Hub's certified implementation is at v3.0 with 37 controls. Controls split into Level 1 ("a base recommendation that can be implemented fairly promptly and is designed to not have an extensive performance impact") and Level 2 ("defense in depth ... intended for environments where security is paramount") (CIS FAQ).
The benchmark's own scope statement, quoted in AWS Audit Manager's documentation, is unambiguous: "a subset of AWS services with an emphasis on foundational, testable, and architecture agnostic settings." The services covered are IAM, AWS Config, CloudTrail, CloudWatch, SNS, S3, and default VPC. Application workloads, data, most managed services, and the IAM graph beyond per-resource settings are out of scope.
Above the benchmarks sits the broader CIS Critical Security Controls v8.1: 18 top-level controls broken into Safeguards, structured into Implementation Groups: IG1 (essential cyber hygiene), IG2 (built on IG1), IG3 (the full safeguard set). The controls map to NIST CSF 2.0, HIPAA, ISO/IEC 27001, PCI DSS, SOC 2, and MITRE ATT&CK (CIS mappings).
Execution at the benchmark level is largely automatable. AWS Audit Manager's prebuilt CIS v1.3.0 frameworks ship 32 automated and 5 manual controls for Level 1, and 49 automated and 6 manual for Level 1+2 (AWS docs). Prowler covers CIS as one of its 70+ compliance frameworks alongside 1,700+ security checks. The output is a per-control pass/fail report with an aggregate compliance percentage and framework cross-mappings.
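As a rough illustration of how the aggregate percentage is derived, the sketch below rolls per-control results up into a summary. The control ids and the decision to exclude manual controls from the denominator are assumptions made for this example, not the behaviour of Audit Manager or Prowler.

```python
def compliance_summary(results):
    """Aggregate per-control results into a headline percentage.

    `results` maps a control id to "PASS", "FAIL", or "MANUAL". Manual
    controls are left out of the denominator because no automated
    evidence exists for them (an assumption of this sketch).
    """
    automated = {c: s for c, s in results.items() if s != "MANUAL"}
    passed = sum(1 for s in automated.values() if s == "PASS")
    percent = 100.0 * passed / len(automated) if automated else 0.0
    return {
        "passed": passed,
        "failed": len(automated) - passed,
        "manual": len(results) - len(automated),
        "percent": round(percent, 1),
    }


# Hypothetical control ids and outcomes.
SAMPLE_RESULTS = {
    "1.4": "PASS",
    "1.5": "FAIL",
    "2.1.1": "PASS",
    "3.1": "PASS",
    "1.1": "MANUAL",
}

summary = compliance_summary(SAMPLE_RESULTS)
print(summary)
```

The number describes only the controls in the catalogue. Anything the benchmark does not cover never enters the calculation.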
What a CIS audit cannot tell you. Whether your environment is secure. The standards themselves say so. The AWS Audit Manager disclaimer is the most direct: "The controls in these frameworks aren't intended to verify if your systems are compliant with CIS AWS Benchmark best practices. Moreover, they can't guarantee that you'll pass a CIS audit. AWS Audit Manager doesn't automatically check procedural controls that require manual evidence collection."
The scope gap is large. Wiz attributes roughly 80% of 2025 cloud-intrusion initial access to vulnerabilities, exposed secrets, and misconfigurations, categories that overlap only thinly with the CIS AWS Foundations services. Wiz's Cloud Data Security Report 2025 found that 72% of cloud environments have publicly exposed PaaS databases that lack sufficient access controls, an exposure class entirely outside the Foundations Benchmark scope. A 100% CIS pass is meaningful as hygiene. It is not a security assessment, and CIS, AWS, and the Audit Manager documentation all say so.
3. Penetration Test
The question it answers: Can a skilled adversary achieve specific objectives in a bounded engagement?
NIST SP 800-115 §5.2 defines pentesting as testing in which assessors "mimic real-world attacks to identify methods for circumventing the security features of an application, system, or network" (NIST SP 800-115). The Penetration Testing Execution Standard (PTES) frames the engagement as a structured process covering "everything related to a penetration test, from the initial communication ... through intelligence gathering and threat modeling ... through vulnerability research, exploitation and post-exploitation, and finally to the reporting."
PTES specifies seven phases: Pre-Engagement Interactions, Intelligence Gathering, Threat Modeling, Vulnerability Analysis, Exploitation, Post-Exploitation, Reporting. NIST SP 800-115 §5.2.1 condenses the same work into four: planning, discovery, attack, reporting. OSSTMM adds a parallel structure across five channels and four modules. The point of all three: the engagement is bounded by a written Statement of Work and Rules of Engagement that fix target list, time windows, permitted techniques, and escalation contacts.
In AWS, the rules of engagement are also set by AWS itself. The AWS Customer Support Policy for Penetration Testing permits testing without prior approval against customer-owned EC2, NAT Gateways, ELB, WAF, RDS, Aurora, CloudFront, API Gateway, AppSync, Lambda, Lambda@Edge, Lightsail, Elastic Beanstalk, ECS, Fargate, OpenSearch, FSx, Transit Gateway, and Bedrock AgentCore. Prohibited outright: DoS, DDoS, simulated DoS or DDoS, port flooding, protocol flooding, request flooding, S3 bucket takeover, subdomain takeover, and DNS zone walking, hijacking, or pharming via Route 53. Red, blue, or purple team exercises involving C2, simulated phishing, malware testing, or DDoS simulation require submission via the Simulated Events form.
The output is the PTES-prescribed pair: a non-technical Executive Summary plus a Technical Report with reproducible exploit chains, evidence, CVSS or impact-rated severity, and remediation guidance. Typical engagements run in the five-figure range and take weeks of tester time rather than days.
What a pentest cannot tell you. What lies outside the sample it tested. NIST SP 800-115 §5.2 says it explicitly: "Penetration testing can be invaluable, but it is labor-intensive and requires great expertise to minimize the risk to targeted systems." That expertise has to be spent somewhere, which means a pentest samples your environment along the threat models defined in PTES Phase 3. It does not survey it.
Lizzie Moratti's ramp-up guide to AWS pentesting puts the practitioner-side gap directly: "A cloud pentest is usually not a requirement for a company to meet compliance goals. It gets treated as a luxury service." She also names the substitution it produces: "Shady pentest vendors will scan-and-rebrand while mislabeling their work as a pentest." The pentest discipline has a defined methodology. Buying a pentest does not guarantee you receive one.
Scott Piper's AWS Security Maturity Roadmap is the canonical reference for buyers and practitioners thinking about where pentesting sits in the broader testing portfolio.
4. Attack Path Analysis
The question it answers: Given the policies and trust relationships in my environment, what attack paths are theoretically possible?
This is the youngest discipline of the five. Gartner places it inside Continuous Threat Exposure Management (CTEM), the exposure-management category that has formed above CSPM in the last few years. There is no NIST SP, no CIS Benchmark, and no CSA framework that codifies attack path analysis the way they codify the older disciplines. The vendor literature carries the definitions.
The questions the discipline asks are the ones familiar from incident response: where could an attacker move from a compromised principal, and what would that movement reach. The term "blast radius" itself was borrowed from AWS's own operational vocabulary, where reducing the scope of a failed change has long been a design principle.
The execution model is what makes the discipline distinct. The tools read your IAM and resource policies and compute reachability statically. They do not call AWS as the principals they analyse. The output is the graph of paths a principal could take given the policies they hold.
The reference open-source tooling:
- PMapper / Principal Mapper (NCC Group). Builds an IAM authorisation graph and queries reachability between principals using policy reasoning. No state-changing API calls.
- Cloudsplaining (Salesforce). Static IAM policy analysis that flags least-privilege violations and data-exfiltration risks in a risk-prioritised HTML report.
- IAMSpy (WithSecure). An SMT solver that answers "can this principal perform this action?" over real IAM policies. Pure reasoning, no execution.
- CloudFox (Bishop Fox). Sits at the boundary. Calls real AWS APIs to enumerate the environment from a foothold, illuminating attack paths without walking them.
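The graph-query half of the discipline can be sketched in a few lines. This example assumes the hard part, deriving "can reach" edges from IAM and resource policies, has already been done. The edge list and principal names are hypothetical, and the breadth-first search is a generic technique rather than how any of the tools above is implemented.

```python
from collections import deque

# Hypothetical "can reach" edges that a static tool might derive from policies.
EDGES = {
    "user/dev": ["role/ci-deploy"],
    "role/ci-deploy": ["role/admin", "secret/db-password"],
    "role/readonly": ["bucket/logs"],
    "role/admin": [],
}


def reachable_paths(edges, start):
    """Breadth-first search returning the shortest path to each reachable node."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in paths:
                paths[nxt] = paths[node] + [nxt]
                queue.append(nxt)
    return paths


for path in reachable_paths(EDGES, "user/dev").values():
    print(" -> ".join(path))
```

Every path printed here is a claim about what the policies permit. No call was made as any of these principals.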
What attack path analysis cannot tell you. Whether the paths it computes actually work. Static policy reasoning is necessarily a model of how IAM behaves. Real evaluation involves resource policies, service control policies, session policies, permission boundaries, condition keys evaluated against runtime context, KMS grants, cross-account trust, and the order in which AWS itself decides authorisation. Any modelling tool that perfectly captures this is, in effect, a re-implementation of AWS's authorisation engine. Most simplify, which means false positives where a theoretical path is blocked by something the model didn't represent, and false negatives where the model thinks a path is blocked when it isn't.
This is the same boundary NIST SP 800-115 draws between vulnerability assessment and pentest. The static tool tells you what's possible. The validated tool tells you what works.
5. Executed Blast Radius Assessment
The question it answers: From this starting foothold, what does an attacker actually reach when they act?
This is the discipline that pairs with attack path analysis the way pentesting pairs with vulnerability assessment. The IAM graph isn't reasoned over in theory. It's walked. The tool calls sts:AssumeRole to validate trust edges. It executes privilege escalation chains. It follows credential chains where a secretsmanager:GetSecretValue unlocks a database password that becomes the next foothold. Resource-based policies are probed with deny-all sessions to test what's actually allowed. The output is the graph of paths the attacker really walked, not the paths the policies seemed to allow.
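The difference between reasoning and walking can be sketched as a traversal that keeps an edge only when an attempt confirms it. In this example the `attempt` callable is injected so the code runs without AWS access. In a live assessment it would wrap a real call such as sts:AssumeRole. The candidate edges, the denied edge, and all names are hypothetical, and this is not a description of how any specific tool is built.

```python
def walk(edges, start, attempt):
    """Traverse candidate edges, keeping only those `attempt` confirms.

    `attempt(source, target)` returns True when the action really succeeds.
    Targets behind a refuted edge are never tried from that edge.
    """
    confirmed, refuted = [], []
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for target in edges.get(node, []):
            if attempt(node, target):
                confirmed.append((node, target))
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
            else:
                refuted.append((node, target))
    return confirmed, refuted


# Hypothetical candidate edges produced by static analysis.
CANDIDATES = {
    "user/dev": ["role/ci-deploy", "role/audit"],
    "role/ci-deploy": ["role/admin"],
    "role/audit": ["bucket/logs"],
}

# Hypothetical ground truth: a control the model missed denies this edge.
DENIED_AT_RUNTIME = {("user/dev", "role/audit")}


def fake_attempt(source, target):
    """Stand-in for a real API call made as `source` against `target`."""
    return (source, target) not in DENIED_AT_RUNTIME


confirmed, refuted = walk(CANDIDATES, "user/dev", fake_attempt)
print("walked:", confirmed)
print("blocked:", refuted)
```

The static graph listed four edges. The walk confirms two, refutes one, and never reaches the fourth because its source was not obtainable.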
The reference catalogue for what to attempt began with Spencer Gietzen's 21 IAM privilege escalation methods at Rhino Security Labs in 2018, and was extended in a follow-up and in the community-maintained AWS-IAM-Privilege-Escalation repo. Bishop Fox's IAM Vulnerable lab models 31 paths. hackaws.cloud's own catalogue tracks 34, including techniques contributed by Plerion's research team. MITRE ATT&CK for IaaS covers adjacent ground at the tactic level: T1098.003 (Account Manipulation: Additional Cloud Roles), T1078.004 (Valid Accounts: Cloud Accounts), T1552.005 (IMDS credential theft), T1550.001 (Application Access Token use).
The tools that execute rather than reason:
- Pacu (Rhino Security Labs). AWS exploitation framework. The `iam__privesc_scan` module auto-detects exploitable Rhino-style chains and executes them.
- hackaws.cloud. Autonomous agent that starts from a foothold identity, traverses the IAM and trust graph, executes privilege escalation and lateral movement with cleanup commands attached to every change, and produces a live attack graph of the actually-walked paths. The LexisNexis breach analysis and the foothold selection guide describe the methodology in detail.
What executed blast radius cannot tell you. How the foothold was obtained. Whether your application has SQL injection. Whether your S3 buckets are public to the internet (CSPM territory). Whether your control catalogue is satisfied (CIS territory). The assessment starts after the foothold and stops at the IAM and resource boundary. Initial access and application vulnerabilities are someone else's discipline, by design.
The comparison
| Discipline | Question | Methodology anchor | Mode | Output | Cadence | Explicitly out of scope |
|---|---|---|---|---|---|---|
| CSPM | Where am I misconfigured? | CSA CCM, AWS Well-Architected | Static rule check | Misconfig list with compliance map | Continuous | Reachability, exploitability, attack chains |
| CIS audit | Do I satisfy a control catalogue? | CIS AWS Foundations Benchmark, CIS Controls v8.1 | Static control mapping | Pass/fail per control + percentage | Quarterly or audit-cycle | Application, workload, data, most managed services |
| Pentest | Can an adversary achieve objectives in this window? | PTES, NIST SP 800-115 | Validated sample | Exploit chains with severity | Annual + per major release | Survey-level coverage, anything outside Rules of Engagement |
| Attack path analysis | What paths are theoretically possible? | Gartner CTEM | Static graph reasoning | Theoretical paths from a principal | Continuous | Whether the paths actually work |
| Executed blast radius | What does an attacker actually reach? | Rhino + Bishop Fox privesc catalogues, MITRE ATT&CK for IaaS | Validated traversal | Walked paths from a foothold | Continuous or per-foothold-class | Initial access, application vulnerabilities |
Theory vs practice
The post's title is the post's argument. Three of the five disciplines reason about your environment in theory. Two of them validate by acting. The pairing structure is the same one NIST has used to separate vulnerability assessment from penetration testing since 2008:
| Theory (reasoning) | Practice (validation) |
|---|---|
| Vulnerability scanning | Penetration test |
| Attack path analysis | Executed blast radius |
| CSPM | No validated pair: covered by pentest |
| CIS compliance | No validated pair: out of scope |
A vulnerability scan tells you the open ports and exposed CVEs. A pentest tells you which ones an attacker actually exploits, in what order, to reach what. Cloud-IAM testing decomposed the same way over the last five years. Attack path analysis tells you what paths your IAM graph theoretically allows. Executed blast radius assessment tells you which ones a foothold actually walks, with the cleanup commands attached.
The theoretical disciplines are cheaper, faster, and continuous. The validated disciplines are more expensive, slower, and bounded by either tester attention (pentest) or starting-foothold scope (executed blast radius). Both halves are necessary. Static reasoning gives you the inventory of what to verify. Validation tells you which inventory items are real.
The substitutions that keep failing
Five categories, twenty pairings, and the same handful of substitutions keep showing up in security programs:
"Our pentest is our compliance evidence." The pentest is sampled and time-bounded. It does not survey the environment, and NIST SP 800-115 says so explicitly. Compliance frameworks expect coverage. The pentest report fills the attacker realism gap, not the coverage gap. They are different evidence.
"Our CIS audit is our security assessment." The Audit Manager docs (already quoted) say it best. The benchmark is configuration hygiene for a subset of services. Wiz's data on the breach causes that sit outside Foundations Benchmark scope is the gap, made concrete.
"Our CSPM scan is our pentest." This is the scan-and-rebrand pattern Moratti named. CSPM enumerates posture. It doesn't validate exploitation. Calling the output a pentest report changes the cover sheet, not the discipline. Mitigant's article is the boundary statement.
"Our attack path analysis is our blast radius assessment." This is the new substitution, made possible by tooling that produces attack graphs without ever calling AWS as a principal. APA tells you what should be possible given the policies. It cannot tell you what AWS's authorisation engine actually permits under condition keys, session policies, SCPs, KMS grants, and cross-account trust evaluated at runtime. Two principals with apparently identical policies may have very different blast radii once everything resolves. The validated version is what closes that gap.
"Our pentest is our blast radius assessment." Pentests sample. They typically prove a representative chain from initial access to impact, not every reachable resource from every plausible foothold. The blast radius from each compute identity, CI deploy role, vendor-trusted role, and developer SSO permission set is a different graph. A pentest engagement does not have the budget to enumerate all of them. The continuous, foothold-scoped version is what closes that gap.
A complete program
The five disciplines are complementary, and a serious program runs all five:
- CSPM continuous. Read the posture against vendor and community baselines. Surface the configuration drift. Fix the bulk findings on a sprint cadence.
- CIS compliance audit quarterly or annually. Map current state to the control framework auditors and customers ask about. Track the percentage as a hygiene KPI, not a security KPI.
- Penetration test annually plus per major change. Validate realistic attacker objectives. Use the report for severity calibration and threat-model truth-checking. Buy from practitioners, not scan-and-rebrand vendors.
- Attack path analysis continuously. Use the static IAM graph tools to find the policy-level structural risks (over-permissive trust policies, wildcards, `iam:PassRole` next to compute that shouldn't have it) before they show up in production.
- Executed blast radius assessment continuously, per foothold class. Pick the realistic footholds described in the foothold selection guide: the most-exposed internet-facing service role, the CI deploy role, the typical developer SSO permission set, the vendor-trusted role. Run executed assessment from each. Compare the graphs.
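Comparing the graphs from several footholds can be as simple as set arithmetic over what each one reached. The sketch below assumes each assessment has already been reduced to a set of reached resources. The foothold names and resources are hypothetical sample data.

```python
# Hypothetical results: the resources each foothold class actually reached.
REACHED = {
    "internet-facing service role": {"bucket/uploads", "secret/api-key"},
    "ci deploy role": {"bucket/uploads", "role/admin", "secret/db-password"},
    "developer sso permission set": {"bucket/uploads"},
}


def compare_footholds(reached):
    """Split reach into what every foothold shares and what is unique to each."""
    common = set.intersection(*reached.values())
    unique = {}
    for name, resources in reached.items():
        others = set().union(*(r for n, r in reached.items() if n != name))
        unique[name] = resources - others
    return common, unique


common, unique = compare_footholds(REACHED)
print("reached from every foothold:", sorted(common))
for name, resources in unique.items():
    print(f"only from {name}:", sorted(resources))
```

Resources unique to one foothold show where that identity's compromise changes the outcome, which is one reasonable way to rank which trust edges to tighten first.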
The cost shape favours this layered approach. CSPM and APA are inexpensive and continuous. CIS audit cost is dominated by the auditor cycle. Pentest cost is dominated by tester time. Executed blast radius cost is dominated by AWS API calls and assessment compute. The expensive disciplines validate what the cheap disciplines surface.
Key takeaways
Five disciplines, five questions. CSPM, CIS audit, pentest, attack path analysis, executed blast radius assessment. Each answers a different question, and the standards that define each one say so explicitly.
Three reason in theory, two validate by acting. CSPM, CIS audit, and attack path analysis read configuration and policies and apply rules or graph reasoning. Pentest and executed blast radius assessment call the APIs, walk the paths, and validate by acting.
The pairing structure is older than the cloud. Vulnerability scanning is to pentest what attack path analysis is to executed blast radius assessment. The cloud-IAM disciplines didn't invent this split; they inherited it.
Substitutions persist because each discipline produces something the others don't. Buyers want one document. The standards themselves keep being clear that one document is not the assessment. NIST SP 800-115 and the AWS Audit Manager disclaimer are the two most quotable confirmations.
A complete program runs all five. CSPM and APA continuous, CIS audit on the audit cycle, pentest annually plus on major change, executed blast radius continuous per realistic foothold class. The cheap disciplines surface candidates. The expensive disciplines validate them.
hackaws.cloud is the executed blast radius half of the pair. It starts from the foothold, walks the IAM graph by actually calling the APIs, and produces the attack graph of what was actually reached.
Ready to find out what an attacker actually reaches from your real footholds? Sign up for hackaws.cloud and run executed blast radius assessment against the identities a real attacker would land on.