Security Audits That Find Real Bugs: A CVE-Grounded Approach
Most security audits produce shelfware. This post shows how to run audits grounded in real vulnerability patterns — with specific CVEs, attack chains, and a triage model that produces fixes instead of reports.
Key Takeaways
- Audit findings should reference specific vulnerability classes (CWE IDs) and known exploitation patterns, not vague risk language.
- JWT algorithm-handling mistakes produced CVEs across five language ecosystems between 2021 and 2024: C, Ruby, Python, Go, and PHP.
- The difference between a useful audit and shelfware is whether findings are specific enough to become a code change.
The shelfware problem
Most security audit reports share a structure: executive summary, risk matrix, 40-page findings list, remediation recommendations. They get delivered as PDFs. They get filed. The findings never get fixed.
The reason is specificity. A finding that says 'authentication implementation has weaknesses' gives a developer nothing to act on. A finding that says 'this file accepts JWTs with algorithm=none because the decode call doesn't restrict allowed algorithms, the same missing-allowlist mistake behind algorithm confusion bugs such as CVE-2024-54150' gives them a one-line fix and a reason to care.
The difference between a useful audit and theater is whether the output is specific enough to become a pull request.
JWT vulnerabilities: one class, five ecosystems, same mistake
JWT algorithm confusion is instructive because the same logical error, failing to restrict which signing algorithms are accepted, produced CVEs in five language ecosystems between 2021 and 2024:
- CVE-2024-54150 (cjwt, C): Algorithm confusion allows HMAC verification bypass when asymmetric keys are expected.
- CVE-2023-51774 (JSON::JWT, Ruby): Sign/encryption confusion allows identity bypass.
- CVE-2024-37568 (Authlib, Python): HMAC verification with any public key due to algorithm confusion.
- CVE-2024-51744 (golang-jwt, Go): Unclear error handling allows invalid tokens to be accepted.
- CVE-2021-46743 (Firebase PHP-JWT): Algorithm confusion when multiple key types are loaded.
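Four of these are the same conceptual bug: the code doesn't enforce which algorithm, or which operation, it expects. The golang-jwt issue is a close neighbor rather than a twin: the verification result is there, but it is easy for the caller to misread. On the caller side the defense is the same throughout: pass an explicit allowlist to the verification function and reject the token on any verification error. Yet this pattern keeps appearing because developers (and AI tools) copy the 'simple' decode example from documentation without the algorithm restriction.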
What a grounded audit actually checks
Instead of generic categories, a useful audit maps your codebase against known exploitation patterns:
- Authentication: JWT algorithm restriction, signature verification, token expiry enforcement, session invalidation on password change
- Authorization: middleware coverage on admin/internal routes, IDOR via predictable resource IDs, privilege escalation through role manipulation
- Input handling: SQL injection (even with ORMs — raw queries exist), path traversal in file operations, SSRF in URL-accepting endpoints, deserialization of untrusted data
- Secrets: hardcoded credentials, API keys in client bundles, signing keys in version control, environment variable leakage in error responses
- Dependencies: runtime-reachable CVEs (not just advisory counts), transitive exposure through deep dependency chains
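One item on this checklist, JWT algorithm restriction, can be turned into a mechanical check. The sketch below assumes a Python codebase that calls `decode` on a module imported as `jwt` and passes the allowlist as an `algorithms=` keyword, as PyJWT-style code typically does. The function name `find_unrestricted_decodes` is invented here. It is a heuristic: it will miss aliased imports, and it will flag calls that pass the allowlist positionally or through `**kwargs`.

```python
import ast


def find_unrestricted_decodes(source: str, filename: str = "<audit>") -> list:
    """Return line numbers of `jwt.decode(...)` calls with no `algorithms=` keyword."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_jwt_decode = (
            isinstance(func, ast.Attribute)
            and func.attr == "decode"
            and isinstance(func.value, ast.Name)
            and func.value.id == "jwt"  # assumption: the module is imported as `jwt`
        )
        has_allowlist = any(kw.arg == "algorithms" for kw in node.keywords)
        if is_jwt_decode and not has_allowlist:
            hits.append(node.lineno)
    return sorted(hits)
```

Each hit is a file and line number, which is already specific enough to become a finding with a one-line fix attached.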
Triage by exploitability, not just severity
CVSS scores are a starting point, not a priority queue. A CVSS 9.8 in a dev-only dependency that never runs in production is less urgent than a CVSS 7.2 in your authentication middleware.
Useful triage asks: Is the vulnerable code reachable from an external request? Does exploitation require authentication? What's the blast radius — one user's data, or the entire database? Is there a public exploit or proof-of-concept?
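These questions can be encoded as a sort key. The sketch below is one possible ordering, not a standard: the field names, the `blast_radius` labels, and the decision to rank the answers lexicographically with CVSS as the final tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    cvss: float                 # base score, 0.0 to 10.0
    reachable_externally: bool  # can an external request hit the vulnerable code?
    requires_auth: bool         # does exploitation need a valid session?
    blast_radius: str           # hypothetical labels: "single_user" or "all_data"
    public_exploit: bool        # is a public exploit or proof-of-concept available?


def triage_key(finding: Finding) -> tuple:
    """Exploitability answers first; CVSS only breaks ties between equals."""
    return (
        finding.reachable_externally,
        not finding.requires_auth,
        finding.public_exploit,
        finding.blast_radius == "all_data",
        finding.cvss,
    )


def prioritize(findings):
    """Most urgent finding first."""
    return sorted(findings, key=triage_key, reverse=True)
```

Under this ordering, a CVSS 7.2 issue in externally reachable authentication middleware sorts ahead of a CVSS 9.8 issue in a dependency that never runs in production, because reachability is compared before the score is.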
This is where automated audits earn their keep. They can trace call paths from HTTP handlers to vulnerable functions and tell you whether a CVE is theoretical or one request away from exploitation.
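At its core, the reachability question is a graph search. The sketch below assumes a call graph has already been extracted as a mapping from each function to the functions it calls; building that graph is the hard part and is out of scope here. The function name and the node names in the usage example are hypothetical.

```python
from collections import deque


def find_call_path(call_graph, entry_points, target):
    """Breadth-first search for the shortest call path from any entry point
    to `target`. Returns the path as a list, or None if it is unreachable."""
    queue = deque((entry, [entry]) for entry in entry_points)
    seen = set(entry_points)
    while queue:
        func, path = queue.popleft()
        if func == target:
            return path
        for callee in call_graph.get(func, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, path + [callee]))
    return None


# Hypothetical graph: one HTTP handler and one offline script.
graph = {
    "POST /login": ["auth.verify_token"],
    "auth.verify_token": ["jwtlib.decode"],
    "scripts.seed_db": ["yamlish.load"],
}
http_handlers = ["POST /login"]
print(find_call_path(graph, http_handlers, "jwtlib.decode"))
print(find_call_path(graph, http_handlers, "yamlish.load"))
```

A returned path is evidence that the vulnerable function is reachable from a request handler; `None` suggests the CVE is theoretical for this codebase, at least as far as the extracted graph is complete.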
Converting findings into merged fixes
The operational model that works: each finding gets an owner, a severity-based deadline, and a verification step. Critical findings (exploitable auth bypass, RCE) get fixed before the next deploy. High findings get a PR within the sprint. Medium findings enter the backlog with an expiry date.
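As a rough sketch, this model fits in a small tracking record. The day counts below are placeholders (the model fixes the ordering, not the numbers), and the names `TrackedFinding`, `deadline`, and `deploy_blocked` are invented for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical windows; adjust to your own sprint length and backlog policy.
SPRINT_DAYS = 14
BACKLOG_EXPIRY_DAYS = 90


@dataclass
class TrackedFinding:
    title: str
    severity: str                 # "critical", "high", or "medium"
    owner: str
    found_on: date
    verified_fixed: bool = False  # set once the fix is merged and re-tested


def deadline(finding: TrackedFinding) -> date:
    """Severity-based due date for a finding."""
    if finding.severity == "critical":
        return finding.found_on  # due immediately; see deploy_blocked below
    if finding.severity == "high":
        return finding.found_on + timedelta(days=SPRINT_DAYS)
    if finding.severity == "medium":
        return finding.found_on + timedelta(days=BACKLOG_EXPIRY_DAYS)
    raise ValueError(f"unknown severity: {finding.severity!r}")


def deploy_blocked(findings) -> bool:
    """True while any critical finding is still unverified."""
    return any(f.severity == "critical" and not f.verified_fixed for f in findings)
```

Wiring `deploy_blocked` into a release checklist or CI gate is one way to make 'fixed before the next deploy' enforceable rather than aspirational.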
The audit isn't done when the report is delivered. It's done when the critical findings are merged and verified. Everything else is documentation.