Product Design Evaluation

LLM Anomaly Detection & Audit Governance Platform

Embedded LLM-powered anomaly detection into risk and compliance workflows at Bank of America — cutting compliance cycles by 40% and reducing manual audit work by 60%.

Role: Senior Technical Product Manager · Bank of America
Status: Completed
Year: 2023–2025
LLMs · Anomaly Detection · Compliance · Risk Systems
40% Faster compliance cycles
60% Manual audit work reduced
15% Improved audit readiness & trade profitability

Problem

Compliance and audit teams were spending significant time on manual review of trading activity, position reports, and reconciliation outputs — looking for anomalies that could signal errors, policy violations, or regulatory exposure. The volume of data had outgrown the capacity of manual review, and compliance cycles were too slow to catch issues in time to act.

The core tension

Compliance requirements demanded thorough review. Data volumes made thorough manual review impossible. Something had to give — either review depth or review speed. The answer was intelligent automation.

Opportunity

Embed LLM-powered anomaly detection directly into the compliance and audit workflow — not as a separate tool, but as an intelligent layer that surfaces what matters and reduces the review burden on human compliance teams, while maintaining a full audit trail.

Design Decisions

LLM as a triage layer, not a decision-maker

The system was positioned as an intelligent triage tool: it flags, ranks, and explains anomalies for human review, but does not make compliance decisions autonomously. This framing was critical for regulatory acceptance, because the LLM augments the compliance team rather than replacing its judgment.
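A minimal sketch of the triage idea, not the production design: the model's output is only ever a ranked queue with explanations, and the decision field can be set only through a step that names a human reviewer. The `Flag` type, the 0.5 cutoff, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Flag:
    record_id: str
    score: float                    # hypothetical anomaly score in [0, 1]
    explanation: str                # rationale shown to the reviewer
    decision: Optional[str] = None  # filled in only by a human reviewer


def triage(flags: List[Flag], min_score: float = 0.5) -> List[Flag]:
    """Keep flags at or above min_score and rank them highest score first."""
    kept = [f for f in flags if f.score >= min_score]
    return sorted(kept, key=lambda f: f.score, reverse=True)


def record_decision(flag: Flag, reviewer: str, decision: str) -> Flag:
    """Attach a human decision; the triage step itself never sets one."""
    if not reviewer:
        raise ValueError("a named human reviewer is required")
    flag.decision = f"{decision} by {reviewer}"
    return flag


# Illustrative records only.
flags = [
    Flag("T-1", 0.42, "Size within normal range for this desk"),
    Flag("T-2", 0.91, "Position report disagrees with reconciliation output"),
    Flag("T-3", 0.67, "Trade booked outside usual hours"),
]
queue = triage(flags)
```

Here `queue` holds T-2 then T-3, both with `decision` still unset until a reviewer acts.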

Audit governance built into the architecture

Every LLM output — the anomaly flagged, the reasoning provided, the confidence score — is logged, versioned, and linked to the human action taken. This made the system fully auditable: regulators could see not just what was flagged, but why, and how the human reviewer responded. Governance wasn't a feature added later; it was a core design constraint from day one.
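One way to picture that linkage is an append-only log in which every model output gets an identifier and every human action must reference one. This is a sketch under assumptions: the `AuditLog` class, its field names, and the in-memory list are illustrative, and a real system would use durable, access-controlled storage.

```python
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


class AuditLog:
    """Append-only, in-memory log (hypothetical; entries are never edited)."""

    def __init__(self):
        self._entries = []

    def log_output(self, record_id, reasoning, confidence,
                   model_version, prompt_version):
        """Record what the model flagged and why; return its output id."""
        output_id = f"out-{len(self._entries) + 1}"
        self._entries.append({
            "kind": "model_output",
            "output_id": output_id,
            "record_id": record_id,
            "reasoning": reasoning,
            "confidence": confidence,
            "model_version": model_version,    # versioning of the model
            "prompt_version": prompt_version,  # and of the prompt used
            "at": _now(),
        })
        return output_id

    def log_action(self, output_id, reviewer, action):
        """Record the human response, linked to an existing model output."""
        known = any(e["kind"] == "model_output" and e["output_id"] == output_id
                    for e in self._entries)
        if not known:
            raise KeyError(f"unknown output_id: {output_id}")
        self._entries.append({
            "kind": "human_action",
            "output_id": output_id,
            "reviewer": reviewer,
            "action": action,
            "at": _now(),
        })

    def trail(self, output_id):
        """Return copies of every entry tied to one model output, in order."""
        return [dict(e) for e in self._entries if e["output_id"] == output_id]


log = AuditLog()
oid = log.log_output("T-2", "Position and reconciliation totals differ",
                     0.91, model_version="m-1", prompt_version="p-3")
log.log_action(oid, reviewer="reviewer_a", action="escalated")
trail = log.trail(oid)
```

The resulting `trail` contains the model output followed by the reviewer's action, which is the "what, why, and how the human responded" view described above.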

Domain-specific prompting and calibration

Generic LLM behavior on financial compliance data produced too many false positives and missed domain-specific patterns. Significant investment went into prompt engineering, calibration datasets, and threshold tuning — work that wasn't glamorous but made the difference between a tool that helped and one that created noise.
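Threshold tuning of this kind can be sketched as a search over a labeled calibration set: for each candidate cutoff, measure precision (how many flags were real) and recall (how many real anomalies were flagged), then take the lowest cutoff that meets a precision target. The function names, the 0.75 target, and the six sample points are assumptions for illustration, not figures from the project.

```python
def precision_recall(scored, threshold):
    """scored is a list of (score, is_true_anomaly) pairs."""
    tp = sum(1 for s, y in scored if s >= threshold and y)
    fp = sum(1 for s, y in scored if s >= threshold and not y)
    fn = sum(1 for s, y in scored if s < threshold and y)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


def pick_threshold(scored, min_precision):
    """Lowest threshold meeting min_precision, so recall stays as high as possible.

    Returns (threshold, precision, recall), or None if no threshold qualifies.
    """
    for t in sorted({s for s, _ in scored}):
        p, r = precision_recall(scored, t)
        if p >= min_precision:
            return t, p, r
    return None


# Made-up calibration points: (model score, reviewer-confirmed anomaly?)
calibration = [
    (0.9, True), (0.8, True), (0.7, False),
    (0.6, True), (0.4, False), (0.2, False),
]
chosen = pick_threshold(calibration, min_precision=0.75)
```

Raising the precision target trades missed anomalies for less reviewer noise, which is the balance the calibration work had to strike.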

Trade-offs

What we gained

  • 40% reduction in compliance cycle time
  • 60% reduction in manual audit work
  • Full auditability of AI-assisted decisions
  • Earlier anomaly detection before escalation

What we gave up

  • Significant upfront calibration investment
  • Ongoing model monitoring and drift detection
  • Sustained effort explaining the system to win regulatory acceptance
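The drift detection item above can take many forms. One common technique, shown here as an assumption rather than as the method the team used, is the population stability index (PSI), which compares the distribution of recent model scores against a baseline window. The bin edges and sample scores are illustrative.

```python
import math


def psi(baseline, current, edges):
    """Population stability index between two score samples.

    edges are ascending bin boundaries; scores are assumed to lie
    within [edges[0], edges[-1]].
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    def shares(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            for i in range(len(edges) - 1):
                is_last = i == len(edges) - 2
                if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                    counts[i] += 1
                    break
        # Floor each share so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


edges = [0.0, 0.5, 1.0]
baseline_scores = [0.1, 0.2, 0.3, 0.4]
stable = psi(baseline_scores, [0.1, 0.2, 0.3, 0.4], edges)
shifted = psi(baseline_scores, [0.6, 0.7, 0.8, 0.9], edges)
```

A value near zero means the score distribution has not moved; what counts as "too much" drift is a policy choice the monitoring owner would need to set.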

Opportunity Cost Evaluation

Expanding compliance team headcount was the alternative. But linear headcount growth can't keep pace with exponential data growth. The LLM approach created a capability that scaled with data volume rather than headcount, a fundamentally different cost curve that compounded in value over time.
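The linear-versus-exponential argument can be checked with simple arithmetic. The sketch below uses purely hypothetical numbers (they are not project data) to find the first year in which compounding data volume exceeds review capacity that grows by a fixed amount per year.

```python
def first_shortfall_year(start_volume, growth_rate,
                         start_capacity, capacity_added_per_year,
                         horizon=50):
    """First year in which volume exceeds capacity, or None within the horizon.

    Volume compounds at growth_rate per year; capacity grows by a fixed
    amount per year, as it would with steady hiring.
    """
    for year in range(horizon + 1):
        volume = start_volume * (1 + growth_rate) ** year
        capacity = start_capacity + capacity_added_per_year * year
        if volume > capacity:
            return year
    return None


# Hypothetical: volume grows 50% a year, capacity adds 60 units a year.
year = first_shortfall_year(start_volume=100, growth_rate=0.5,
                            start_capacity=100, capacity_added_per_year=60)
```

With these made-up inputs the shortfall appears in year 2 (225 units of volume against 220 of capacity), and the gap widens every year after that.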

The key insight

Compliance isn't about reviewing everything — it's about reviewing the right things with the right depth. LLM triage makes that possible at scale. Human judgment is preserved where it matters; automation handles the volume.

Success Metrics

What's Next