AML and Fraud Detection: Machine Learning Models for Risk Scoring in iGaming

At 02:17 a.m., deposits spiked. The rule engine yawned. The ML model gave a medium score. A human saw one odd clue: five new accounts, one device hash, same payout target. The queue lit up. We stopped the ring in seven minutes. This piece is about that gap between a score and a save, and how to build a system that closes it.

The uncomfortable truth about AML in iGaming

iGaming runs fast. Faster than most banks. Promo waves, bonus hunts, late-night play, and cash-out rushes make the signal messy. Fraud teams face mule networks, device farms, and doctored IDs. Compliance teams face strict rules, short clocks, and cross-border checks. Growth wants smooth onboarding. Control wants proof and pause. Tension lives in every sprint.

Threats also move with wider cyber crime. For context and patterns that spill into gaming, read Europol’s Internet Organised Crime Threat Assessment. It shows how groups reuse playbooks across sectors, and why device, payment, and network views must work as one.

Regulatory snaplines you can’t ignore

You cannot design risk scoring in a vacuum. Start from the law and work back to the model. For casinos, the global base is the FATF risk‑based approach for casinos. It sets the idea of risk tiers, ongoing monitoring, and enhanced due diligence for higher risk. You must show how your score feeds these parts.

In the U.S., card clubs and casino operators follow FinCEN guidance for casinos and card clubs. Note how it links transaction monitoring with customer due diligence and SAR filing. In the UK, the UK Gambling Commission AML guidance goes deep on customer risk, triggers, and staff training.

Remote-first brands often sit in Malta. There, the FIAU has sector rules. See the sectoral implementing procedures for remote gaming. These add clarity on source of funds, events that raise risk, and record keeping. Build your features and workflow with those “snaplines” in mind. Your score should map to each duty: screen, monitor, escalate, document.

Data bedrock: what a useful risk signal is made of

Strong risk scoring needs layers. Use payment data (amounts, methods, chargebacks, payout velocity). Use game events (session time, bet spread, bonus usage). Use device and network (fingerprint, IP, ASN, proxy score, time zone drift). Use KYC (ID type, doc age, address match, country). Add sanctions and PEP checks. Combine login patterns, failed auth, and device churn. You want signals that change when risk is real, not just when a user is new.

For risk factors and when to do EDD, the EBA ML/TF risk factor guidelines are a good map. And if you use e‑ID or selfie checks, align with FATF digital identity guidance. Keep data lean: collect what you need, secure it, and expire what you do not need.

A taxonomy of risk scoring in iGaming

Think in layers, not one giant score.

Player-level: base AML risk from KYC, country, funding mix, play style.
Transaction-level: deposit and payout checks in real time.
Device-level: fingerprint risk and link to other accounts.
Network-level: graph links across IPs, payment cards, emails, and payout targets.

Rules still matter. “Over 3 new cards in 24 hours” or “proxy on first deposit” can stop obvious abuse. But rules alone grow into spaghetti. Use ML where patterns shift or combine. Use rules for guardrails and legal hard stops.
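The two example rules above can be written as plain guardrail checks. This is a minimal sketch, not a production rule engine: the `account` dict layout and the reason-code strings are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def rule_hits(account, now, max_new_cards=3, window=timedelta(hours=24)):
    """Return reason codes for hard rules that fire on this account.

    `account` is a hypothetical dict with:
      card_adds:      list of (timestamp, card_fingerprint) for newly added cards
      deposit_count:  completed deposits before the current one
      proxy_detected: True if the current session comes through a proxy
    """
    hits = []

    # Rule 1: more than `max_new_cards` distinct new cards inside the window.
    recent_cards = {
        card for ts, card in account["card_adds"]
        if now - window <= ts <= now
    }
    if len(recent_cards) > max_new_cards:
        hits.append("NEW_CARDS_24H")

    # Rule 2: proxy on first deposit.
    if account["deposit_count"] == 0 and account["proxy_detected"]:
        hits.append("PROXY_FIRST_DEPOSIT")

    return hits


now = datetime(2026, 1, 10, 2, 17)
account = {
    # Four cards added in the last 24 hours, one added 30 hours ago.
    "card_adds": [(now - timedelta(hours=h), f"card_{h}") for h in (1, 3, 5, 20, 30)],
    "deposit_count": 0,
    "proxy_detected": True,
}
hits = rule_hits(account, now)
```

Returning reason codes instead of a bare yes/no keeps each rule easy to audit and easy to retire when it stops earning its place.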

Mini‑case: We saw a ring that did fast small bets, then cash‑out. Device hash was stable, but IPs rotated. A simple feature—payout velocity per device—fed to a gradient boosting model raised the score above the review cut. The team linked the device to four prior accounts and blocked the flow.
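A feature like payout velocity per device can be sketched in a few lines. The field names (`device_hash`, `account_id`, `amount`, `ts`) and the one-hour window are assumptions; in a real stack this would be computed in the stream processor and served from the feature store.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def payout_velocity_per_device(payouts, as_of, window=timedelta(hours=1)):
    """Aggregate payout requests per device hash inside a trailing window."""
    stats = defaultdict(lambda: {"payouts": 0, "amount": 0.0, "accounts": set()})
    for p in payouts:
        if as_of - window <= p["ts"] <= as_of:
            s = stats[p["device_hash"]]
            s["payouts"] += 1
            s["amount"] += p["amount"]
            s["accounts"].add(p["account_id"])
    return {
        device: {
            "payouts_in_window": s["payouts"],
            "amount_in_window": s["amount"],
            "distinct_accounts": len(s["accounts"]),
        }
        for device, s in stats.items()
    }


as_of = datetime(2026, 1, 10, 2, 17)
payouts = [
    # Five accounts cashing out from one device within 20 minutes.
    {"device_hash": "dev_a", "account_id": f"acc_{i}", "amount": 50.0,
     "ts": as_of - timedelta(minutes=5 * i)}
    for i in range(5)
] + [
    # An unrelated payout three hours ago, outside the window.
    {"device_hash": "dev_b", "account_id": "acc_9", "amount": 20.0,
     "ts": as_of - timedelta(hours=3)}
]
features = payout_velocity_per_device(payouts, as_of)
```

Keying on the device hash instead of the IP is what makes the feature survive IP rotation in a case like the one above.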

The model shelf: what actually ships

You do not need every model. Pick for need and speed.

Rules and heuristics: instant, clear, easy to audit. Good for known bads and legal blocks.
Gradient boosting (XGBoost, LightGBM): strong on tabular data. Great mix of power and control. Good for payout fraud, bonus abuse, and layered risk.
Anomaly detection (Isolation Forest, autoencoders): good for “we have not seen this pattern”. Use as a second opinion to catch novel shapes. Expect noise.
Graph ML/GNN: best for rings, mules, and shared assets. Needs solid entity link rules and clean edges.
PU‑learning/weak supervision: helpful when labels are rare or delayed. Can pull value from investigator notes and soft signals.

Design for outcomes, not only AUC. The Wolfsberg paper on AML effectiveness is a short, clear read on this. It pushes teams to show impact: quality of alerts, speed to file, and real risk removed.

What we chose not to do

We do not ship raw deep nets on clickstreams without features. They look clever, but cost to explain is high, drift is sharp, and gains fade. We prefer a strong GBDT with hand‑built features and a thin graph layer over a black box with pretty charts.

Ground truth is slippery: labels, loops, and bias

Labels are not perfect. A filed SAR is not always crime, and no SAR is not always clean. Legal cases take time, so you train on a fog of “likely” and “maybe”. Accept this. Reduce the lag with fast investigator feedback, weekly label drops, and backtests.

Bias creeps in. If you block more users from one country, you may train the model to hit that flag harder next time. To counter this, do periodic blind reviews, add fairness checks by country and payment rail, and tune thresholds by use case. For typologies and how crime groups move money, see INTERPOL’s overview of money laundering typologies. It helps you build labels and rules that match real world tricks.
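A fairness check by country or payment rail can start as a simple rate table. The sketch below assumes each case carries two booleans, `alerted` and `confirmed`; those names and the placeholder country codes are illustrative only.

```python
from collections import defaultdict

def alert_rates_by_group(cases, group_key):
    """Alert rate and confirmed share per group (e.g. country, payment rail).

    `alerted` means the model raised the case; `confirmed` means an
    investigator upheld it. Both are assumed fields.
    """
    totals = defaultdict(lambda: {"n": 0, "alerted": 0, "confirmed": 0})
    for case in cases:
        t = totals[case[group_key]]
        t["n"] += 1
        if case["alerted"]:
            t["alerted"] += 1
            if case["confirmed"]:
                t["confirmed"] += 1
    report = {}
    for group, t in totals.items():
        report[group] = {
            "alert_rate": t["alerted"] / t["n"],
            # Undefined when the group has no alerts at all.
            "confirmed_share": t["confirmed"] / t["alerted"] if t["alerted"] else None,
        }
    return report


cases = (
    [{"country": "AA", "alerted": True, "confirmed": True}]
    + [{"country": "AA", "alerted": True, "confirmed": False}]
    + [{"country": "AA", "alerted": False, "confirmed": False}] * 2
    + [{"country": "BB", "alerted": True, "confirmed": True}]
    + [{"country": "BB", "alerted": False, "confirmed": False}] * 3
)
report = alert_rates_by_group(cases, "country")
```

A group with a high alert rate and a low confirmed share is a candidate for a blind review. The confirmed labels can carry the same bias, so treat the table as a prompt for review, not a verdict.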

Real-time architecture, explained in one breath

Stream in events. Join them in a stream processor. Read features from an online feature store. Score with a model service. Return a response in under 100 ms for payments, or near real time for game events. Store the explanation output with the alert. Hand off the case to a queue with context and a clear next step. Sync an offline store for training and tests.

Plan for spikes and failure. Double‑write critical features. Use fallbacks: if the model is down, drop to a safe rule set. Log inputs and outputs for audit. Align these parts with a risk framework like the NIST AI Risk Management Framework. It keeps design, testing, and review tight.
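The fallback and audit-log ideas fit in one wrapper. This is a sketch under stated assumptions: `model_fn` stands in for a call to your model service, and `safe_rules` is a made-up conservative rule set with illustrative scores.

```python
import time

def safe_rules(features):
    """Hypothetical conservative rule set, used only when the model is unavailable."""
    risky = features.get("proxy_detected") or features.get("new_cards_24h", 0) > 3
    return 0.9 if risky else 0.3

def score_with_fallback(features, model_fn, audit_log, fallback_fn=safe_rules):
    """Score with the model; on any failure drop to rules. Always log the call."""
    started = time.perf_counter()
    try:
        score, source = model_fn(features), "model"
    except Exception as exc:
        score, source = fallback_fn(features), f"fallback ({type(exc).__name__})"
    audit_log.append({
        "inputs": dict(features),  # copy, so later mutation cannot change the record
        "score": score,
        "source": source,
        "latency_ms": (time.perf_counter() - started) * 1000,
    })
    return score, source


def broken_model(features):
    # Simulates a model service that does not answer in time.
    raise TimeoutError("model service did not respond")

audit_log = []
score, source = score_with_fallback({"proxy_detected": True}, broken_model, audit_log)
```

Recording which path produced the score matters for audit: a case decided by the fallback should be easy to find and re-score once the model is back.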

Edge cases log

Tournament nights cause real spikes. Adjust thresholds or damp velocity features during known events.
Big promo drops skew behavior. Tag promo cohorts in features to avoid false spikes.
Time‑outs and session caps can look like break‑and‑return fraud. Track them as first‑class events.
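One way to damp a velocity feature during known events is to divide it by the uplift you expect for that window. The event calendar format and the uplift factor of 3.0 below are assumptions; a real factor would come from your own history of past tournaments and promos.

```python
from datetime import datetime

def damped_velocity(raw_velocity, ts, known_events):
    """Scale a velocity feature down while a known event is running.

    `known_events` is a list of (start, end, expected_uplift) tuples,
    where expected_uplift >= 1 is how much busier the event usually is.
    """
    uplift = 1.0
    for start, end, expected_uplift in known_events:
        if start <= ts < end:
            uplift = max(uplift, expected_uplift)
    return raw_velocity / uplift


tournament = (datetime(2026, 1, 10, 20, 0), datetime(2026, 1, 10, 23, 0), 3.0)
during = damped_velocity(12.0, datetime(2026, 1, 10, 21, 30), [tournament])
after = damped_velocity(12.0, datetime(2026, 1, 11, 9, 0), [tournament])
```

Keeping the raw value alongside the damped one lets investigators see both, and lets you check later whether the damping hid anything real.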

Measuring what matters: beyond AUC

AUC is fine, but ops care about cost and time. Track precision at top K alerts. Track hours saved by auto‑clear. Track time to SAR. Track payout delay time and customer harm avoided. Watch alert acceptance rate by investigator. Fold in expected cost: false blocks hurt revenue and trust; misses raise legal risk.
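Precision at top K and expected cost are both short to compute. In this sketch each case is a `(score, is_bad)` pair, and the two cost figures are placeholders you would replace with your own estimates.

```python
def precision_at_k(scored_cases, k):
    """Share of confirmed-bad cases among the K highest-scored alerts."""
    top = sorted(scored_cases, key=lambda case: case[0], reverse=True)[:k]
    if not top:
        return 0.0
    return sum(1 for _, is_bad in top if is_bad) / len(top)

def expected_cost(scored_cases, threshold, cost_false_block, cost_miss):
    """Total cost of false blocks plus misses at a given score threshold."""
    false_blocks = sum(1 for s, is_bad in scored_cases if s >= threshold and not is_bad)
    misses = sum(1 for s, is_bad in scored_cases if s < threshold and is_bad)
    return false_blocks * cost_false_block + misses * cost_miss


cases = [(0.9, True), (0.8, False), (0.7, True), (0.2, False), (0.1, True)]
p_at_2 = precision_at_k(cases, 2)   # top two are 0.9 (bad) and 0.8 (clean)
cost = expected_cost(cases, threshold=0.5, cost_false_block=10, cost_miss=100)
```

Sweeping `threshold` over a backtest slice and picking the lowest expected cost is one way to tune cut-offs by use case rather than by AUC.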

Monitor drift in features and score. Backtest each month on a fixed slice. Keep a fairness view by country, payment type, and device risk. When you use AI to auto‑explain alerts, stay close to privacy rules and the duty to explain. The UK ICO has plain advice on this topic: explaining AI decisions.
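A common way to put a number on drift is the Population Stability Index (PSI). The article does not name a specific metric, so treat this as one option. The sketch uses equal-width bins taken from the baseline sample; the bin count and the `eps` floor for empty bins are assumptions.

```python
import math

def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between a baseline and a current sample."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width if the baseline is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor empty bins at eps so the log below stays defined.
        return [max(c / len(values), eps) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


baseline = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in baseline]
same = psi(baseline, baseline)
drifted = psi(baseline, shifted)
```

A frequently quoted rule of thumb reads PSI below 0.1 as stable and above 0.25 as a shift worth a look. Those cut-offs are convention, not law, so calibrate them on your own monthly backtests.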

Governance that doesn’t get in your way

Write model cards. Version your features. Keep a change log. Store training data slices. Keep a list of known risks, like proxy bias or over‑weighting device churn. Give investigators a reason code for each alert and a short summary of top drivers (for example, SHAP top 3 features).
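Turning per-feature contributions into reason codes is mostly bookkeeping. The sketch below takes a ready-made dict of contributions, such as SHAP values from your explainer of choice; it does not compute them. The feature names and the `REASON_CODES` mapping are hypothetical.

```python
# Hypothetical mapping from feature name to investigator-facing reason code.
REASON_CODES = {
    "payout_velocity_device": "R01 High payout velocity on device",
    "accounts_per_device": "R02 Device shared across accounts",
    "proxy_score": "R03 Proxy or VPN likely",
}

def build_alert(case_id, score, contributions, top_n=3):
    """Attach the top drivers (by absolute contribution) to an alert record."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    drivers = ranked[:top_n]
    return {
        "case_id": case_id,
        "score": score,
        "reason_codes": [REASON_CODES.get(name, "R99 Other") for name, _ in drivers],
        "summary": "; ".join(f"{name} ({value:+.2f})" for name, value in drivers),
    }


alert = build_alert(
    case_id="case_001",
    score=0.87,
    contributions={
        "payout_velocity_device": 0.41,
        "accounts_per_device": 0.22,
        "account_age_days": -0.05,
        "proxy_score": 0.13,
    },
)
```

Storing the summary with the alert, rather than recomputing it later, means the investigator and the auditor see the same explanation the model gave at decision time.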

Your control partners need proof. Link your process to the FCA Financial Crime Guide if you serve UK users. Keep an override journal and do regular retros with compliance. Show that risk scoring supports policy rather than hiding it.

Sanctions, PEP, and the art of negative news

Sanctions and PEP checks sit next to AML. They must feed and raise your risk score. Re‑screen on a set cycle and on key events, like a new payout method or a big deposit. For lists, many teams source from OFAC; see the OFAC SDN list. Add negative news scans for high‑risk cases. Tie hits to EDD, holds, or reports based on policy.
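The re-screening triggers can be expressed as one small function. The 30-day cycle and the event names are assumptions for illustration; your policy sets the real values.

```python
from datetime import date, timedelta

# Key events that force a re-screen; names are hypothetical.
RESCREEN_EVENTS = {"new_payout_method", "big_deposit"}

def rescreen_reasons(last_screened, today, recent_events, cycle=timedelta(days=30)):
    """Return why a player needs sanctions/PEP re-screening (empty list if not)."""
    reasons = []
    if today - last_screened >= cycle:
        reasons.append("cycle_due")
    # Deduplicate and sort so the output is stable for logging.
    reasons.extend(sorted(e for e in set(recent_events) if e in RESCREEN_EVENTS))
    return reasons


event_driven = rescreen_reasons(
    date(2026, 1, 1), date(2026, 1, 10), ["login", "new_payout_method"]
)
cycle_driven = rescreen_reasons(date(2026, 1, 1), date(2026, 2, 5), [])
```

Returning the reasons, not just a flag, gives the case file a record of what triggered each screen.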

Where players fit in (and why transparency pays)

Players do not love waits or docs. But clear steps and short texts help a lot. Tell users when and why you need checks. Share payout SLAs. Give status in the cashier view. This lowers tickets and heat. If you are a player picking where to deposit, look for brands that publish AML and KYC rules and keep clear payout terms. Independent review hubs like Casinosikten keep lists of licensed brands and call out slow or weak checks. That helps users choose safer places to play.

A contrarian take: fewer models, better outcomes

Keep your stack small. One well‑tuned GBDT with good features, plus a light graph layer, plus a few hard rules, will beat a zoo of half‑baked models. It is easier to monitor, explain, and fix. Investigators see stable signals. Compliance can audit it. And you can ship changes fast.

Implementation traps to dodge

Feature leakage: do not use post‑payout facts to score pre‑payout actions.
Jurisdiction drift: rules differ by country. Add geo guards to features and thresholds.
Promo drift: features tied to promos will swing. Tag promos and add seasonality to training.
No tests: write unit tests for feature code and schema checks for payloads.
Label lag: plan weekly label refresh and add human feedback loops.
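The feature leakage trap is the easiest of these to guard against in code: build every training feature from events that happened strictly before the decision time. The event layout below (`type`, `ts`) is an assumption.

```python
from datetime import datetime, timedelta

def features_as_of(events, decision_ts):
    """Build features only from events strictly before the decision time."""
    visible = [e for e in events if e["ts"] < decision_ts]
    return {
        "deposits_before": sum(1 for e in visible if e["type"] == "deposit"),
        "chargebacks_before": sum(1 for e in visible if e["type"] == "chargeback"),
    }


payout_requested = datetime(2026, 1, 10, 2, 17)
events = [
    {"type": "deposit", "ts": payout_requested - timedelta(hours=2)},
    {"type": "deposit", "ts": payout_requested - timedelta(hours=1)},
    # This chargeback arrives three days after the payout decision.
    # Using it to score the payout would be leakage.
    {"type": "chargeback", "ts": payout_requested + timedelta(days=3)},
]
feats = features_as_of(events, payout_requested)
```

A unit test like the one implied here (a future event must not change a past feature) is a cheap way to cover both the leakage and the "no tests" traps at once.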

The table you’ll want to screenshot

Comparative trade‑offs of AML/Fraud ML approaches in iGaming

Approach	Best for	Latency	Explainability	Data needs	Precision / alert volume	Governance effort	Drift / maintenance
Rules / Heuristics	Legal blocks, clear patterns, sanctions hits, device bans	Sub‑100 ms	High (if‑then, clear triggers)	Basic KYC, payments, device flags	High precision at low K; volume can spike if broad	Low to medium (policy reviews)	Low drift, but logic creep risk
GBDT (XGBoost/LightGBM)	Payout fraud, bonus abuse, layered AML risk	Sub‑100 ms with light features	Medium (feature importance, SHAP)	Rich tabular features; clean labels help	Strong precision; balanced volume with tuned thresholds	Medium (docs, backtests, explain)	Medium drift; retrain monthly/quarterly
Anomaly Detection	Novel spikes, odd device or velocity shapes	Near real‑time or batch	Low to medium (scores, few reasons)	Unlabeled streams; robust scaling	Catches rare cases; can raise noisy volume	Medium (tuning, whitelist flow)	High drift; needs tight monitoring
Graph ML / GNN	Rings, mule chains, shared payout targets	Near real‑time with pre‑built graph	Medium (path/neighbor reasons)	Entity links, edges (IP, device, cards, emails)	High precision on ring members; lower volume	High (entity resolution, lineage)	Medium; graph refresh cadence matters
PU‑Learning / Weak Supervision	Few labels, delayed SAR outcomes	Batch or near real‑time	Low to medium (rule/label mix)	Soft labels from notes, rules, heuristics	Can lift recall at same volume	Medium (source tracking)	Medium; sensitive to noisy labels

Note: Trade‑offs reflect field work and open guidance (for example, the FATF and Wolfsberg sources linked above). Your numbers will vary by brand, market, and data quality.

FAQ

How fast should real‑time scoring be?

For payments, aim for under 100 ms end‑to‑end. For game events, near real time is fine. Always plan a safe fallback.

What metrics should I report to compliance?

Alert quality, time to SAR, case aging, and fairness by country and payment rail. Keep monthly backtests.

Do I need graph ML from day one?

No. Start with entity linking and graph features. Add ML when you trust the graph and see rings.
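Entity linking without any ML can be done with a union-find pass over shared attributes. This is a minimal sketch: the `(account_id, attribute_type, attribute_value)` record shape is an assumption, and real entity resolution needs rules for noisy or widely shared values (a public Wi-Fi IP, for example).

```python
from collections import defaultdict

def link_accounts(records):
    """Group accounts that share any attribute value (device, payout target, ...).

    Returns a list of account-id sets, keeping only groups larger than one.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups fast
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (attribute_type, attribute_value) -> first account seen with it
    for account_id, attr_type, attr_value in records:
        find(account_id)  # register the account even if it links to nothing
        key = (attr_type, attr_value)
        if key in first_seen:
            union(first_seen[key], account_id)
        else:
            first_seen[key] = account_id

    clusters = defaultdict(set)
    for account_id in parent:
        clusters[find(account_id)].add(account_id)
    return [members for members in clusters.values() if len(members) > 1]

def cluster_size_feature(clusters):
    """Graph feature: size of the linked cluster each account belongs to."""
    return {account: len(members) for members in clusters for account in members}


records = [
    ("a1", "device", "d1"),
    ("a2", "device", "d1"),
    ("a2", "payout_target", "p1"),
    ("a3", "payout_target", "p1"),
    ("a4", "device", "d9"),
]
clusters = link_accounts(records)
sizes = cluster_size_feature(clusters)
```

Here `a1` and `a3` never share an attribute directly, but both link through `a2`. The cluster size can then feed the tabular model as an ordinary feature, which is often enough before any graph ML is needed.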

How do I cut false positives?

Use precision@K, tune thresholds by use case, and add simple allow rules for clear good patterns. Share reasons with players to lower friction.

Credits, sources, and “call me if it broke in prod”

Author: Alex Morozov, Head of Financial Crime Analytics. 10+ years in risk for payments and iGaming. ACAMS certified. Built and ran AML/fraud stacks across EU and North America.

Global: FATF risk‑based approach for casinos
US: FinCEN guidance
UK: UKGC AML guidance; FCA Financial Crime Guide
EU: EBA risk factor guidelines; FIAU Malta SIPs
Sanctions: OFAC SDN list
Governance: NIST AI RMF; ICO guidance on explaining AI
Effectiveness: Wolfsberg on AML effectiveness
Typologies: INTERPOL money laundering typologies

Reviewed by: Maria Jensen, Compliance Officer. Last updated: 2026‑05‑22.

Disclaimer: This article is for information. It is not legal advice. Check local rules for your markets.