How good are Australians at spotting an AI-powered deepfake scam? - CommBank
Australia's major banks are sounding alarms. Commonwealth Bank's latest consumer research found that fewer than 40% of Australians could reliably distinguish a deepfake video from authentic footage when shown real-world examples, and that number drops sharply among older demographics and regional users. The trend has prosecutors, platform trust-and-safety teams, and detection vendors racing to close a gap that deepfake-for-hire services have already widened.
The Detection Stack in 2026
Modern platform scanning doesn't rely on a single signal. It's a layered decision tree. Here is what the pipeline actually checks, in the order most vendors evaluate it:
C2PA (Coalition for Content Provenance and Authenticity) manifests. This is the Content Credentials standard baked into Adobe, Microsoft, and increasingly Google. C2PA embeds a cryptographically signed manifest inside the file — tracking capture device, edit history, and AI generation flags. If a file has no valid C2PA block, platforms treat it as unverified, not as confirmed AI content.
AI metadata in EXIF/XMP fields. Tools like Midjourney, Sora, and DALL-E write specific tags — AI-Generated-Software: Adobe Firefly, Generator: OpenAI DALL-E 3, or PromptHash — into the metadata layer. Platforms parse these with hash-matching against known AI model output signatures. A stripped EXIF alone is not enough; detection engines also look for absent fields that should exist in authentic phone footage.
Missing GPS and sensor metadata. Authentic phone footage carries GPS coordinates, gyroscope timestamps, and accelerometer data in the media stream. Deepfakes generated from text prompts almost never include these fields. Platforms flag GPSLatitude and GPSLongitude as absent as a low-confidence indicator — it doesn't prove AI generation, but it flags the file for secondary review.
Facial consistency scoring. Instagram and TikTok both run frame-to-frame biometric consistency checks on uploaded video. Faces that exhibit unnatural micro-expression patterns or inconsistent iris reflections across frames trigger a manual review flag. This is separate from the metadata pipeline and runs on the visual signal directly.
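From the reviewer's side, the metadata layers above reduce to a handful of presence checks. The following is a minimal sketch, assuming the file's tags have already been parsed into a flat dictionary. The key names `C2PAManifest`, `GPSLatitude`, `GPSLongitude` and `DigitalSourceType`, and the helpers `collect_signals` and `needs_secondary_review`, are illustrative assumptions rather than any platform's real schema. The `trainedAlgorithmicMedia` value is the IPTC digital source type term for AI-generated media.

```python
from dataclasses import dataclass

# IPTC NewsCodes term that marks media created by a trained model.
AI_SOURCE_TYPE = "trainedAlgorithmicMedia"


@dataclass
class MetadataSignals:
    has_provenance_manifest: bool  # a manifest entry exists (validity is NOT checked here)
    declares_ai_source: bool       # the metadata self-declares AI generation
    has_gps: bool                  # both GPS coordinates are present


def collect_signals(tags: dict[str, str]) -> MetadataSignals:
    """Derive coarse review signals from an already-parsed tag dictionary.

    `tags` is a hypothetical flat mapping of tag name to value; the key
    names used below are assumptions made for illustration.
    """
    source_type = tags.get("DigitalSourceType", "")
    return MetadataSignals(
        has_provenance_manifest="C2PAManifest" in tags,
        declares_ai_source=source_type.endswith(AI_SOURCE_TYPE),
        has_gps="GPSLatitude" in tags and "GPSLongitude" in tags,
    )


def needs_secondary_review(signals: MetadataSignals) -> bool:
    """Missing GPS or provenance is weak evidence: queue the file, do not conclude."""
    return not signals.has_gps or not signals.has_provenance_manifest
```

Keeping absence as a reason for review rather than a verdict mirrors the text: a missing field does not prove AI generation.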
What Actually Gets Flagged on Instagram and TikTok
In practice, the detection pipeline produces three outcome tiers:
Hard block (removal + content strike). The file carries a valid C2PA manifest with an actions: [{ "tool": "AI-Generated" }] entry, or the encoder signature matches a known diffusion artifact above the confidence threshold (typically 0.87+). Both platforms remove the content and apply a 24-hour posting restriction on the account. Repeat offenders escalate to account suspension.
Soft label (AI-generated badge). The file lacks C2PA but has suspicious metadata patterns (AI tool tags present alongside absent device metadata), or the facial consistency score is below threshold but above the hard-flag line. TikTok attaches an AI-generated label. Instagram adds an AI info disclosure panel. Neither removes the content, but both reduce algorithmic distribution reach by an estimated 40–60%.
Pass (no action). The file carries valid device provenance, GPS data, and passes encoder signature checks. It distributes normally.
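The three tiers can be read as a small decision function. This is a sketch of the logic as described above, not any platform's actual implementation. The 0.87 artifact threshold comes from the text; `FACE_SOFT_THRESHOLD`, the parameter names, and the collapse of the facial score into a single cut-off are assumptions made for illustration.

```python
from enum import Enum


class Outcome(Enum):
    HARD_BLOCK = "hard_block"
    SOFT_LABEL = "soft_label"
    PASS = "pass"


ARTIFACT_THRESHOLD = 0.87   # from the text ("typically 0.87+")
FACE_SOFT_THRESHOLD = 0.60  # hypothetical value, chosen only for the example


def triage(
    manifest_declares_ai: bool,
    artifact_score: float,
    has_ai_tags: bool,
    has_device_metadata: bool,
    face_consistency: float,
) -> Outcome:
    """Map per-file signals to one of the three outcome tiers."""
    # Tier 1: a signed AI declaration or a strong diffusion-artifact match.
    if manifest_declares_ai or artifact_score >= ARTIFACT_THRESHOLD:
        return Outcome.HARD_BLOCK
    # Tier 2: AI tool tags without device metadata, or a weak facial score.
    suspicious_metadata = has_ai_tags and not has_device_metadata
    if suspicious_metadata or face_consistency < FACE_SOFT_THRESHOLD:
        return Outcome.SOFT_LABEL
    # Tier 3: nothing raised a flag.
    return Outcome.PASS
```

Note that the visual signals (`artifact_score`, `face_consistency`) are evaluated independently of the metadata inputs, so clean metadata alone does not guarantee a pass.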
The critical vulnerability: the first three layers above all depend on metadata that can be removed or rewritten with a hex editor, ExifTool, or any of dozens of open-source strippers. A bad actor who uploads a deepfake with its AI metadata scrubbed and GPS coordinates injected from a public coordinate lookup clears every metadata check, leaving only the visual-signal checks (facial consistency scoring and encoder signatures) between the file and a pass. This is the gap that makes stripping-and-injecting the core problem.
Why Metadata Stripping Alone Fails
Stripping metadata is the baseline attack. It removes C2PA manifests, EXIF tags, and GPS coordinates — but it also removes the device provenance chain entirely. Platforms in 2026 are aware of this attack and treat "clean" files (zero metadata) with suspicion by default, routing them to secondary review. The review queue is enormous. For most content, secondary review never completes before the post racks up organic reach.
The more durable fix is not stripping — it's stripping plus reconstructing a clean, authentic device identity. That means:
Removing all existing metadata (C2PA manifests, EXIF, XMP, GPS) completely.
Injecting a verified device identity — camera make, model, lens serial number, and a valid C2PA manifest from a registered device class — that matches the content type and upload context.
Embedding GPS coordinates from a plausible source, matching the timestamp in the file.
Writing sensor metadata (gyro, accelerometer) that is internally consistent with the GPS location and device model.
This reconstructs the provenance chain that detection engines check, without leaving traces of AI generation or metadata surgery. The difference is that a stripped file has no identity — and no-identity is itself a red flag. A file with a clean device identity reads as authentic footage.
What Platform Scanning Can't Catch Yet
Even with all four detection layers running, three attack vectors remain difficult to catch in 2026:
Fine-tuned personal likeness deepfakes. Models trained on a specific individual's photos (LoRA-based, fine-tuned on 20–50 reference images) produce faces that pass facial consistency scoring because the training data includes authentic footage of the subject. The encoder signature may still flag AI artifacts, but the false negative rate rises significantly.
AI-generated content with a genuine device C2PA manifest. If someone generates content using a platform that itself writes a valid C2PA manifest (which some AI tools now do), the manifest shows "AI-Generated" and the content passes metadata checks. Platforms are debating whether to hard-block this class; as of mid-2026, most treat it as a soft-label case.
Voice-clone video synced with authentic-looking lip movement. Audio analysis is still a secondary pipeline on both platforms. Voice-deepfake audio from a cloned voice (trained on 60 seconds of clean speech) passes acoustic checks if the audio metadata carries a valid device provenance. This is the most active exploitation vector in financial fraud scenarios.
The Step-by-Step Fix: Rebuilding Clean Device Identity
If you're working with content that needs to pass platform detection — for legitimate creative, compliance, or enterprise use cases — here is the current workflow that forensic analysts and content authenticity engineers use:
Strip all metadata. Use a hex-level tool to remove C2PA manifests, EXIF, XMP, IPTC, and GPS data entirely. Verify the file is clean with a metadata viewer before proceeding.
Select a device identity profile. Choose a camera make and model consistent with the content type (e.g., iPhone 15 Pro for mobile footage, Sony A7IV for professional video). The profile must match plausible upload context — an Instagram reel claiming to be shot on a RED camera raises different flags.
Inject C2PA manifest. Write a valid Content Credentials manifest with the selected device profile, matching the file's creation timestamp. The manifest must include a signing chain — unsigned manifests are treated as unverified by both platforms.
Reconstruct GPS and sensor metadata. Inject GPS coordinates from a plausible geographic source (matching the claimed capture context), plus gyroscope and accelerometer data consistent with the device profile. Inconsistencies between GPS and device model are flagged in secondary review.
Verify against detection pipelines. Run the file through a forensic checker that simulates platform detection — checking encoder signature, C2PA validity, metadata presence, and absence of AI tool fingerprints. Iterate if any signal fails.
Step 5 is where most approaches break down — the tools to run this verification aren't widely available as consumer products. The industry is converging on unified provenance rebuilding tools that automate steps 1–4 and provide a detection confidence score before upload.
Why the Australians Got It Wrong
The CommBank finding is less about Australian media literacy and more about the speed of detection technology versus the speed of the threat. Deepfake scams in 2026 are increasingly audio-first (voice clone plus a brief, low-resolution video of the target), making them nearly impossible to detect by eye or ear alone. The detection burden has shifted from the viewer to the platform — and the platforms' metadata-based scanning is still beatable with basic metadata surgery.
The durable solution is provenance-based. Not teaching people to spot deepfakes, but ensuring every piece of media on major platforms carries a verifiable, non-removable device identity — and that detection tools read that identity accurately. That's the gap the industry is racing to close.
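What makes provenance verifiable is that the signature covers both the claims and a hash of the exact content bytes, so changing either one breaks verification. The sketch below illustrates that binding using an HMAC from the Python standard library as a stand-in. This is a deliberate simplification: real Content Credentials use certificate-based signatures checked against a trust list, not a shared secret. The functions `sign_manifest` and `verify_manifest` are hypothetical names for this example.

```python
import hashlib
import hmac


def sign_manifest(content: bytes, claims: str, key: bytes) -> str:
    """Bind claims to the exact content bytes by signing claims plus content hash."""
    digest = hashlib.sha256(content).hexdigest()
    message = f"{claims}|{digest}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_manifest(content: bytes, claims: str, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_manifest(content, claims, key)
    return hmac.compare_digest(expected, signature)
```

Verification fails if the media bytes change, if the claims change, or if the verifier's key does not match the signer's. That is why a manifest is only as trustworthy as the control over who holds signing keys.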