Trend report · gnews_detection · 2026-06-15
When Assam Police started investigating how explicit AI-manipulated content spread through Guwahati schools, the question wasn't just about who made it—it was about how it moved through platforms undetected for any length of time. The answer lives in the metadata layer most creators never see.
Platforms like Instagram, TikTok, YouTube, and Reddit don't only scan what an image or video looks like. They scan what's baked into the file itself—the invisible scaffolding that tells a server exactly where it came from and how it was made.
In 2026, that scaffolding includes several distinct layers. The first is C2PA / Content Credentials, a cryptographically signed manifest stored in JUMBF boxes (often called atoms in QuickTime-style containers) that can state explicitly that the file was generated by an AI model. Exports from tools such as Midjourney, Runway, Sora, or Kling may carry this manifest, depending on the tool and its export settings. A single AI video can contain 18 JUMBF atoms and 16 C2PA references, all pointing back to machine origin. Platforms that support Content Credentials can read the manifest much like a barcode.
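To make the "barcode" idea concrete, here is a minimal sketch of how a scanner might count those markers. It is only a byte-level heuristic under stated assumptions: the function names are hypothetical, it looks for the raw `jumb` box type and `c2pa` label strings, and it does not parse or cryptographically validate the manifest the way a real Content Credentials verifier would.

```python
def count_c2pa_markers(data: bytes) -> dict:
    """Count raw byte signatures associated with JUMBF boxes and C2PA labels.

    Heuristic only: this does not parse the box structure or check signatures.
    """
    return {
        "jumbf_boxes": data.count(b"jumb"),  # JUMBF superbox type code
        "c2pa_refs": data.count(b"c2pa"),    # label prefix used in C2PA manifests
    }


def looks_c2pa_tagged(data: bytes) -> bool:
    """Return True if both kinds of marker appear at least once."""
    counts = count_c2pa_markers(data)
    return counts["jumbf_boxes"] > 0 and counts["c2pa_refs"] > 0


# Synthetic bytes standing in for a media file (not a real container).
sample = b"ftypisom" + b"jumb" + b"c2pa.claim" + b"jumb" + b"c2pa.signature"
print(count_c2pa_markers(sample))
```

In practice you would read the bytes from disk, for example with `pathlib.Path(path).read_bytes()`, and treat a positive result as a prompt for proper manifest validation, not as proof of origin.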
The second layer is XMP metadata. Fields like the IPTC property DigitalSourceType: trainedAlgorithmicMedia flag the file at the document level. Generator tags, software names, and tool attribution get written into EXIF and XMP streams. A raw AI export can carry 144 metadata tags, compared with roughly 94 neutral structural tags in a typical authentic phone recording.
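A rough sketch of how that document-level flag could be found is below. The helper names are hypothetical, and the approach (a regular expression over raw bytes, looking for an embedded XMP packet) is an assumption made for illustration. A production scanner would use a real XMP parser and not string matching.

```python
import re

# Value defined in the IPTC Digital Source Type vocabulary.
AI_SOURCE_TYPE = "trainedAlgorithmicMedia"

# XMP packets are embedded as XML wrapped in an <x:xmpmeta> element.
XMP_PACKET = re.compile(rb"<x:xmpmeta.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp_packets(data: bytes) -> list[str]:
    """Return every embedded XMP packet found in the raw file bytes."""
    return [m.decode("utf-8", errors="replace") for m in XMP_PACKET.findall(data)]


def has_ai_source_type(data: bytes) -> bool:
    """True if any XMP packet declares the trainedAlgorithmicMedia source type."""
    return any(
        "DigitalSourceType" in packet and AI_SOURCE_TYPE in packet
        for packet in extract_xmp_packets(data)
    )


# Synthetic example packets (not taken from a real export).
ai_file = (
    b'junk<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:Description '
    b'Iptc4xmpExt:DigitalSourceType="http://cv.iptc.org/newscodes/'
    b'digitalsourcetype/trainedAlgorithmicMedia"/></x:xmpmeta>junk'
)
camera_file = (
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:Description '
    b'Iptc4xmpExt:DigitalSourceType="http://cv.iptc.org/newscodes/'
    b'digitalsourcetype/digitalCapture"/></x:xmpmeta>'
)
print(has_ai_source_type(ai_file), has_ai_source_type(camera_file))
```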
The third layer is encoder fingerprints. Video files generated by AI tools often carry distinctive bitstream signatures: Lavc (FFmpeg's libavcodec) version strings and x264 SEI (Supplemental Enhancement Information) messages that mark the file as software-encoded. Real phone recordings come from the hardware encoders built into the device's chipset, which on iPhone, Pixel, and Samsung phones typically produce H.264 or HEVC streams without those strings. The difference is visible to forensic scanners.
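As an illustration, a scanner could look for those software-encoder strings directly in the file bytes. This is a sketch under assumptions: the function name and pattern table are hypothetical, and the patterns reflect the usual shape of libavcodec version strings and the x264 settings banner. A hit indicates software encoding, which legitimate editing tools also produce, so it is a signal and not a verdict.

```python
import re

# Typical string shapes: "Lavc60.31.102" and "x264 - core 164 ...".
ENCODER_PATTERNS = {
    "libavcodec": re.compile(rb"Lavc\d+\.\d+\.\d+"),
    "x264": re.compile(rb"x264 - core \d+"),
}


def find_software_encoders(data: bytes) -> dict[str, list[str]]:
    """Return the software-encoder strings found in the raw bytes, by encoder."""
    hits: dict[str, list[str]] = {}
    for name, pattern in ENCODER_PATTERNS.items():
        found = sorted({m.decode("ascii") for m in pattern.findall(data)})
        if found:
            hits[name] = found
    return hits


# Synthetic bytes standing in for a video bitstream.
stream = b"\x00\x00mdat" + b"Lavc60.31.102 libx264" + b"\x00x264 - core 164 r3108"
print(find_software_encoders(stream))
```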
Finally, there's the absence problem. Authentic phone captures usually carry GPS coordinates (when location services are enabled), capture timestamps synced to the device clock, and a consistent Make/Model/Software chain. AI exports typically lack all three. That missing data is itself a signal.
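The absence check can be sketched as a simple lookup over already-extracted tags. The grouping and function name here are hypothetical. The tag names follow common EXIF naming as reported by tools like ExifTool, and the input is assumed to be a plain dictionary of tag names to values.

```python
# Tag groups a phone capture would normally populate (illustrative grouping).
EXPECTED_CAPTURE_TAGS = {
    "gps": ("GPSLatitude", "GPSLongitude"),
    "timestamp": ("DateTimeOriginal", "CreateDate"),
    "device": ("Make", "Model", "Software"),
}


def missing_capture_signals(tags: dict) -> list[str]:
    """Return the names of the signal groups that are absent from the tags."""
    missing = []
    for group, names in EXPECTED_CAPTURE_TAGS.items():
        if group == "timestamp":
            # Any one capture timestamp is enough.
            present = any(tags.get(n) for n in names)
        else:
            # GPS and device identity need every field in the group.
            present = all(tags.get(n) for n in names)
        if not present:
            missing.append(group)
    return missing


# Made-up tag sets for illustration only.
phone_tags = {
    "Make": "ExampleCo", "Model": "Phone 1", "Software": "1.0",
    "GPSLatitude": 26.14, "GPSLongitude": 91.74,
    "DateTimeOriginal": "2026:06:01 10:00:00",
}
export_tags = {"Software": "SomeEncoder"}
print(missing_capture_signals(phone_tags), missing_capture_signals(export_tags))
```

Because plenty of genuine photos have location turned off, a missing GPS group on its own is weak evidence. It becomes meaningful mainly when all three groups are absent together.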
Instagram and TikTok both run automated scans before content goes live. Instagram's detection pipeline checks for C2PA Content Credentials first: if that manifest says "AI-generated," the post gets routed for review or rejection, even if the content itself looks completely normal. TikTok runs similar checks alongside perceptual hash comparisons against known AI-generated libraries.
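The perceptual hash comparison can be illustrated with a simple average hash. This is a stand-in technique chosen for clarity, since the hashing method the platforms actually use is not public. The function names and the distance threshold are assumptions, and the input is assumed to be a small grayscale grid that has already been downscaled.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Build a bit string: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def matches_known(candidate: int, known: list[int], max_distance: int = 5) -> bool:
    """True if the candidate is within max_distance bits of any known hash."""
    return any(hamming_distance(candidate, k) <= max_distance for k in known)


# 8x8 synthetic gradient, a uniformly brighter copy, and an inverted copy.
base = [[r * 8 + c for c in range(8)] for r in range(8)]
brighter = [[v + 10 for v in row] for row in base]
inverted = [[63 - v for v in row] for row in base]

library = [average_hash(base)]
print(matches_known(average_hash(brighter), library))
print(matches_known(average_hash(inverted), library))
```

The point of the design is that the hash depends on relative brightness and not on exact bytes, so a uniformly brightened copy still matches while a visually different image does not.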
But here's what most creators miss: cropping doesn't help. You can trim a Sora export to 720p and strip visible UI, but the JUMBF manifest survives. The cryptographic proof is embedded at the file level, not burned into pixels. A corner logo or visible sparkle can be cropped out—but the C2PA atoms, the XMP flags, the Lavc encoder fingerprint, and the missing GPS data remain. That's what actually gets you flagged.
In the Guwahati case, the explicit content circulated widely on messaging apps and likely reached social platforms. The reason some of it spread further than expected wasn't just virality—it was that some versions had been through enough recompression cycles to partially disrupt metadata. Others hadn't, and those versions hit walls on upload.
Calabi runs a three-stage pipeline on every upload, no manual input required.
Stage 1: Strip. The tool removes every detection signal in one pass: all JUMBF / C2PA atoms (18 atoms → 0), all C2PA references (16 → 0), the DigitalSourceType: trainedAlgorithmicMedia flag, generator and tool tags, and encoder fingerprints like Lavc and x264 SEI messages. The result is a clean file with no AI origin story attached.
Stage 2: Inject. Calabi writes authentic phone-capture identity into the metadata: a real device profile (iPhone 15 Pro, Pixel 8 Pro, Galaxy S24 Ultra), corresponding software version, GPS coordinates, and a capture timestamp. It also inserts the encoder signature of that device's hardware video encoder. The file now reads, at the forensic level, exactly like something recorded on a phone and uploaded directly.
Stage 3: Verify. Before download, Calabi generates a forensic proof card—the same ExifTool scan that platforms and newsrooms use. You see exactly what was stripped (the AI signals) and what was injected (the phone identity). This is the same tool journalists and researchers use to verify origin, which means you can trust what the proof card says about your file.
→ Try Calabi free at calabilabs.com — 10 cleans, no card.