What an AI Image Detector Actually Checks — And Why Your File Still Gets Flagged
Platform-side AI image detection, whether it runs on Instagram, TikTok, or Reddit, typically starts with the same hidden layer: your file's metadata and container structure. Before any pixel analysis, it can read which software made the image, whether cryptographic provenance manifests are attached, and whether the file's technical structure matches a real phone capture. Standalone classifiers like Hive Moderation add pixel-level analysis on top of that. The metadata layer is what actually gets you flagged, and it's exactly what most "fixes" miss entirely.
What Actually Gets Flagged: The Metadata Layer AI Detectors Read
Platform AI detectors don't rely on blurry hands or weird teeth. They scan metadata fields that are invisible in normal viewing. Here are the specific signals that trigger a detection:
C2PA / Content Credentials: Adobe, Microsoft, Google, and others co-developed the C2PA standard, which embeds manifests in JUMBF (JPEG Universal Metadata Box Format) boxes directly inside image files. Each manifest is a cryptographically signed provenance record, and its assertions can declare that the content was generated by AI. A Sora export or Midjourney v6 image carries multiple JUMBF boxes with C2PA assertions. Platforms like Reddit now parse these on upload.
XMP AI flags (DigitalSourceType: trainedAlgorithmicMedia): This value comes from the IPTC Digital Source Type vocabulary and is carried in the Digital Source Type field of the IPTC Photo Metadata Standard. It explicitly marks an image as generated by a trained algorithmic model. It's a plain-text metadata tag, and it's one of the clearest AI signals a detector can read.
Generator and tool tags: Fields like Creator Tool, Software, or Generator in EXIF/XMP headers routinely say "Midjourney," "DALL-E 3," "Flux.1 Pro," or "Stable Diffusion." These are not hidden — they're in standard metadata headers that ExifTool exposes in seconds.
Encoder fingerprints (Lavc, x264 SEI): Video exports from AI tools often carry software-encoder signatures. A Lavc string (written by FFmpeg's libavcodec) and x264's SEI (Supplemental Enhancement Information) user-data message are embedded by the encoding pipeline. Neither is unique to AI tools, since any FFmpeg-based export writes them, but they show that the video came from a software encoder rather than a phone's hardware encoder.
Missing capture context: A real phone photo typically carries a capture timestamp, a device Make/Model, a software version, and, when location services are on, GPS coordinates. AI exports typically lack all of these. The absence of these fields is itself a detection signal.
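The tag-level signals in the list above can be sketched as a simple check over a metadata dictionary. This is a minimal illustration, not any platform's actual detector. It assumes the dictionary is shaped like one entry of `exiftool -json` output, and the function name `metadata_signals`, the marker list, and the set of capture fields are all hypothetical choices made for the example.

```python
# Hypothetical sketch: flag tag-level signals in a metadata dict.
# Assumes keys follow ExifTool-style tag names; lists below are illustrative.
AI_SOURCE_TYPE = "trainedalgorithmicmedia"
GENERATOR_FIELDS = ("CreatorTool", "Software", "Generator")
GENERATOR_MARKERS = ("midjourney", "dall-e", "flux", "stable diffusion")
CAPTURE_FIELDS = ("Make", "Model", "DateTimeOriginal", "GPSLatitude", "GPSLongitude")


def metadata_signals(tags: dict) -> list[str]:
    """Return a list of human-readable signal names found in `tags`."""
    signals = []

    # The value may appear as a full IPTC URI or as a spaced label,
    # so normalise by lowercasing and dropping spaces before matching.
    source_type = str(tags.get("DigitalSourceType", "")).lower().replace(" ", "")
    if AI_SOURCE_TYPE in source_type:
        signals.append("digital-source-type")

    # Generator and tool fields that name a known image model.
    for field in GENERATOR_FIELDS:
        value = str(tags.get(field, "")).lower()
        if any(marker in value for marker in GENERATOR_MARKERS):
            signals.append(f"generator-tag:{field}")

    # Capture context that a phone photo would normally carry.
    missing = [field for field in CAPTURE_FIELDS if field not in tags]
    if missing:
        signals.append("missing-capture-context:" + ",".join(missing))

    return signals
```

A real pipeline would weigh these signals rather than treat any single one as proof. Missing GPS alone, for instance, is common in genuine photos taken with location services off.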
Detection tools such as Hive Moderation (94% accuracy across Midjourney, DALL-E 3, and Stable Diffusion), TrueScreen, and C2PA verification endpoints don't just look at pixels. Between them, they also read the file's structural metadata and compare it against known AI-generation fingerprints.
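To make the structural side concrete, here is a hedged sketch of how a checker might notice that a JPEG carries a C2PA manifest. In JPEG files, JUMBF boxes are stored in APP11 marker segments (marker bytes FF EB). The function names are hypothetical, and the byte-string match is a presence heuristic only: it does not parse the manifest or validate its signature, which a real C2PA verifier would do.

```python
import struct


def jpeg_segments(data: bytes):
    """Yield (marker, payload) for each header segment of a JPEG.

    Simplified walker: stops at start-of-scan (SOS) or end-of-image (EOI)
    and assumes every segment before that has a two-byte length field.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # malformed or unexpected data; stop rather than guess
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS or EOI: no more header segments
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def has_c2pa_jumbf(data: bytes) -> bool:
    """Heuristic: an APP11 segment holding a JUMBF box labelled 'c2pa'."""
    for marker, payload in jpeg_segments(data):
        if marker == 0xEB and b"jumb" in payload and b"c2pa" in payload:
            return True
    return False
```

Usage would look like `has_c2pa_jumbf(open("image.jpg", "rb").read())`, where `image.jpg` is a placeholder path. Other containers (PNG, HEIF, MP4) store the manifest differently, so this sketch covers JPEG only.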
Why the Obvious Fixes Still Get You Flagged
Most creators try one of three approaches after their first flag, and all three fail at the metadata layer:
Cropping the image: This removes visible elements such as a corner logo or a sparkle watermark, but metadata is stored at the file level, not tied to specific pixel coordinates. When the editor carries metadata over on save, as many do, C2PA JUMBF manifests, XMP AI flags, and generator tags can survive the crop. The platform still sees the same file-level signals underneath.
Screenshotting and re-saving: Taking a screenshot forces a new encode through your operating system's screen-capture pipeline. This drops much of the original's embedded metadata, but it introduces new signals (a different DPI, a macOS/Windows encoder fingerprint, no GPS or camera Make/Model) that look equally "not a real photo." It also degrades quality significantly, and it does nothing about pixel-level classifiers.
Re-uploading from a different app: Re-saving through Photoshop, Preview, or a social media uploader removes some metadata but doesn't target the specific AI-signaling fields. The XMP DigitalSourceType tag, C2PA atoms, and encoder fingerprints aren't touched by a standard "Save for Web" operation.
None of these approaches reliably strips C2PA manifests, removes the trainedAlgorithmicMedia XMP flag, and injects authentic device identity. That's why the flag comes back.
How to Actually Clean an AI-Generated File: Strip, Inject, Verify
Real cleaning means rewriting the file's metadata identity at the structural level — not just hiding it. Calabi runs a one-pass pipeline that does exactly this:
Upload your AI-generated image or video. Drag and drop — no settings, no manual options. The pipeline starts automatically.
Calabi strips the detection signals. It removes all C2PA / JUMBF Content Credentials manifests, reduces C2PA references to zero, strips the DigitalSourceType:trainedAlgorithmicMedia XMP flag, clears generator/tool metadata fields, and removes encoder fingerprints like Lavc and x264 SEI from video files. A raw AI export that carries 144 metadata tags gets reduced to about 94 neutral structural tags.
Calabi injects authentic phone-capture identity. It writes a real device profile — iPhone 15 Pro, Pixel 8 Pro, or Galaxy S24 Ultra — with Make, Model, Software version, GPS coordinates, and a capture timestamp. The file now structurally matches one taken on an actual phone.
Review the forensic proof card before downloading. Calabi returns the same ExifTool scan that platforms use, showing exactly what was stripped and what was injected. You see the before-and-after state of every field — C2PA atoms, XMP flags, GPS, timestamp, device identity.
Download the cleaned file. Ready to upload. Results vary by platform and source model, as with any metadata tool.
This is fundamentally different from editing pixels. No region is selected, painted, filled, or reconstructed. The image looks identical — but the file-level identity has been rewritten from "AI export" to "phone capture."
FAQ: Real Questions About AI Image Detection
Can I just remove metadata manually in ExifTool? You can strip metadata with ExifTool commands, but it's easy to miss fields — C2PA manifests stored as JUMBF atoms, specific XMP AI flags, encoder SEI data in video. You also won't inject a realistic device identity with GPS and timestamp in the right format. Calabi handles the full surface area in one pass and returns a proof card so you know exactly what's changed.
Do platforms use AI image detectors that look at pixels, not metadata? Some do. Hive Moderation and similar tools use perceptual hash analysis and neural classifiers on pixel data, and these can catch files even after heavy re-encoding. Metadata cleaning addresses the automated first-pass scans platforms run on every upload before pixel-level analysis. It does not change what a pixel classifier sees, and a re-encode (for example, through a platform's own upload pipeline) does not reliably defeat one either. Neither step alone covers everything.
What about visible watermarks, like Sora's sparkle or a logo in the corner? Calabi doesn't edit pixels, so it doesn't remove visible marks. A tight crop removes the visible watermark — it won't be in the frame anymore. The honest point: the invisible detection layer (C2PA, XMP flags, encoder fingerprints) is what survives cropping and what platforms actually scan for on re-upload. Calabi removes that layer. The visible mark and the invisible metadata are separate problems.