Calabi Labs · Guide · 2026-06-15
```html
Uploading an AI-generated video to YouTube and seeing it auto-labeled "AI-generated" is the fastest way to kill reach, credibility, and ad revenue — regardless of how polished the output looks. YouTube now reads the invisible metadata layer inside every file you upload: C2PA Content Credentials, XMP tags, and encoder fingerprints. If those signals say "machine-made," YouTube applies a visible label, often within seconds of upload. The solution isn't rendering at higher resolution or tweaking your export settings — it's removing the forensic trail that platforms use to detect AI content in the first place.
YouTube's automatic detection system doesn't watch your video the way a human does. It reads the metadata — the invisible data embedded in the file structure — and cross-references it against known AI-generation fingerprints. There are four main signal categories that trigger an AI label:
The Coalition for Content Provenance and Authenticity (C2PA) defines a standard for cryptographically signed provenance manifests, which generation tools embed in their exports inside JUMBF boxes. The manifest acts as a tamper-evident record of how the file was made. If your file contains C2PA metadata indicating it was fully generative, YouTube reads that flag and applies a permanent label, even if you never disclosed it yourself. The May 2026 update made these labels more prominent and extended them to any video carrying verified C2PA manifests.
Tools like Midjourney, Sora, Runway, and Pika write specific XMP fields into exported files. The most damaging is DigitalSourceType: trainedAlgorithmicMedia, an IPTC-standardized value that explicitly signals the file was created by a trained AI model. A raw AI export can carry 144 metadata tags. One of them is enough to flag the whole file.
AI video models don't use the same encoders as phone cameras. Open-source encoders such as x264 and x265 write SEI (Supplemental Enhancement Information) messages containing their version and settings into the video bitstream, and Lavc (FFmpeg's libavcodec) writes its own encoder string. The SEI messages can be read at the bitstream level without parsing container metadata, and YouTube's detection layer reads them. A file encoded with Lavc + x264 SEI is a near-certain flag.
Authentic phone recordings carry a specific identity profile: Make (Apple/Google/Samsung), Model (iPhone 15 Pro, Pixel 8 Pro), Software version, GPS coordinates, and a capture timestamp in EXIF. AI exports have none of these. Platforms treat the absence of these fields as a weak AI signal — and when combined with positive C2PA or XMP flags, the detection is nearly certain.
Cropping or trimming removes the visible frame — a visible watermark in the corner, for example — but it does nothing to the metadata layer. The C2PA atoms, XMP tags, and encoder fingerprints are embedded at the file level, not the pixel level. Cropping a 16:9 export to 9:16 for Shorts still leaves every forensic flag intact.
Screen-recording your monitor produces a new file without the original metadata, but it introduces new problems: it downscales your video to screen resolution, adds display artifacts, and leaves the capture tool's own software and encoder tags in the result. You're trading one detection risk for another and destroying quality in the process.
Re-exporting through a video editor strips some metadata, but if the editor doesn't specifically target C2PA atoms, XMP AI tags, and bitstream SEI messages, the core signals survive. Most consumer editors (DaVinci Resolve, Premiere, CapCut) strip visible metadata but leave C2PA and Lavc fingerprints completely untouched.
Calabi runs a one-pass pipeline that strips every detection signal and injects authentic phone-capture identity, so the file reads exactly like a phone recording at the forensic level. Here's how it works:
DigitalSourceType: trainedAlgorithmicMedia XMP tag, and clears Lavc / x264 SEI encoder fingerprints from the bitstream. A raw AI export's 144 metadata tags get reduced to roughly 94 neutral structural tags.
No photo editing, no inpainting, no pixel-level work. Everything happens at the metadata and bitstream level, the layer YouTube actually scans.
Does re-encoding through HandBrake remove AI detection signals?
Partially. HandBrake and FFmpeg can strip some metadata and re-encode the video stream, which removes Lavc SEI messages. However, they don't specifically target C2PA atoms or XMP trainedAlgorithmicMedia flags, and they don't inject phone-capture identity. You'd need to manually reconstruct EXIF data with a tool like ExifTool — and mismatched or templated data is itself a detection signal.
Will Calabi work on Sora, Runway, Kling, or Pika exports?
Yes. All major AI video generators write C2PA metadata and use Lavc/x264 encoding by default. Calabi targets those specific signals regardless of which tool generated the video. Visible watermarks from those platforms — like Sora's sparkle or Runway's corner logo — aren't removed by Calabi (cropping handles those), but the invisible detection layer that survives cropping gets fully stripped.
Can YouTube still detect my video if I've removed all metadata?
Removing metadata and encoder fingerprints eliminates the primary detection path YouTube uses, though results vary with the source model and how the file was generated. Calabi takes a file from near-certain detection to carrying no automated signal, but no tool can guarantee a platform's behavior, because detection methods evolve.
Try Calabi free at calabilabs.com — 10 cleans, no card.
```
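The four signal categories described in this guide can be checked on a file before upload. The following is a minimal read-only audit sketch, not YouTube's or Calabi's actual logic. It assumes you already have a flat tag dump from a metadata reader. The key names (`DigitalSourceType`, `Make`, `GPSLatitude`, and so on) and the marker strings are illustrative assumptions, since real tools name these fields differently.

```python
from typing import Dict, List, Mapping

# Assumed names for illustration only; real metadata readers label fields differently.
AI_SOURCE_TYPE = "trainedAlgorithmicMedia"
ENCODER_MARKERS = (b"Lavc", b"x264", b"x265")
CAPTURE_FIELDS = ("Make", "Model", "Software", "GPSLatitude", "CreateDate")


def classify_tags(tags: Mapping[str, str]) -> Dict[str, bool]:
    """Report which metadata signal categories appear in a flat tag dump."""
    # Any key mentioning C2PA or JUMBF is treated as a provenance manifest.
    has_c2pa = any("c2pa" in k.lower() or "jumbf" in k.lower() for k in tags)
    # The value is often a full IPTC URI, so match on the trailing term.
    source_type = tags.get("DigitalSourceType", "")
    has_ai_xmp = AI_SOURCE_TYPE.lower() in source_type.lower()
    # True when none of the usual capture-identity fields carry a value.
    no_capture = not any(tags.get(field) for field in CAPTURE_FIELDS)
    return {
        "c2pa_manifest": has_c2pa,
        "xmp_ai_source": has_ai_xmp,
        "no_capture_identity": no_capture,
    }


def encoder_markers(data: bytes) -> List[str]:
    """Return the encoder marker strings found anywhere in raw file bytes."""
    return [m.decode() for m in ENCODER_MARKERS if m in data]
```

A plain substring scan like `encoder_markers` is deliberately crude: it does not parse SEI messages, so it can produce false positives on files that merely contain those byte sequences.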