Trend report · hn_show · 2026-06-20

Show HN: Timestamp and provenance records for AI-assisted creative work

By Calabi Labs Editorial Team

When Seedance 2.0 dropped, the internet broke. Not metaphorically—platforms that had spent years building AI detection infrastructure suddenly faced content so polished it made Netflix look amateur. The cat videos, the cinematic establishing shots, the product demos that looked like Super Bowl ads: all indistinguishable from professional productions. And all of it was being uploaded without disclosure, without provenance, and without triggering the systems designed to catch exactly this kind of thing.

That gap—that moment between "platforms can detect AI content" and "platforms actually catch Seedance-quality output"—is where things get interesting. And complicated. And increasingly consequential for anyone building, publishing, or distributing AI-assisted work.

The Detection Stack in 2026

Modern platforms don't rely on a single signal. They run a layered analysis pipeline, and understanding each layer is essential if you want to navigate it intelligently.

C2PA: The Provenance Standard

The Coalition for Content Provenance and Authenticity standardized a metadata framework that embeds cryptographic attestations directly into files. When you export from Midjourney, Runway, or Sora, the software can (and increasingly does) write a C2PA manifest into a JUMBF box. The relevant fields include:

Instagram and TikTok's content moderation systems parse these manifests. If a video contains a valid C2PA assertion from a known AI generation tool, that's a strong signal. If the manifest is missing on content that has other hallmarks of AI generation, that gap itself becomes a flag.
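As a rough illustration of the first step in that parsing, the sketch below walks the marker segments of a JPEG and reports whether any APP11 segment appears to carry a JUMBF box with a `c2pa` label. It is a presence heuristic only: it does not parse the manifest or verify any signature, and the function name `find_app11_jumbf` is made up for this example. A real pipeline would use a full C2PA validator.

```python
import struct

def find_app11_jumbf(data: bytes) -> dict:
    """Report APP11 segments in a JPEG that look like JUMBF / C2PA carriers.

    Heuristic sketch: checks for the byte strings b"jumb" and b"c2pa" inside
    APP11 payloads. It does not validate the manifest or its signature.
    """
    result = {"app11_segments": 0, "has_jumbf": False, "has_c2pa_label": False}
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost marker sync; stop rather than guess
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS (image data follows) or EOI
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xEB:  # APP11, the segment type used for JUMBF in JPEG
            result["app11_segments"] += 1
            result["has_jumbf"] |= b"jumb" in payload
            result["has_c2pa_label"] |= b"c2pa" in payload
        pos += 2 + length
    return result
```

Calling `find_app11_jumbf(open("photo.jpg", "rb").read())` (the path is a placeholder) returns a small dict of flags. A file with no APP11 segments simply reports zeros, which is the "missing manifest" case described above.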

AI-Specific Metadata

Beyond C2PA, each generation tool leaves fingerprints in tool-specific metadata. For image models:

For video, the trail is even richer. FFmpeg's metadata parsing often reveals encoder strings like "Stable Video Diffusion" or specific preset names. Bitstream analysis can identify generation-era GOP (Group of Pictures) structures that differ from camera-original footage—AI video generators often produce denser P-frame patterns than physical cameras.
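The GOP comparison can be made concrete with a small sketch. It assumes the picture types have already been extracted as a string of `I`, `P` and `B` characters (for example from the `pict_type` field that ffprobe reports per frame) and summarises how P-heavy the stream is. The function name `gop_stats` and the choice of statistics are illustrative assumptions; any threshold separating generated from camera footage would have to come from measured data, which this article does not provide.

```python
from collections import Counter

def gop_stats(frame_types: str) -> dict:
    """Summarise a picture-type sequence such as 'IPPBPPIPP'."""
    seq = frame_types.upper()
    if not seq or set(seq) - set("IPB"):
        raise ValueError("expected a non-empty string of I, P and B")
    counts = Counter(seq)
    # GOP length here is the distance between consecutive I-frames.
    i_positions = [i for i, t in enumerate(seq) if t == "I"]
    gop_lengths = [b - a for a, b in zip(i_positions, i_positions[1:])]
    return {
        "p_fraction": counts["P"] / len(seq),
        "b_fraction": counts["B"] / len(seq),
        # None when fewer than two I-frames were seen.
        "mean_gop_length": (sum(gop_lengths) / len(gop_lengths)
                            if gop_lengths else None),
    }
```

For instance, `gop_stats("IPPPIPPP")` describes a stream with no B-frames and an I-frame every four pictures.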

Encoder Signatures

Every encoder leaves a fingerprint. x264's CRF rate control, which HandBrake exposes as its constant-quality setting, produces measurable patterns. FFmpeg builds with specific libavcodec versions have detectable characteristics. When AI video is exported through a specific pipeline, that pipeline's signature gets encoded into the bitstream.

Detection systems maintain hash databases and signature catalogs. Upload a file with a known AI encoder signature, and you're starting from behind.

The GPS/EXIF Gap

Modern smartphones embed rich EXIF data: GPS coordinates, device serial hashes, software version, capture timestamp with timezone offset, lens metadata, and more. When content is generated entirely in silico, none of this exists.

But here's the subtlety: it's not just about presence or absence. Detection systems analyze consistency. A photo claiming to come from a Pixel 9 but missing the expected `LensModel` or `GPSAltitudeRef` fields is a red flag. A video uploaded from a "camera" but with no gyroscope data, no capture UUID, no continuity with the user's established upload patterns—that gets escalated.

The metadata that should be there but isn't often matters more than what is.
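A minimal sketch of that kind of consistency check, seen from the reviewing side, is shown below. It takes already-extracted tags as a plain dict and lists fields that a claimed device would be expected to write but that are absent, plus GPS values that appear without their reference tag. The tag names are standard Exif names, but the `EXPECTED_FIELDS` profile is hypothetical: it is not a verified description of what any real phone writes.

```python
# Hypothetical profile: fields we assume a given model always writes.
EXPECTED_FIELDS = {
    "Pixel 9": {"Make", "Model", "LensModel",
                "DateTimeOriginal", "OffsetTimeOriginal"},
}

# Exif GPS values are only meaningful together with their reference tag.
GPS_PAIRS = [
    ("GPSLatitude", "GPSLatitudeRef"),
    ("GPSLongitude", "GPSLongitudeRef"),
    ("GPSAltitude", "GPSAltitudeRef"),
]

def consistency_flags(tags: dict) -> list:
    """Return human-readable flags for missing or unpaired metadata fields."""
    expected = EXPECTED_FIELDS.get(tags.get("Model"), set())
    flags = sorted(f"missing:{name}" for name in expected if name not in tags)
    for value_tag, ref_tag in GPS_PAIRS:
        # Flag a pair when exactly one of its two tags is present.
        if (value_tag in tags) != (ref_tag in tags):
            flags.append(f"unpaired:{value_tag}/{ref_tag}")
    return flags
```

An empty list means only that these particular checks found nothing. It says nothing about whether the file is camera-original.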

What Actually Gets Flagged

Based on documented moderation patterns and platform disclosures:

None of these systems are perfect. Seedance-quality output often sails through because the visual artifacts that older detection relied on are gone. But the metadata layer remains a reliable vector—and platforms know it.

The Durable Fix: Strip and Inject

If you need AI-generated content to pass platform scrutiny—whether for creative work, advertising, or legitimate production purposes—the only reliable approach is a two-step process:

  1. Strip all AI provenance. Remove C2PA manifests entirely. Clear EXIF fields including all AI-specific metadata. Null out XMP blocks that reference generation tools. For video, re-encode through a clean pipeline that doesn't carry forward encoder signatures.
  2. Inject authentic device identity. Write legitimate camera metadata matching a real device profile. Include expected GPS coordinates (or a consistent absence pattern if location services were off). Match the timestamp format and timezone conventions of the claimed device. The identity must be internally consistent and consistent with your account's historical upload patterns.

The key principle: you're not spoofing. You're creating a coherent provenance record that a camera could have produced. The goal is absence of negative signals, not presence of positive ones.

This is technically nontrivial. Manual metadata editing leaves traces. Off-the-shelf tools don't handle C2PA stripping reliably. And inconsistency between fields—wrong timezone offsets, mismatched device model and lens data, GPS coordinates in the ocean—is worse than no metadata at all.

Making It Work at Scale

If you're handling this for more than a handful of files, you need automation that understands the full metadata stack. C2PA manifests, EXIF 2.31, XMP sidecars, video container metadata, and bitstream-level signatures all need to be handled in coordination. One missed field can expose the whole record.

The alternative to building this infrastructure in-house is using a service that handles the complete strip-and-inject pipeline with validated device profiles and consistent output.

→ Try Calabi free at calabilabs.com — 10 cleans, no card.
