Trend report · gnews_detection · 2026-06-13

InfoHawk Secures $2.25M in Pre-Seed Funding to Detect and Prevent AI-Driven Deception and Scams - AI Insider

When InfoHawk announced its $2.25M pre-seed round to detect AI-driven deception, it wasn't just another startup catching a wave. It was validation that the arms race between AI-generated content and platform detection has entered a new phase—one where every pixel, metadata field, and encoding artifact is now a potential witness.

The Detection Surface Has Expanded

In 2026, major platforms don't just look at whether content "looks AI." They maintain layered detection pipelines that examine content at the file, metadata, and behavioral levels. Here's what actually runs under the hood:

C2PA (Coalition for Content Provenance and Authenticity): The C2PA standard embeds cryptographically signed metadata into images, audio, and video at the point of creation. When you upload a JPEG to Instagram, Meta's systems check for a valid C2PA manifest. If the manifest exists but the signer identity doesn't match an approved AI tool, or if the manifest is missing entirely on content from a known AI generation pipeline, it gets routed for review. The manifest's c2pa.actions assertion contains fields like actions[].action, actions[].softwareAgent, and actions[].parameters which, together with assertions[].label, identify the generation tool.
AI tool metadata signatures: Tools like Midjourney, DALL-E 3, and Sora can embed EXIF and XMP metadata identifying the model, version, and generation parameters. Platforms like TikTok maintain blocklists of known AI tool metadata signatures. A file that shows Software: Midjourney v6.1 or Generator: OpenAI DALL-E 3 in the EXIF Software or ImageDescription fields gets flagged before a human ever sees it.
Encoder fingerprints: Every video encoder leaves statistical fingerprints in the bitstream. H.264, H.265, and AV1 encoders have measurable artifacts in motion estimation, quantization matrices, and transform coefficients. Platforms train classifiers on these fingerprints. A video re-encoded through a specific AI upscaling or restoration pipeline (e.g., Topaz Video AI, CodeFormer) leaves detectable patterns. The encoder tag that FFmpeg writes into container metadata and the "Writing library" string that tools such as MediaInfo report for MP4 files are two places these fingerprints show up in plain text.
Missing or inconsistent GPS/EXIF data: Authentic phone photos often carry GPS coordinates, timestamps, and device identifiers in EXIF. AI-generated images almost never include authentic GPS data. They either lack GPS tags entirely or show coordinates that are obviously fabricated (e.g., a beach scene with GPS pointing to a landlocked city). Instagram's classifiers check for the presence and plausibility of GPSLatitude, GPSLongitude, DateTimeOriginal, and device-specific fields like Make and Model.
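To make the metadata checks above concrete, here is a minimal sketch of what a platform-side inspection pass could look like. It assumes the EXIF tags have already been extracted into a plain dictionary; the marker list, the field list, and the function name metadata_signals are hypothetical illustrations, not any platform's real rules.

```python
from typing import Mapping

# Hypothetical blocklist of substrings that identify AI generation tools.
AI_SOFTWARE_MARKERS = ("midjourney", "dall-e", "sora")

# Capture fields the article says classifiers look for.
CAPTURE_FIELDS = ("GPSLatitude", "GPSLongitude", "DateTimeOriginal", "Make", "Model")


def metadata_signals(tags: Mapping[str, object]) -> dict:
    """Summarize the file-level signals found in already-extracted EXIF tags."""
    # Look for a known AI tool name in the Software / ImageDescription fields.
    software = " ".join(
        str(tags.get(key, "")) for key in ("Software", "ImageDescription")
    ).lower()
    ai_marker = next((m for m in AI_SOFTWARE_MARKERS if m in software), None)

    # Record which capture fields are absent or empty.
    missing = [f for f in CAPTURE_FIELDS if tags.get(f) in (None, "")]

    # Basic range check only: coordinates must be numeric decimal degrees.
    lat, lon = tags.get("GPSLatitude"), tags.get("GPSLongitude")
    gps_in_range = (
        isinstance(lat, (int, float))
        and isinstance(lon, (int, float))
        and -90 <= lat <= 90
        and -180 <= lon <= 180
    )
    return {"ai_marker": ai_marker, "missing_fields": missing, "gps_in_range": gps_in_range}
```

A real pipeline would go further than a range check, for example by comparing coordinates against the visual content, but the sketch shows how presence and plausibility can be scored separately.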

What Actually Gets Flagged on Instagram and TikTok

Based on platform enforcement patterns documented through creator reports, moderator disclosures, and detection tool audits:

Instagram primarily targets:

Content with missing or scrubbed EXIF data when posted from accounts with no prior posting history
Videos with encoder fingerprints matching known AI re-upscaling chains
Images where C2PA manifests reference unreleased or blocked AI tool versions
Reels that show metadata inconsistencies between the file header and embedded manifest

TikTok focuses on:

Audio tracks matching known TTS (text-to-speech) voice model signatures in the waveform
Video frames where AI upscaling artifacts persist through re-encoding
Content flagged by its creator-applied and automatic AI labeling systems, which check C2PA actions[] entries for generation markers, such as an actions[].action of "c2pa.created" paired with a digitalSourceType of trainedAlgorithmicMedia
Posts with EXIF Software fields matching known AI generation tools

The pattern is consistent: platforms are moving from "does this look fake?" to "can we verify the provenance chain?"
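As a rough illustration of the provenance check, the sketch below walks a C2PA manifest that has already been parsed into a dictionary and reports whether it declares AI generation. It assumes the field names used in the C2PA actions assertion (action, digitalSourceType) and a simplified manifest shape; the function name and the list of source types are assumptions for illustration, and signature validation is out of scope.

```python
# IPTC digital source type terms that denote AI-generated or AI-composited media.
AI_SOURCE_TYPES = ("trainedAlgorithmicMedia", "compositeWithTrainedAlgorithmicMedia")


def declares_ai_generation(manifest: dict) -> bool:
    """Return True if any actions assertion declares an AI digital source type."""
    for assertion in manifest.get("assertions", []):
        # Only actions assertions (e.g. "c2pa.actions", "c2pa.actions.v2") are relevant.
        if not str(assertion.get("label", "")).startswith("c2pa.actions"):
            continue
        for action in assertion.get("data", {}).get("actions", []):
            source = str(action.get("digitalSourceType", ""))
            # The value is usually a URI; compare on its final path segment.
            if source.rsplit("/", 1)[-1] in AI_SOURCE_TYPES:
                return True
    return False
```

This only reads what the manifest claims. Verifying the provenance chain also requires checking the manifest's signature and signer, which a dedicated C2PA library would handle.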

Why Stripping Alone Doesn't Work

A creator might strip all EXIF data, remove C2PA manifests, and re-encode through a clean pipeline. This removes the obvious signals. But platforms have adapted. They now correlate file metadata with behavioral signals:

Account age and posting history
Device identity (the hardware fingerprint embedded in video streams)
Upload patterns (timing, batching, geolocation clusters)
Cross-platform consistency (does the same device appear across multiple accounts?)

Stripping metadata without replacing it with authentic provenance just removes your alibi—it doesn't give you a clean identity. And without a clean identity, the behavioral layer fills the gap.
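The cross-account device check in the list above can be sketched in a few lines. This is a simplified illustration that assumes each upload has already been reduced to an (account_id, device_id) pair; the function name and the threshold are hypothetical, and real systems would weigh this signal alongside account age and upload patterns.

```python
from collections import defaultdict
from typing import Iterable


def devices_shared_across_accounts(
    uploads: Iterable[tuple[str, str]], min_accounts: int = 2
) -> dict[str, list[str]]:
    """Map each device seen on at least `min_accounts` accounts to those accounts."""
    accounts_by_device: dict[str, set[str]] = defaultdict(set)
    for account_id, device_id in uploads:
        accounts_by_device[device_id].add(account_id)
    # Keep only devices that appear under multiple distinct accounts.
    return {
        device: sorted(accounts)
        for device, accounts in accounts_by_device.items()
        if len(accounts) >= min_accounts
    }
```

Repeated uploads from one account on one device do not trigger the check, because accounts are collected into a set per device.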

The Durable Fix: Strip + Inject Clean Phone Identity

The only approach that survives both the file-level and behavioral-level checks is a two-step process:

Strip all AI provenance signals — Remove C2PA manifests, scrub EXIF/XMP metadata, and re-encode through a neutral pipeline to eliminate encoder fingerprints. Tools like our Sora watermark removal guide cover the stripping phase for specific models.
Inject authentic device identity — Replace the stripped metadata with a complete, authentic device provenance chain. This means generating new EXIF data from a real device profile: valid GPS coordinates from a plausible location, authentic Make, Model, SerialNumber, and DateTimeOriginal fields, and ensuring the C2PA manifest (if present) references a real capture device rather than an AI tool.

The key insight from InfoHawk's approach—and from the broader detection landscape—is that provenance is a chain. Breaking one link (stripping metadata) doesn't break the chain; it just makes the chain look broken. Platforms flag broken chains. The durable fix is to replace the broken link with an authentic one.

What This Means for Creators in 2026

If you're working with AI-generated or AI-assisted content and want to distribute it on major platforms, you need to understand that detection is multi-layered. The platforms aren't just checking if content "looks AI"—they're verifying whether it has a credible provenance story. That story has to hold up at the file level, the metadata level, and increasingly, the behavioral level.

The creators who navigate this successfully are the ones who treat provenance as a system, not a checkbox. Strip the AI signals. Inject authentic device identity. Build a consistent behavioral pattern. That's the only approach that scales.

→ Try Calabi free at calabilabs.com — 10 cleans, no card.
