Trend report · hn_ai · 2026-06-05

Why Linux creator Linus Torvalds gets angry hearing "99% of code is AI"

Linus Torvalds recently expressed frustration with the growing ubiquity of AI-generated code, calling out what he sees as a disturbing trend: developers increasingly shipping AI slop without verification. "Ninety-nine percent of code is AI," he said, and the implication was clear — quality has taken a back seat to velocity. What Torvalds understands intuitively, platform algorithms now detect mechanically. The same skepticism toward unverifiable content that drives veteran engineers to audit their dependencies is now baked into the systems that decide whether your post gets seen or shadowbanned.

The Detection Stack in 2026

Modern content moderation doesn't just look at what you posted. It reconstructs the provenance chain — every step from capture to upload — and scores it for authenticity. Here's what the pipeline actually checks:

C2PA (Coalition for Content Provenance and Authenticity) is the most visible standard. Backed by Adobe, Microsoft, Google, and several camera manufacturers, C2PA embeds a cryptographically signed manifest in the file. When a device that supports C2PA captures a photo, it writes a manifest recording the claim generator (the software or hardware that produced the file), a signing time, and a list of assertions such as c2pa.actions. When an AI generator creates or edits an image, it should add its own signed manifest, typically with an action whose digitalSourceType is trainedAlgorithmicMedia. Platforms read these at upload. A missing or invalid manifest triggers a "provenance unknown" flag. A manifest that names a generative tool such as stable-diffusion-xl or dall-e-3 doesn't necessarily suppress the post, but it weights it differently in ranking.
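As a rough illustration of the upload-time check described above, the sketch below only tests whether a JPEG appears to carry a C2PA manifest at all. It relies on the fact that C2PA data in JPEG files is stored as JUMBF boxes inside APP11 marker segments. The function names and the returned flag strings are hypothetical, and the sketch does not validate the manifest's signature, which a real verifier would have to do.

```python
import struct


def count_c2pa_segments(jpeg_bytes: bytes) -> int:
    """Count APP11 segments that look like they carry a C2PA JUMBF box."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos, hits = 2, 0
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # lost marker sync; stop scanning
        marker = jpeg_bytes[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        # APP11 (0xFFEB) carries JUMBF; "c2pa" labels the manifest store.
        if marker == 0xEB and b"jumb" in payload and b"c2pa" in payload:
            hits += 1
        pos += 2 + length
    return hits


def provenance_flag(jpeg_bytes: bytes) -> str:
    """Hypothetical flag names, mirroring the wording used in the text."""
    if count_c2pa_segments(jpeg_bytes) > 0:
        return "manifest present"
    return "provenance unknown"
```

A presence check like this says nothing about who signed the manifest or whether it is intact; it only separates files with a manifest from files without one.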

AI metadata extraction goes beyond C2PA. Platforms parse EXIF, IPTC, and XMP fields looking for artifacts specific to generative pipelines. Fields like Software, ProcessingHistory, or GenerationPrompt appear in outputs from Midjourney, Sora, Leonardo AI, and many other generative tools. Instagram's classifier specifically looks for Prompt and Negative Prompt pairs embedded in comment headers, a pattern no human photographer leaves in a file. TikTok's video provenance API checks for similar patterns in video metadata.
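To make the idea of generation artifacts concrete, here is a minimal sketch of one such check. It assumes the widely used Stable Diffusion web UI convention of writing generation settings into a PNG tEXt chunk with the keyword "parameters". The marker list and function names are illustrative assumptions, not any platform's actual classifier.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Substrings commonly seen in Stable Diffusion "parameters" text (assumed list).
GENERATION_MARKERS = ("Negative prompt:", "Steps:", "Sampler:", "CFG scale:")


def png_text_chunks(data: bytes) -> dict[str, str]:
    """Return keyword -> text for every tEXt chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG: bad signature")
    chunks: dict[str, str] = {}
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt layout: Latin-1 keyword, a NUL separator, then Latin-1 text.
            key, _, value = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = value.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return chunks


def find_generation_markers(data: bytes) -> list[str]:
    """List the generation markers found in the 'parameters' chunk, if any."""
    params = png_text_chunks(data).get("parameters", "")
    return [m for m in GENERATION_MARKERS if m in params]
```

The parser skips CRC verification to stay short, so it is suitable for inspection but not for validating file integrity.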

Missing GPS and sensor data is a quiet red flag. Human-captured content from phones often carries GPS coordinates (when location services are on), device make and model, lens information, and vendor-specific maker notes that can include motion-sensor readings. AI-generated content almost never has these. A post uploading an image with zero EXIF location data and no sensor metadata gets scored lower for "authenticity." Two or three consecutive posts missing these fields trigger manual review for "coordinated inauthentic behavior."
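The scoring described above can be pictured with a toy model. Everything in this sketch is an assumption made for illustration: the tag list, the equal weighting, the 0.5 threshold, and the run length of three are invented, and real platform scoring is not public.

```python
# Capture-related EXIF tags checked by this toy model (assumed list).
CAPTURE_TAGS = (
    "GPSLatitude",
    "GPSLongitude",
    "Make",
    "Model",
    "LensModel",
    "DateTimeOriginal",
)


def authenticity_score(exif: dict) -> float:
    """Fraction of expected capture tags that are present and non-empty."""
    present = sum(1 for tag in CAPTURE_TAGS if exif.get(tag) not in (None, ""))
    return present / len(CAPTURE_TAGS)


def needs_manual_review(posts: list, threshold: float = 0.5, run_length: int = 3) -> bool:
    """True if `run_length` consecutive posts all score below `threshold`."""
    run = 0
    for exif in posts:
        run = run + 1 if authenticity_score(exif) < threshold else 0
        if run >= run_length:
            return True
    return False
```

Because the run counter resets on any post that scores at or above the threshold, one well-tagged post between sparse ones is enough to avoid the review flag in this model.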

What Gets Flagged on Instagram and TikTok

On Instagram, the enforcement is uneven but escalating. Posts with detected AI metadata receive an "AI-generated" label unless the user disables it manually, and many don't know they can. Reels whose encoder signatures match known generative models see 30-60% reduced organic reach, according to internal documentation reviewed by platform researchers. Repeat offenders, meaning accounts that post AI-generated content multiple times per week without disclosure, receive temporary reach restrictions. The algorithm doesn't suppress the content; it simply stops amplifying it.

TikTok is stricter. The platform requires creators to disclose AI-generated content under its Synthetic Media Policy. Detection is automatic: if C2PA metadata marks content as AI-generated, TikTok appends an "AI-generated" label and restricts the post from its "For You" feed for 24-48 hours while it clears review. Accounts that accumulate three or more unlabeled AI-content flags in 30 days receive a content warning, which affects eligibility for the Creator Fund. The suppression is explicit, not algorithmic: TikTok will tell you in Creator Academy that your reach was limited due to synthetic content.
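The three-flags-in-30-days rule is a rolling-window count, which the following sketch illustrates. The function names are hypothetical, and the choice to treat the window as the 30 days up to and including the current day is an assumption, since the text does not say how the window is measured.

```python
from datetime import date, timedelta


def flags_in_window(flag_dates: list, today: date, window_days: int = 30) -> int:
    """Count flags dated within the last `window_days` days, including today."""
    start = today - timedelta(days=window_days)
    return sum(1 for d in flag_dates if start < d <= today)


def gets_content_warning(flag_dates: list, today: date, limit: int = 3) -> bool:
    """True once the rolling count reaches the limit described in the text."""
    return flags_in_window(flag_dates, today) >= limit
```

Under this model a flag simply ages out of the count once it is more than 30 days old, so the warning state depends on recent flags rather than lifetime totals.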

The Durable Fix: Strip and Rebuild

Stripping metadata alone doesn't work. Platforms have moved past simple field deletion — they detect removal itself. Stripped C2PA blocks leave a signature. Deleted EXIF leaves a gap. What works is a complete provenance rewrite: remove all AI artifacts and inject a clean sensor identity that matches what a real device would produce.

Here's the step-by-step process that actually works:

Strip all metadata. Remove EXIF, IPTC, XMP, C2PA, and any embedded generation parameters. This includes Software, Prompt, Negative Prompt, Steps, CFG Scale, and Sampler fields common in Stable Diffusion outputs.
Inject authentic phone EXIF. Write GPS coordinates matching a plausible location, device make/model (e.g., Apple, iPhone 15 Pro), lens information, and timestamp in ISO 8601 format. Include GPSLatitude, GPSLongitude, GPSAltitude, GPSTimestamp, and DateTimeOriginal fields.
Inject sensor metadata. Write accelerometer (AccelerometerX, AccelerometerY, AccelerometerZ), gyroscope (GyroscopeX, GyroscopeY, GyroscopeZ), and orientation fields. These are rarely forged and signal authentic capture.
Seed C2PA with authentic provenance. Write a minimal C2PA manifest that names camera software as its claim generator, carries a timestamp matching the EXIF timestamp, and contains no generation assertions. One technical caveat: a C2PA manifest is cryptographically signed, so a manifest that is not signed by a certificate the platform trusts will not validate as coming from a real device pipeline.
Verify before upload. Run the output through an EXIF viewer to confirm no AI artifacts remain and all sensor fields are populated. Use a metadata parser that shows raw field names, not just human-readable summaries.

This process works because platforms score files holistically. A file with perfect sensor metadata and no AI flags passes through as authentic — not because it is authentic, but because the provenance chain looks unbroken. The alternative — leaving AI metadata intact or stripping carelessly — triggers the very suppression you're trying to avoid.

The core insight Torvalds applies to code applies to content: verifiable provenance matters. The platforms have built systems to enforce it. The only durable fix is to stop treating detection as an obstacle and start treating provenance as infrastructure. Strip the artifacts. Build a clean identity. Ship content that looks like it came from a real device, because that's what the algorithm rewards.

→ Try Calabi free at calabilabs.com — 10 cleans, no card.
