OpenAI's new image watermarks make it easier to spot AI fakes – here's how

ZDNET’s key takeaways

OpenAI now uses C2PA metadata and SynthID watermarks.
Hidden pixel signals can help identify AI-generated images.
A public OpenAI verification tool is also rolling out.

Today, OpenAI announced what it calls content provenance signals across its image ecosystem. In other words, it’s tagging its AI-generated images as AI-generated.

Tagging by itself is not new. OpenAI and other AI toolmakers have been embedding metadata in AI-generated images since 2024. The problem was that the metadata tagging was pretty easy to defeat. What is new is that OpenAI is upping its image ID security game with some fancy new tech.

There’s a lot going on here. To help put it in perspective, we’re going to travel all the way back to 440 BC and one dude’s bad hair day.

Stega what, now?

Steganography is the practice of hiding information in plain sight, using techniques that conceal a message so that its very existence is not immediately apparent. The idea is simple: knowing someone or something is carrying a code is halfway to cracking the code.

The story comes from Herodotus of Halicarnassus, who wrote in the fifth century BCE and tells it in the books Terpsichore and Polymnia of his nine-book Histories. According to modern research, “Around 440 B.C. Histiæus shaved the head of his most trusted [assistant] and tattooed it with a message which disappeared after the hair had regrown. The purpose was to instigate a revolt against the Persians.” Apparently, this technique was used as recently as World War II.

If you’ve ever watched a TV detective show where a hidden message is revealed by reading every few letters of an otherwise ordinary note, you’ve seen a text-based example of steganography. As encryption goes, it’s weak. But if you don’t know there’s a message in the note, you might not try to decrypt it.

Steganography has been used in digital images for years to embed text information among the millions of pixels that make up a picture. This allows senders to hide messages inside images that are displayed in plain sight. It also allows creators to embed ownership and origination information into an image in a way that can be very difficult to defeat.
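To make the pixel idea concrete, here's a sketch of the classic least-significant-bit (LSB) approach, one of the simplest forms of image steganography. It's a textbook technique I'm using for illustration, not the method OpenAI or Google use, and it treats an image as a plain list of grayscale values from 0 to 255, which is a simplifying assumption.

```python
def embed(pixels, message):
    """Hide message bytes in the lowest bit of each pixel value."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message is too long for this image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the lowest bit, then set it
    return stego


def extract(pixels, n_bytes):
    """Rebuild n_bytes of hidden data from the lowest bit of each pixel."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)


# A made-up 64-pixel grayscale "image".
image = [(i * 37) % 256 for i in range(64)]
stego = embed(image, b"AI")
recovered = extract(stego, 2)
biggest_change = max(abs(a - b) for a, b in zip(image, stego))
```

No pixel changes by more than one brightness level, so the picture looks the same to the eye. The catch is that plain LSB hiding is fragile: recompressing or resizing the image scrambles those low bits, which is why more robust schemes spread the signal differently.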

We’ll come back to steganography in a moment because it’s key to today’s OpenAI announcement.

But first, let’s go back to the future, but not all the way. Our next stop is 2024.

Show me the metadata

OpenAI has been embedding metadata in images generated by DALL-E 3, ImageGen, and Sora since 2024. You can use a tool like Content Credentials to examine that data. Google’s Nano Banana and other image-generating AI tools also embed some metadata in their images.

As a test, I ran one image generated by ChatGPT and one generated by Nano Banana through Content Credentials. In both cases, the metadata was properly available and Content Credentials could display the data.

On the other hand, when I took a screenshot of each image, which captured the pixels but not the underlying metadata, Content Credentials merely reported “Something went wrong.” The image capture completely eliminated the metadata associated with the original image file.

This, among other things, is what OpenAI and Google are trying to fix.
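The screenshot failure makes sense once you picture the metadata as something attached to the file rather than to the pixels. Here's a toy Python model of that. The ImageFile class, the "provenance" key, and the helper names are all hypothetical, and real image formats and C2PA manifests are far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ImageFile:
    pixels: list                                   # what you see on screen
    metadata: dict = field(default_factory=dict)   # travels with the file only


def screenshot(image):
    """Capture what is on screen: the pixels, written into a brand-new file."""
    return ImageFile(pixels=list(image.pixels))


def read_credentials(image):
    """Return the provenance entry, or None if nothing is attached."""
    return image.metadata.get("provenance")


original = ImageFile(
    pixels=[12, 200, 37, 90],
    metadata={"provenance": "generated by an AI tool"},
)
capture = screenshot(original)
print(read_credentials(original))  # the tag is there
print(read_credentials(capture))   # nothing to read
```

The capture looks identical, but the tag never made the trip. Any fix has to put the signal in the pixels themselves.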

According to OpenAI, “We’ve been building toward this for some time. We have used visible watermarks in Sora and an audio watermark in Voice Engine, and have continued to test and research accuracy and reliability over time, through deployment.”

Standard metadata formatting

OpenAI says, “We recently took the step of making OpenAI a C2PA Conforming Generator Product. By becoming C2PA conformant, we are giving platforms a trusted way to read, preserve, and pass along the provenance information we attach to our content.”

Let’s unpack that. C2PA is the Coalition for Content Provenance and Authenticity. It has a C2PA Conformance Program, which “provides assurance that products adhere to the Content Credentials specification, and fulfill a set of security requirements to ensure they are producing and validating C2PA data correctly.”

In other words, the content metadata is standardized, secure, and contains enough information to make it useful. OpenAI is doing this for all its image offerings. Its PR rep told me, “all images generated by ChatGPT and OpenAI (including the OpenAI API and Codex) contain these provenance signals.”

Signals. Plural. That brings me to the big hammer of this announcement.

Hidden digital watermarks

Google DeepMind’s SynthID is a multimodal digital watermarking mechanism that embeds invisible digital watermarks in text, images, video, and audio. This is some snazzy tech. Interestingly, given that Google and OpenAI are arch-competitors, OpenAI is now incorporating SynthID technology in all the images the company generates.

For images, SynthID is pixel-based. A subtle, steganography-like signal is embedded into images right when they’re generated. The identity data is imperceptible to the human eye, but detection tools can read it. The watermark is designed to remain in the image even after resizing, cropping, compression, and color adjustments, and it transfers to screenshots. The signal is baked into the entire image, rather than just showing up in a small area of the image.
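SynthID's actual algorithm isn't described in the announcement, so I can't show it. What I can show is a much older idea in the same spirit, often called a spread-spectrum watermark: nudge every pixel up or down following a secret pseudo-random pattern, then detect the pattern by correlation. The sketch below is a toy under my own assumptions (a flat list of grayscale values, a made-up key, and hand-picked strength and threshold numbers). Unlike what SynthID claims, it would not survive cropping or resizing.

```python
import random

STRENGTH = 2      # how far each pixel is nudged (assumed value)
THRESHOLD = 1.0   # detection cutoff, halfway between 0 and STRENGTH


def pattern(key, n):
    """Secret sequence of +1/-1 values, reproducible from the key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(pixels, key):
    """Nudge every pixel by the keyed pattern, spreading the mark everywhere."""
    return [p + STRENGTH * s for p, s in zip(pixels, pattern(key, len(pixels)))]


def score(pixels, key):
    """Correlate the image with the keyed pattern."""
    mean = sum(pixels) / len(pixels)
    signs = pattern(key, len(pixels))
    return sum((p - mean) * s for p, s in zip(pixels, signs)) / len(pixels)


def detect(pixels, key):
    return score(pixels, key) > THRESHOLD


rng = random.Random(1)
original = [rng.randint(100, 155) for _ in range(16384)]  # made-up image
marked = embed(original, key=42)
noisy = [p + rng.randint(-3, 3) for p in marked]          # crude stand-in for compression
```

In this toy, detect(marked, key=42) and detect(noisy, key=42) both report the watermark, while the untouched original and a wrong key do not. Because the mark lives in every pixel, a bit of noise doesn't erase it.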

So even though Nano Banana puts its little diamond in the corner of images it generates, it also embeds a much more comprehensive signal throughout the entire image.

There’s one additional fascinatingly powerful aspect of SynthID that OpenAI didn’t mention: SynthID can watermark text, apparently without affecting the quality of the text. It very subtly biases which tokens are chosen as the text is generated, so that the output carries a statistical signature that detector software can identify. This capability has not been announced by OpenAI and is therefore probably not used in ChatGPT, but it is used in Gemini.
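To give a feel for how token choice can carry a signature, here's a toy sketch of a "green list" text watermark, a scheme from the research literature that is different from SynthID's own method. Everything in it is an assumption for illustration: the fake vocabulary, the key, and the way four random candidates stand in for a language model's top choices.

```python
import hashlib
import random


def is_green(prev, token, key):
    """Keyed hash splits the vocabulary roughly in half for each context."""
    digest = hashlib.sha256(f"{key}|{prev}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(vocab, length, key, rng, watermark=True):
    """Pick one token per step; the watermarked writer prefers green tokens."""
    tokens = ["<s>"]
    for _ in range(length):
        candidates = rng.sample(vocab, 4)  # stand-in for a model's top choices
        choice = candidates[0]
        if watermark:
            green = [t for t in candidates if is_green(tokens[-1], t, key)]
            if green:
                choice = green[0]
        tokens.append(choice)
    return tokens[1:]


def green_fraction(tokens, key):
    """Detector: the share of tokens that land on the green list."""
    prev, hits = "<s>", 0
    for t in tokens:
        hits += is_green(prev, t, key)
        prev = t
    return hits / len(tokens)


vocab = [f"w{i}" for i in range(50)]   # hypothetical vocabulary
rng = random.Random(0)
marked = generate(vocab, 200, "secret", rng, watermark=True)
plain = generate(vocab, 200, "secret", rng, watermark=False)
print(green_fraction(marked, "secret"), green_fraction(plain, "secret"))
```

Ordinary text lands on the green list about half the time, while the watermarked text lands there far more often. No single word gives the game away, but the statistics over a few hundred tokens do.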

As with C2PA, OpenAI is embedding SynthID into images generated through ChatGPT, Codex, and the OpenAI API.

New public verification tool

Concurrent with the announcement of the C2PA compliance and SynthID capabilities, OpenAI is announcing the availability of a public verification tool you can use to see if something was generated by one of OpenAI’s AI tools.

I’m writing this the night before the official announcement goes public. By the time you read this article, you should be able to test the tool at https://openai.com/research/verify/.

I’m very curious about the limits of this tool and also how well it works in concert with SynthID. What happens, for example, if you pull part of an image from ChatGPT and use it with a real photograph as part of a Photoshop composition? Does it report how much was AI tagged? We’ll check back on this with real-world tests at some point after the tool is released.

According to OpenAI, “No single provenance technique is enough on its own. We believe a strong approach combines shared standards, durable watermarking signals, and public verification. By building on our long-standing support for Content Credentials, becoming conformant with C2PA, adopting SynthID, and previewing public verification tooling, we hope to contribute in the long run to a more interoperable provenance ecosystem.”

Would you check an image’s provenance if a detection tool made it easy? Let us know in the comments below.
