OpenAI's new image watermarks make it easier to spot AI fakes - here's how

May 24, 2026  Twila Rosenbaum

OpenAI has taken a significant step forward in the fight against AI-generated misinformation by introducing robust content provenance signals across its entire image ecosystem. The company now embeds two layers of verification in images created by ChatGPT, the OpenAI API, and Codex: standard C2PA metadata and Google DeepMind's SynthID digital watermarks. Together, these techniques make it far more difficult for malicious actors to pass off AI-generated images as authentic photographs.

The announcement marks a shift from earlier, easily removable metadata to a durable, pixel-level watermark that persists even after screenshots, cropping, or compression. This evolution is critical as AI-generated images become increasingly realistic and harder to distinguish from real photos.

The Problem with Traditional Metadata

Since 2024, OpenAI and other providers of AI image generators have embedded metadata tags: machine-readable information stored in the file's header. Tools like Content Credentials could read these tags to confirm an image's AI origin. However, this approach had a major weakness: metadata is fragile. Taking a screenshot produces a new file that carries none of the original metadata, and resaving the image in a different format or uploading it to social media platforms that recompress images often discards it as well.

For example, an image generated by ChatGPT's DALL-E 3 might contain proper metadata when downloaded directly. But if someone takes a screenshot of that image on their phone and shares it, the metadata is gone. The screenshot captures only the pixel data, not the embedded file information. This loophole allowed AI-generated fakes to circulate without any provenance clue.
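
The loophole is easy to reproduce. The sketch below is a minimal illustration using only Python's standard library: it builds a tiny grayscale PNG with a text metadata chunk, then simulates a screenshot by copying only the pixels into a new file. The `Source` tag and its value are hypothetical stand-ins, and a plain `tEXt` chunk is used for simplicity rather than the actual container that Content Credentials use.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def build_png(width: int, height: int, rows: list, text: dict) -> bytes:
    """Build an 8-bit grayscale PNG with optional tEXt metadata chunks."""
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    raw = b"".join(b"\x00" + row for row in rows)  # filter type 0 on each scanline
    out = PNG_SIGNATURE + chunk(b"IHDR", ihdr)
    for key, value in text.items():
        out += chunk(b"tEXt", key.encode("latin-1") + b"\x00" + value.encode("latin-1"))
    return out + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b"")


def read_chunks(png: bytes) -> list:
    """Return (type, data) pairs for every chunk in the file."""
    chunks, pos = [], len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        chunks.append((kind, png[pos + 8:pos + 8 + length]))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return chunks


def read_text(png: bytes) -> dict:
    """Collect the key/value pairs stored in tEXt chunks."""
    result = {}
    for kind, data in read_chunks(png):
        if kind == b"tEXt":
            key, _, value = data.partition(b"\x00")
            result[key.decode("latin-1")] = value.decode("latin-1")
    return result


def read_rows(png: bytes) -> tuple:
    """Decode width, height and scanlines (assumes filter type 0, as build_png writes)."""
    chunks = read_chunks(png)
    width, height = struct.unpack(">II", dict(chunks)[b"IHDR"][:8])
    raw = zlib.decompress(b"".join(d for k, d in chunks if k == b"IDAT"))
    stride = width + 1  # one filter byte plus one byte per pixel
    return width, height, [raw[i * stride + 1:(i + 1) * stride] for i in range(height)]


def simulate_screenshot(png: bytes) -> bytes:
    """Keep only the pixels and write them into a brand-new file."""
    width, height, rows = read_rows(png)
    return build_png(width, height, rows, text={})


rows = [bytes([10, 20, 30, 40]), bytes([50, 60, 70, 80])]
original = build_png(4, 2, rows, {"Source": "hypothetical-ai-generator"})
copy = simulate_screenshot(original)
print(read_text(original))  # the tag is present in the downloaded file
print(read_text(copy))      # the copy has identical pixels but no tag
```

The pixels in the copy are identical to the original, yet nothing in the new file records where they came from.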

Steganography: Hiding Signals in Plain Sight

To solve this, OpenAI has turned to an ancient technique called steganography. The practice dates back to around 440 BC, when the Greek historian Herodotus described how Histiaeus shaved a messenger's head, tattooed a message on his scalp, and waited for the hair to grow back before sending him off. The hidden message was invisible until the head was shaved again.

Modern digital steganography works similarly by embedding information into the pixels of an image in a way that is imperceptible to the human eye. Instead of hiding a message in hair, it alters the color values of individual pixels by tiny amounts. These changes are too small for a person to notice but can be detected by software that knows the pattern.
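
The simplest textbook form of this idea is least-significant-bit (LSB) embedding, sketched below for illustration. It is not how SynthID works; it only shows how a message can ride along in pixel values that change by at most one brightness level. The function names and the toy "image" are made up for the example.

```python
def embed_lsb(pixels: list, message: bytes) -> list:
    """Hide message bits in the lowest bit of successive pixel values."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this image")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the lowest bit, then set it to the message bit
    return stego


def extract_lsb(pixels: list, n_bytes: int) -> bytes:
    """Read n_bytes back out of the lowest bits, most significant bit first."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)


cover = [(i * 37) % 256 for i in range(64)]  # toy 8x8 grayscale image, flattened
stego = embed_lsb(cover, b"AI")
max_change = max(abs(a - b) for a, b in zip(cover, stego))

print(extract_lsb(stego, 2))  # b'AI'
print(max_change)             # 1
```

A scheme this naive is easily destroyed: any recompression or resizing scrambles the lowest bits. Robust watermarks instead spread a weaker signal across many pixels.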

OpenAI's new approach uses this principle through Google DeepMind's SynthID technology. Originally developed by Google for its own Gemini models, SynthID has now been licensed by OpenAI to watermark all images generated by its tools. The watermark is applied at the moment of generation, subtly modifying pixel values across the entire image. Because the signal is spread throughout the picture, it survives resizing, cropping, compression, and even screenshots.

How SynthID Works in Detail

SynthID is a multimodal watermarking system designed for text, images, video, and audio. For images, it works by introducing statistical patterns into the pixel data. These patterns are invisible to the human eye but can be reliably detected by a companion classifier. The watermark is not a simple overlay; it is integrated into the image's content itself. Even if someone takes a screenshot of a watermarked image, the patterns remain because the screenshot captures the altered pixels.
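
To give a feel for what a "statistical pattern" in pixel data means, here is a toy spread-spectrum watermark, a classical technique used here as a stand-in. SynthID's embedder and detector are learned models whose details are not public, so everything below (the key, the strength, the threshold, the crude `degrade` function) is an assumption chosen for illustration.

```python
import random


def pattern(key: int, n: int) -> list:
    """Pseudo-random +1/-1 pattern derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(pixels: list, key: int, strength: int = 3) -> list:
    """Nudge every pixel up or down by `strength`, following the keyed pattern."""
    pat = pattern(key, len(pixels))
    return [min(255, max(0, p + strength * s)) for p, s in zip(pixels, pat)]


def score(pixels: list, key: int) -> float:
    """Correlate mean-centered pixels with the keyed pattern."""
    pat = pattern(key, len(pixels))
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) * s for p, s in zip(pixels, pat)) / len(pixels)


def degrade(pixels: list, seed: int, noise: int = 4, step: int = 8) -> list:
    """Rough stand-in for recompression: add noise, then quantize coarsely."""
    rng = random.Random(seed)
    noisy = [p + rng.randint(-noise, noise) for p in pixels]
    return [min(255, max(0, round(p / step) * step)) for p in noisy]


KEY = 2026          # hypothetical secret shared by embedder and detector
THRESHOLD = 1.5     # halfway between "no mark" (about 0) and "marked" (about 3)

rng = random.Random(0)
cover = [rng.randint(60, 195) for _ in range(128 * 128)]  # toy grayscale image
shared = degrade(embed(cover, KEY), seed=1)               # marked, then mangled

print("marked and degraded:", score(shared, KEY) > THRESHOLD)
print("never marked:       ", score(cover, KEY) > THRESHOLD)
print("wrong key:          ", score(shared, KEY + 1) > THRESHOLD)
```

No single pixel carries the mark. The detector only sees it by summing a faint bias over thousands of pixels, which is why noise and coarse quantization do not erase it.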

This is a major improvement over older techniques like visible watermarks (e.g., a logo in the corner) or fragile metadata. Google's Gemini image model, nicknamed Nano Banana, for instance, places a small diamond logo in the corner of generated images, but that logo can be cropped out. SynthID's pixel-level watermark is designed to be hard to remove without visibly degrading the image.

Another powerful aspect of SynthID is its ability to watermark text. The technology can subtly select which tokens to use in a response so that the generated text contains a statistical signature. This allows detection software to identify AI-generated text without affecting readability. While OpenAI has not yet adopted SynthID text watermarking for ChatGPT, Google uses it in its Gemini models. If OpenAI eventually adds this capability, it could help combat AI-generated misinformation in written content as well.
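
The idea of steering token choice can be illustrated with a simplified "green list" scheme from the research literature. This is not SynthID's actual algorithm, and the random "model" below is a placeholder for a real language model. The vocabulary, key, and function names are all hypothetical.

```python
import hashlib
import math
import random
from typing import Optional

VOCAB = [f"w{i}" for i in range(50)]  # hypothetical toy vocabulary


def is_green(prev: str, token: str, key: str) -> bool:
    """A keyed hash of (previous token, candidate) marks about half of all pairs as 'green'."""
    digest = hashlib.sha256(f"{key}|{prev}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(n: int, key: Optional[str], seed: int) -> list:
    """Stand-in for a language model: pick from 4 random candidates, preferring green ones if keyed."""
    rng = random.Random(seed)
    tokens, prev = [], "<s>"
    for _ in range(n):
        candidates = rng.sample(VOCAB, 4)  # pretend these are the model's top choices
        if key is not None:
            green = [c for c in candidates if is_green(prev, c, key)]
            candidates = green or candidates  # fall back if no candidate is green
        prev = candidates[0]
        tokens.append(prev)
    return tokens


def z_score(tokens: list, key: str) -> float:
    """How far the green-token count sits above the 50% expected by chance."""
    prev, hits = "<s>", 0
    for tok in tokens:
        hits += is_green(prev, tok, key)
        prev = tok
    n = len(tokens)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


KEY = "hypothetical-secret"
marked = generate(200, KEY, seed=1)
plain = generate(200, None, seed=2)

print("watermarked text flagged:", z_score(marked, KEY) > 4)
print("ordinary text flagged:   ", z_score(plain, KEY) > 4)
```

Each individual word choice looks ordinary. Only the detector, which knows the key, can see that far more than half of the tokens landed on the green list.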

C2PA Compliance: Standardizing Provenance

Alongside the SynthID watermark, OpenAI has made its image generation pipeline compliant with the Coalition for Content Provenance and Authenticity (C2PA) standard. C2PA provides a unified specification for embedding metadata that includes information about the creator, the tool used, and any modifications made. By becoming a C2PA Conforming Generator Product, OpenAI ensures that its metadata is secure and interoperable across platforms that support the standard.

The C2PA standard uses cryptographic signatures to prevent tampering. When an image is created, the metadata is signed with a private key. Verification tools can check the signature against a public key to confirm that the metadata has not been altered. This makes it much harder for attackers to modify or forge the provenance information, although a signature cannot stop the metadata from being stripped out altogether.
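
The sign-then-verify flow can be sketched with textbook RSA in a few lines. This is a deliberately insecure toy with no padding and a small, predictable key, and it does not reproduce C2PA's real manifest or signature format. The manifest fields are hypothetical.

```python
import hashlib
import json

# Two well-known Mersenne primes: far too small and too predictable for real security.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest(manifest: dict) -> int:
    """Hash a canonical JSON encoding of the manifest to an integer below N."""
    data = json.dumps(manifest, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(manifest: dict) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(manifest), D, N)


def verify(manifest: dict, signature: int) -> bool:
    """Anyone with the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest(manifest)


manifest = {"generator": "hypothetical-image-model", "created": "2026-05-24"}
signature = sign(manifest)

tampered = dict(manifest, generator="a-real-camera")

print(verify(manifest, signature))  # True
print(verify(tampered, signature))  # False
```

Changing even one field changes the hash, so the old signature no longer matches, and without the private key an attacker cannot produce a new one.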

OpenAI's implementation covers all its image generation services, including ChatGPT, the API, and Codex. This means that every AI-generated image from these sources now carries both C2PA metadata and a SynthID watermark, providing a multi-layered defense against forgery.

The Public Verification Tool

To make these signals useful, OpenAI is launching a public verification tool at https://openai.com/research/verify/. Users can upload an image or provide a URL, and the tool will check for both C2PA metadata and SynthID watermarks. If the image was generated by an OpenAI tool, the tool will report a positive verification. If the image lacks these signals, the tool will indicate that it could not confirm AI generation.
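
The reporting logic described here can be sketched as follows. OpenAI has not published how its tool combines the two checks, so the rule that either signal alone counts as confirmation is an assumption, as are all the names below.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "provenance signal found"
    UNCONFIRMED = "could not confirm AI generation"


@dataclass(frozen=True)
class Signals:
    c2pa_valid: bool          # signed metadata is present and its signature checks out
    watermark_detected: bool  # a detector found the pixel-level watermark


def check(signals: Signals) -> tuple:
    """Combine both layers; either one alone is treated as enough (an assumption)."""
    found = []
    if signals.c2pa_valid:
        found.append("C2PA metadata")
    if signals.watermark_detected:
        found.append("pixel watermark")
    return (Verdict.CONFIRMED if found else Verdict.UNCONFIRMED), found


direct_download = check(Signals(c2pa_valid=True, watermark_detected=True))
screenshot_copy = check(Signals(c2pa_valid=False, watermark_detected=True))
unknown_image = check(Signals(c2pa_valid=False, watermark_detected=False))

for name, (verdict, found) in [("download", direct_download),
                               ("screenshot", screenshot_copy),
                               ("unknown", unknown_image)]:
    print(name, "->", verdict.value, found)
```

Note the asymmetry: a missing signal yields "could not confirm," never "this image is real."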

This tool is a crucial part of the ecosystem. Even with robust watermarking, the ability to easily check an image's provenance is essential for journalists, fact-checkers, and ordinary users. Without a simple verification method, the watermarks would be of little practical use. OpenAI's tool aims to make provenance checking as straightforward as possible.

However, the tool has limitations. For example, if an image is a composite—part AI-generated and part real photograph—the tool may not be able to determine the exact proportion of AI content. OpenAI acknowledges that no single provenance technique is enough on its own. Combining C2PA metadata, SynthID watermarks, and public verification provides a strong foundation, but ongoing improvements will be needed as adversaries develop new ways to circumvent these measures.

Historical Context: The Arms Race of AI Detection

The challenge of identifying AI-generated content is not new. Since the rise of Generative Adversarial Networks (GANs) in the mid-2010s, researchers have been developing methods to detect deepfakes. Early techniques looked for visual artifacts like inconsistent lighting, unnatural eyes, or missing reflections. As generative models improved, these artifacts became harder to spot, prompting the need for more sophisticated detection.

Watermarking is a proactive approach: instead of trying to detect fakes after they are created, it embeds identification signals at the source. This is analogous to the way currency uses watermarks and security threads to prevent counterfeiting. The challenge is that watermarking must be robust enough to survive common image manipulations while remaining invisible to the casual observer.

SynthID represents a major advance in this area because it uses deep learning to optimize the watermark for robustness. The system is trained to embed signals that withstand JPEG compression, color adjustments, and even screenshots. Tests have shown that SynthID watermarks can survive up to 90% cropping and still be detected with high accuracy.

Implications for Misinformation and Trust

The introduction of durable watermarks is a positive development for online trust. As AI-generated images become indistinguishable from real ones, the ability to verify an image's origin becomes crucial for news organizations, social media platforms, and individuals. Without such tools, AI fakes could be used to spread false information, manipulate public opinion, or commit fraud.

For example, a fake image of a politician engaging in illegal activity could go viral if it cannot be identified as AI-generated. With SynthID and C2PA, platforms like Facebook and X, as well as news sites, could automatically check images and display a warning if they are AI-generated. This would give users the context they need to evaluate the image's credibility.

However, these measures are not foolproof. Adversaries may attempt to remove or overwrite watermarks, or they may generate images using tools that do not embed them. OpenAI's approach relies on widespread adoption of standards like C2PA and SynthID. If other AI companies follow suit, the overall trustworthiness of online imagery will increase. But if some continue to produce unwatermarked content, the system will have blind spots.

Another challenge is that watermarking only identifies the origin of an image, not its truthfulness. An AI-generated image of a fictional event is not necessarily harmful—it could be used for art, entertainment, or education. The watermark simply tells viewers that the image was created by an AI, not that it is false. Users still need to apply critical thinking and context to assess the image's meaning.

Technical Details and Future Directions

For those interested in the technical underpinnings, SynthID does not attach a label or a separate record to the file. Google has described its image watermarking as a pair of jointly trained models: one modifies the pixel values so that they carry a statistical pattern, and the other scans an image for that pattern. Because the signal lives in the pixels themselves, a verification tool can check an image directly, without the metadata having to travel with the file.

OpenAI has not released detailed information about the exact implementation of SynthID in its pipeline, but it has confirmed that the watermarking is applied to all images generated through its services, including those created by DALL-E 3, ImageGen, and Sora. The company also plans to extend similar protections to video and audio in the future.

The collaboration between OpenAI and Google on SynthID is notable given their competitive relationship. It suggests that the industry recognizes the need for shared standards when it comes to content provenance. By adopting a technology developed by a rival, OpenAI is prioritizing the effectiveness of the watermark over proprietary concerns.

Users can test the new verification tool once it goes live. Early tests will likely reveal how well the watermarks survive various transformations, including those performed by social media platforms like Instagram, TikTok, and X. The tool's accuracy will be crucial for its adoption by fact-checkers and journalists.

In the long run, the combination of C2PA metadata and SynthID watermarks could become the default for AI-generated content. If other major AI companies—such as Meta, Anthropic, and Midjourney—adopt similar approaches, the ecosystem will have a robust system for identifying synthetic media. This would mark a significant victory in the battle against AI-driven misinformation.

For now, OpenAI's announcement is a promising step. The company is not claiming that its solution is perfect, but it is clearly committed to improving transparency. As the technology evolves, we can expect even more sophisticated watermarking techniques, perhaps incorporating blockchain-based verification or biometric signatures.

The focus on provenance is not just about detecting fakes; it is about building trust in the digital world. When users know that they can verify the origin of an image, they can make informed decisions about what to believe. OpenAI's new watermarks are a tool toward that goal, but the responsibility still rests with users to remain vigilant and critical.


Source: ZDNET News

