SynthID: What it is and how it works

Photo by author

# Introduction

As AI-generated media becomes more capable and more common, it is increasingly difficult to distinguish it from human-created content. In response to threats such as misinformation, deepfakes, and misuse of synthetic media, Google DeepMind has developed SynthID, a suite of tools that embeds imperceptible digital watermarks into AI-generated content and later enables robust identification of that content.

By incorporating watermarking directly into the content creation process, SynthID helps verify authenticity and supports transparency and trust in AI systems. SynthID spans text, images, audio, and video with watermarks suitable for each. In this article, I’ll explain what SynthID is, how it works, and how you can use it to watermark text.

# What is SynthID?

At its core, SynthID is a digital watermarking and detection framework for AI-generated content. It applies imperceptible signals to AI-generated text, images, audio, and video, and these signals are designed to survive compression, resizing, cropping or truncation, and other common transformations. Unlike metadata-based approaches such as the Coalition for Content Provenance and Authenticity (C2PA) standard, SynthID works at the model or pixel level. Instead of attaching metadata after generation, SynthID embeds a hidden signature within the content itself, encoded in a way that is invisible or inaudible to humans but detectable by algorithmic scanners.

SynthID's watermarks are designed to be invisible to users, resilient to distortion, and reliably identifiable by software.

SynthID is integrated into Google's AI models, including Gemini (text), Imagen (images), Lyria (audio), and Veo (video). Google also provides tools such as the SynthID Detector portal for verifying uploaded content.

// Why SynthID is important

Generative AI can create highly realistic text, images, audio, and video that are difficult to distinguish from human-created content. This brings risks such as:

Deepfake videos and manipulated media
Misinformation and misleading content
Unauthorized reuse of AI content in contexts where transparency is required.

SynthID provides a provenance marker that helps platforms, researchers, and users trace where content came from and determine whether it was synthetically generated.

// Technical principles of SynthID watermarking

SynthID’s watermarking approach is rooted in steganography – the art of hiding signals within other data so that the presence of hidden information is imperceptible but can be retrieved with a key or detector.

The key design objectives are:

Watermarks should not reduce the quality of user-facing content.
Watermarks should survive common transformations such as compression, cropping, noise, and filters.
The watermark should reliably indicate that the content was generated by an AI model using SynthID.

The following describes how SynthID implements these goals in different media types.

# Text media

// Probability based watermarking

SynthID embeds signals during text generation by adjusting the probability distributions that large language models (LLMs) use to choose the next token (a word or part of a word).

This method takes advantage of the fact that text generation is inherently probabilistic. Small, controlled adjustments to these probabilities are designed to leave output quality essentially unaffected while producing a statistically detectable signature.
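To make the idea of nudging next-token probabilities concrete, here is a minimal Python sketch. It is a generic keyed "green list" scheme chosen for illustration, not SynthID's actual algorithm. The vocabulary, the key, the `bias` value, and every function name are assumptions, and random logits stand in for a real language model. The bias is deliberately large so the effect is easy to see in a short sample.

```python
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(50)]  # hypothetical toy vocabulary
SECRET_KEY = "demo-key"                 # hypothetical watermarking key


def is_green(prev_token, token, key=SECRET_KEY):
    """Keyed pseudorandom split: roughly half the vocabulary is 'green' per context."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def watermarked_probs(logits, prev_token, bias=2.0, key=SECRET_KEY):
    """Nudge green-token logits upward by `bias`, then apply a softmax."""
    adjusted = [
        logit + (bias if is_green(prev_token, tok, key) else 0.0)
        for tok, logit in zip(VOCAB, logits)
    ]
    peak = max(adjusted)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


def generate(length=200, seed=0, bias=2.0):
    """Sample tokens from random logits that stand in for a real language model."""
    rng = random.Random(seed)
    tokens = [VOCAB[0]]
    for _ in range(length):
        logits = [rng.gauss(0.0, 1.0) for _ in VOCAB]
        probs = watermarked_probs(logits, tokens[-1], bias)
        tokens.append(rng.choices(VOCAB, weights=probs, k=1)[0])
    return tokens


def green_fraction(tokens, key=SECRET_KEY):
    """Share of (previous token, token) pairs that land in the green set."""
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(p, t, key) for p, t in pairs) / len(pairs)
```

Comparing `green_fraction(generate(bias=2.0))` with `green_fraction(generate(bias=0.0))` shows the statistical signature: biased sampling lands in the keyed green set noticeably more often than unbiased sampling, even though each individual token still looks like an ordinary sample.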

# Image and video media

// Pixel level watermarking

For images and video, SynthID embeds a watermark directly into the generated pixels. During generation (for example, by a diffusion model), SynthID subtly alters pixel values at specific locations.

These changes fall below the threshold of human perception but encode machine-readable patterns. In video, the watermark is applied frame by frame, so it can still be detected after changes such as cropping, compression, noise, or filtering.
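The sketch below illustrates the general principle of hiding a keyed pattern in pixel values and recovering it by correlation. It uses a classic additive spread-spectrum scheme as a stand-in, which is much simpler than SynthID's learned watermark. The function names, the key, and the `strength` value are assumptions for illustration only.

```python
import random


def keyed_pattern(height, width, key):
    """Pseudorandom +1/-1 pattern that only the key holder can regenerate."""
    rng = random.Random(key)
    return [[rng.choice((-1, 1)) for _ in range(width)] for _ in range(height)]


def embed(image, key, strength=3):
    """Shift each grayscale pixel (0-255) by +/- strength following the keyed pattern."""
    pattern = keyed_pattern(len(image), len(image[0]), key)
    return [
        [min(255, max(0, px + strength * sign)) for px, sign in zip(row, prow)]
        for row, prow in zip(image, pattern)
    ]


def detect_score(image, key):
    """Correlate mean-centered pixels with the keyed pattern.

    The score is close to `strength` when the watermark is present
    and close to zero otherwise.
    """
    pattern = keyed_pattern(len(image), len(image[0]), key)
    pixels = [px for row in image for px in row]
    signs = [sign for row in pattern for sign in row]
    mean = sum(pixels) / len(pixels)
    return sum((px - mean) * sign for px, sign in zip(pixels, signs)) / len(pixels)
```

Each pixel moves by at most `strength` intensity levels, yet averaging over thousands of pixels makes the pattern stand out clearly to a detector that knows the key. A detector with the wrong key sees only noise.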

# Audio media

// Spectrogram-based encoding

For audio content, the watermarking process works on the spectral representation of the audio in three steps:

Convert the audio waveform to a time-frequency representation (spectrogram).
Encode the watermark pattern within the spectrogram using an encoding technique associated with psychoacoustic (sound perception) features.
Reconstruct the waveform from the modified spectrogram so that the embedded watermark remains imperceptible to the human listener but can be detected by SynthID’s detector.

This approach is designed to keep the watermark detectable after changes such as compression, added noise, or speed changes, although extreme modifications can weaken detection.
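The three steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not SynthID's audio encoder: it uses a naive discrete Fourier transform on non-overlapping frames, scales the magnitudes of a few frequency bins by a keyed pattern, and ignores psychoacoustic modelling entirely. `FRAME`, `BAND`, the `strength` value, and all function names are hypothetical, and the strength is exaggerated so that a short clip is enough for detection.

```python
import cmath
import math
import random

FRAME = 32           # samples per analysis frame (toy value)
BAND = range(4, 12)  # frequency bins that carry the watermark (toy choice)


def dft(frame):
    """Naive discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform; the real part is the reconstructed waveform."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)).real / n
        for t in range(n)
    ]


def frames(samples):
    """Split into full frames; trailing samples that do not fill a frame are dropped."""
    return [samples[i:i + FRAME] for i in range(0, len(samples) - FRAME + 1, FRAME)]


def embed(samples, key, strength=0.3):
    rng = random.Random(key)
    out = []
    for frame in frames(samples):
        spectrum = dft(frame)  # step 1: time-frequency representation
        for k in BAND:
            gain = 1.0 + strength * rng.choice((-1, 1))  # step 2: keyed magnitude change
            spectrum[k] *= gain
            spectrum[FRAME - k] *= gain  # mirror bin keeps the signal real-valued
        out.extend(idft(spectrum))  # step 3: reconstruct the waveform
    return out


def detect_score(samples, key):
    """Correlate mean-centered log-magnitudes with the keyed pattern."""
    rng = random.Random(key)
    signs, log_mags = [], []
    for frame in frames(samples):
        spectrum = dft(frame)
        for k in BAND:
            signs.append(rng.choice((-1, 1)))
            log_mags.append(math.log(abs(spectrum[k]) + 1e-12))
    mean = sum(log_mags) / len(log_mags)
    return sum(s * (m - mean) for s, m in zip(signs, log_mags)) / len(signs)
```

A real system would use overlapping windowed frames and shape the changes with a perceptual model so they stay inaudible. The sketch only shows why a pattern written into the spectrogram can be read back from the reconstructed waveform.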

# Watermark detection and verification

Once a watermark is embedded, the SynthID detection system inspects a piece of content to determine if the hidden signature is present.

Tools like the SynthID Detector portal allow users to upload media and scan it for watermarks. Detection highlights the areas with the strongest watermark signal, allowing for more granular provenance checks.
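For text, detection in keyed statistical schemes usually comes down to a hypothesis test. The Python sketch below shows one common form: count how many token pairs fall into a keyed "green" set and convert the count into a z-score. This is an illustrative stand-in, not the SynthID detector. The vocabulary, keys, `sample_tokens` helper, and the 0.5 baseline rate are assumptions of this toy setup.

```python
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(50)]  # hypothetical toy vocabulary


def is_green(prev_token, token, key):
    """Keyed pseudorandom split: roughly half of all token pairs are 'green'."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def detection_z_score(tokens, key):
    """Standard deviations by which the green count exceeds the 50% expected by chance."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(is_green(prev, tok, key) for prev, tok in pairs)
    n = len(pairs)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


def sample_tokens(n, key, green_preference, seed=0):
    """Toy text source: with probability `green_preference`, pick only green tokens."""
    rng = random.Random(seed)
    tokens = [rng.choice(VOCAB)]
    for _ in range(n):
        prev = tokens[-1]
        pool = VOCAB
        if rng.random() < green_preference:
            greens = [tok for tok in VOCAB if is_green(prev, tok, key)]
            pool = greens or VOCAB  # fall back if no token happens to be green
        tokens.append(rng.choice(pool))
    return tokens
```

Text sampled with a preference for green tokens yields a large positive z-score under the matching key, while unwatermarked text, or watermarked text checked with a different key, stays near zero. Scoring shorter windows of tokens in the same way is one plausible route to the kind of region-level highlighting described above.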

# Strengths and limitations of SynthID

SynthID is designed to withstand common content transformations, such as cropping, resizing, and image/video compression, as well as noise addition and audio format conversion. It also handles minor edits and paraphrasing for text.

However, significant changes such as heavy editing, aggressive paraphrasing, or substantial manual (non-AI) modification can reduce the likelihood that the watermark is detected. Also, SynthID detection works primarily for content generated by models that integrate the watermarking system, such as Google's AI models. It cannot detect AI content from external models that lack SynthID integration.

# Applications and wider impact

The main use cases of SynthID include the following:

Verifying content authenticity by distinguishing AI-generated content from human-created content
Combating misinformation, such as tracing the origin of synthetic media used in fraudulent narratives
Helping media organizations, compliance platforms, and regulators track the origin of content
Supporting research and academic integrity, copyright protection, and responsible AI use

By embedding persistent identifiers into AI outputs, SynthID increases transparency and trust in generative AI ecosystems. As adoption grows, watermarking may become standard practice in AI platforms in industry and research.

# Conclusion

SynthID represents a significant advance in content traceability, embedding robust, imperceptible watermarks directly into generated media. By adjusting token probabilities for text, modifying pixels for images and video, and encoding patterns in spectrograms for audio, SynthID achieves a practical balance of imperceptibility, robustness, and detectability without compromising content quality.

As generative AI continues to evolve, technologies like SynthID will play an increasingly central role in ensuring responsible deployment, countering misuse, and maintaining trust in a world where synthetic content is ubiquitous.

Shittu Olumide is a software engineer and technical writer with a knack for simplifying complex concepts and a keen eye for detail, passionate about leveraging modern technology to craft compelling narratives. You can also find Shittu on Twitter.
