LEARN

How video fingerprinting works

A deeper look at why encrypted video traffic still leaks what you're watching.

The shape of a video

Modern video streaming uses DASH (Dynamic Adaptive Streaming over HTTP). Videos are split into many short segments, each a few seconds long, and your browser fetches them one at a time as you watch.

Two facts about those segments make fingerprinting possible:

  • Every segment in a given representation has the same duration.
  • Their sizes vary and are correlated with content complexity: a fast action scene takes more bytes to encode than a static shot.

At a high level, then, the sequence of segment sizes sent over the network is essentially a time series of the video's visual complexity. Different videos have different complexity profiles, and that's enough to tell them apart.
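
As a rough illustration of why sizes track complexity: with a fixed segment duration, a segment's size is simply its average bitrate multiplied by that duration. The sketch below assumes 4-second segments and made-up per-segment bitrates; neither number comes from a real stream.

```python
SEGMENT_SECONDS = 4  # assumed fixed segment duration, for illustration only


def segment_sizes_kb(avg_bitrates_kbps):
    """Convert per-segment average bitrates (kbit/s) into sizes in KB.

    size in KB = bitrate in kbit/s * duration in s / 8 bits per byte
    """
    return [rate * SEGMENT_SECONDS / 8 for rate in avg_bitrates_kbps]


# Hypothetical bitrates: three segments of a busy scene, then two of a static shot.
busy_then_static = [3200, 3000, 2900, 800, 750]
print(segment_sizes_kb(busy_then_static))
```

The drop from the busy scene to the static shot shows up directly in the size sequence, which is the "shape" an observer gets to see.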

Why encryption doesn't fix this

HTTPS encrypts the bytes inside each segment (the actual video data), but it does not hide how many bytes were sent. An observer can still see "the browser fetched a 412 KB segment, then a 380 KB segment, then a 95 KB segment...". The pattern of sizes is the fingerprint, and the bytes themselves don't need to be readable.
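
One way an observer might recover segment sizes without decrypting anything is to group packets into bursts separated by idle time. The sketch below is a simplified illustration, not a description of any particular tool. It assumes the player fetches one segment at a time with a pause between fetches, and the function name and the 0.5-second gap threshold are hypothetical choices.

```python
def estimate_segment_sizes(packets, idle_gap=0.5):
    """Group (timestamp_seconds, payload_bytes) records into bursts.

    A new burst starts whenever the time since the previous packet exceeds
    idle_gap. Each burst's byte total approximates one segment's size.
    """
    sizes = []
    last_time = None
    for timestamp, nbytes in sorted(packets):
        if last_time is None or timestamp - last_time > idle_gap:
            sizes.append(0)  # start a new burst
        sizes[-1] += nbytes
        last_time = timestamp
    return sizes


# Made-up capture: two bursts of encrypted packets, about four seconds apart.
capture = [(0.00, 1400), (0.01, 1400), (0.02, 600), (4.00, 1400), (4.01, 500)]
print(estimate_segment_sizes(capture))  # [3400, 1900]
```

Real traffic is messier (encryption overhead, retransmissions, interleaved audio), so the estimates are approximate, but the overall pattern tends to survive.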

VPNs and Tor don't change the underlying signal: an encrypted, anonymized tunnel still carries the segment sequence at a granularity that attackers can use.

What it takes to identify a video

An attacker needs three things:

  1. A vantage point on the network between you and the video server, from which they can observe the size and timing of your segment fetches.
  2. A known catalog of fingerprints, which they can build by watching the same content themselves.
  3. A classifier that maps "observed segment size sequence" to "best matching catalog entry".
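
The third ingredient can be sketched as a nearest-neighbour search: slide the observed size sequence along every catalog entry and keep the closest window. This is a deliberately minimal stand-in for a classifier, not the method of any specific paper. The catalog contents, titles, and distance measure are all illustrative assumptions.

```python
def distance(observed, reference):
    """Mean relative difference between two equal-length size sequences."""
    return sum(abs(o - r) / r for o, r in zip(observed, reference)) / len(observed)


def best_match(observed, catalog):
    """Return (title, offset, distance) for the closest catalog window.

    Returns None if no catalog entry is at least as long as the observation.
    """
    best = None
    n = len(observed)
    for title, sizes in catalog.items():
        for offset in range(len(sizes) - n + 1):
            d = distance(observed, sizes[offset:offset + n])
            if best is None or d < best[2]:
                best = (title, offset, d)
    return best


# Hypothetical catalog of per-segment sizes in KB.
catalog = {
    "video_a": [412, 380, 95, 610, 540, 120],
    "video_b": [300, 310, 295, 305, 290, 315],
}
observed = [97, 605, 548]  # noisy sizes seen on the wire, in KB
title, offset, d = best_match(observed, catalog)
print(title, offset)  # video_a 2
```

Even with a few percent of measurement noise, the observed window lines up with one entry far better than the other. The match also reveals the playback position within the video, not just which video it is.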

This kind of attack is well established in the academic literature, including the paper behind Dodge.

What Dodge changes

A Dodge defense rewrites which segments your browser fetches, and at which sizes. The video you see on screen is the same, but the byte sequence on the wire is different. With a well-designed defense, the wire pattern doesn't match any specific catalog entry.