Browser fingerprints in a nutshell

Internet privacy has been a recurring subject over the last few years, as several social media platforms, such as Facebook and Twitter, have found themselves caught up in a long string of controversies and have been in the spotlight ever since.
This shows a trend: the average user is waking up and looking at their internet privacy from a new perspective. Regular users no longer see their privacy as a sacrifice they have to make in order to use their favorite social media, but rather as an aspect of the internet they should be able to control.
One of the least known privacy threats is browser fingerprinting. As most users tend to focus on securing the data they know about, many have no idea about some of the newer methods used to track them.

What are browser fingerprints?

Browser fingerprints are a set of data that your browser readily provides to any website that asks for it in the appropriate way. The user is unaware of the process, as everything runs in the background and involves no prompt.

Browser fingerprinting has been around for quite some time now, and in the majority of cases it is used in a perfectly ethical way: websites ask for this data in order to adapt to your setup and make your visit more comfortable.
However, malicious websites may make a totally different use of it, relying on the data given by your browser to identify you without cookies and to generate targeted ads and website recommendations. According to an answer provided by a DuckDuckGo developer, Google makes use of this data to display the ads you see across different websites.
As a matter of fact, you have probably witnessed the good side of browser fingerprinting, with Netflix or Facebook notifying you each time a new device, different from the ones you usually use to access these platforms, is used to access your account.
As browsers are our gateway to the web, browser fingerprinting turns out to be a real privacy threat that is becoming more and more widespread.

But how are websites able to use these fingerprints to track you?

Well, let’s review this description provided by the “Beauty and the Beast: Diverting modern web browsers to build unique browser fingerprints” research paper:

Browser fingerprinting consists in collecting data regarding the configuration of a user’s browser and system when this user visits a website. This process can reveal a surprising amount of information about a user’s software and hardware environment, and can ultimately be used to construct a unique identifier, called a browser fingerprint.

So, a browser fingerprint is a set of data describing the user’s environment. This may sound harmless, but what data are we talking about here?
A fingerprint may be composed of:

  • HTTP headers: The HTTP headers are the data the browser sends when it requests a page from a website. They carry a whole set of values, including the user-agent, which describes the browser used, its version and the operating system. Some manufacturers add their own custom details to it, which makes this field very powerful for browser fingerprinting. The HTTP headers also include the language of the browser, along with several other details about it.
  • Platform: The platform field is similar to the data contained in the user-agent of the HTTP headers. Its value lies in allowing us to check for inconsistencies that may be introduced by plugins built to fight malicious use of browser fingerprints.
  • The presence of an ad blocker, or of any software that may indicate whether the user “cares” about the privacy issues of internet use.
  • WebGL: A relatively recent addition to browsers, the WebGL API renders 3D shapes using the user’s GPU. In browser fingerprinting, the API is used to get information about the GPU, such as the model, the vendor and the underlying software version.
  • Canvas: Canvas fingerprinting is a novel and powerful technique that draws 2D figures in the browser in order to get specific information about the system.
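
The inconsistency check mentioned for the platform field can be sketched in a few lines of Python. This is only an illustration: the mapping table and the function name are assumptions made for this example, and a real script would need to cover many more platform values.

```python
# Hypothetical mapping from a platform value to the operating-system token
# expected in the user-agent (illustrative and far from exhaustive).
EXPECTED_UA_TOKEN = {
    "Win32": "Windows",
    "MacIntel": "Macintosh",
    "Linux x86_64": "Linux",
}


def is_consistent(platform: str, user_agent: str) -> bool:
    """Return True when the platform value agrees with the user-agent."""
    token = EXPECTED_UA_TOKEN.get(platform)
    if token is None:
        # Unknown platform value: this sketch cannot decide, so it accepts it.
        return True
    return token in user_agent


ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
print(is_consistent("Win32", ua))     # platform and user-agent agree
print(is_consistent("MacIntel", ua))  # a mismatch like this one stands out
```

A browser whose user-agent has been altered by a plugin while its platform value was left untouched would fail such a check, which in itself becomes a distinguishing trait.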

Some of the previous fields have a very high entropy, meaning they contribute a lot to the uniqueness of the fingerprint, while others, with lower entropy, are mostly used to check for inconsistencies. Other attributes exist as well, such as the screen resolution, whether the user allows cookies, whether Flash is enabled, etc.
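
To make the notion of entropy more concrete, here is a minimal Python sketch that computes the Shannon entropy of one attribute over a group of visitors. The observations are made up for the example and are not real measurements.

```python
import math
from collections import Counter


def shannon_entropy(values: list[str]) -> float:
    """Entropy, in bits, of the observed values of one attribute."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up observations for eight visitors (not real measurements).
cookies_enabled = ["yes"] * 7 + ["no"]
canvas_hashes = ["a", "b", "c", "d", "e", "f", "g", "h"]

print(shannon_entropy(cookies_enabled))  # low: almost everyone shares a value
print(shannon_entropy(canvas_hashes))    # high: every visitor differs
```

An attribute where nearly everybody has the same value says little about who you are, while an attribute that differs for every visitor is enough to tell them apart.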

Putting this data together allows us to build a full browser fingerprint that, in most cases, is unique thanks to the combination of multiple specific values. For example, one of the values most likely to differ between two users is the user-agent, as many laptop and phone manufacturers include their own details in it, making it fairly unique.
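
One simple way to combine the collected values into a single identifier is to hash them together. The Python sketch below assumes SHA-256 and a handful of hypothetical attribute names and values; actual fingerprinting scripts may combine their data differently.

```python
import hashlib
import json


def build_fingerprint(attributes: dict[str, str]) -> str:
    """Hash the collected attributes into one hexadecimal identifier."""
    # Sorting the keys keeps the result independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
visitor = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "platform": "Win32",
    "language": "en-US",
    "screen": "1920x1080",
}
print(build_fingerprint(visitor))
```

Changing any single value, such as the screen resolution, produces a completely different identifier, which is why the combination tends to be unique.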

For instance, my user-agent looks like this and has a similarity ratio smaller than 0.1%. You can check yours using the AmIUnique website:

"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36 OPR/54.0.2952.64"

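As a rough illustration, the Python sketch below splits this user-agent into its name/version pairs and converts the 0.1% figure into bits of information. It reads the similarity ratio as the share of visitors with the same value, which is an assumption, and the function names are made up for the example.

```python
import math
import re

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36 OPR/54.0.2952.64"
)


def product_tokens(user_agent: str) -> dict[str, str]:
    """Extract the name/version pairs found in a user-agent string."""
    return dict(re.findall(r"([A-Za-z]+)/([\d.]+)", user_agent))


def bits_of_information(share: float) -> float:
    """Bits revealed by a value shared by this fraction of visitors."""
    return -math.log2(share)


print(product_tokens(USER_AGENT))
print(bits_of_information(0.001))  # a 0.1% share is worth just under 10 bits
```

The precise version numbers are what make the string so distinctive: every browser update changes them, and few visitors run exactly the same build.
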
How does canvas fingerprinting work?

Example of the generated canvas used for canvas fingerprinting

Canvas fingerprinting consists in generating an image much like the one above, allowing a script to exploit the way the system renders it in order to build a set of data about the user’s environment.
Canvas fingerprinting has a fairly high entropy and its use is becoming more and more widespread.

It relies on 2D drawing tests, using characters that are rendered in an operating-system-specific way in order to extract information. For instance, the emoji in the image may look harmless, but it is one of the most important characters of the whole set, if not the most important.
Emojis are implemented differently across operating systems and manufacturers: Apple products don’t display the same emojis as a Samsung or a Sony smartphone.

How emojis are represented on different platforms

Canvas fingerprinting also makes use of font fallback: the script asks the browser to use a font that does not exist, which forces it to use a fallback font. Once again, operating systems tend to use different fallback fonts, which helps identify the type of device the user has.
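
Once the image has been drawn, its pixels still have to be turned into something comparable. A common approach, assumed here, is to hash them. The Python sketch below works on raw pixel bytes; in a browser these would come from the canvas itself (for example through its toDataURL method), and the byte strings used here are made-up stand-ins, not real renderings.

```python
import hashlib


def canvas_fingerprint(pixel_bytes: bytes) -> str:
    """Reduce the rendered pixels to a short, comparable identifier."""
    return hashlib.sha256(pixel_bytes).hexdigest()[:16]


# Made-up stand-ins for two renderings of the same test image
# on systems whose emoji and fallback fonts differ.
rendering_a = bytes([255, 255, 255, 255, 12, 34, 56, 255])
rendering_b = bytes([255, 255, 255, 255, 12, 35, 56, 255])

print(canvas_fingerprint(rendering_a))
print(canvas_fingerprint(rendering_b))  # one changed byte, different identifier
```

Even a tiny difference in how an emoji or a fallback font is drawn changes the pixels, and therefore the resulting identifier.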

Conclusion

Yes, websites are able to track you using your browser fingerprint, by repeatedly checking it against a previously built database containing your usage habits and other information.
Obviously, not every website makes malicious use of these fingerprints, but it is always useful to know about them.
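
The database lookup described above can be pictured with a very small Python sketch. The in-memory dictionary, the function name and the sample fingerprint are all hypothetical; a real tracker would rely on persistent storage and on fuzzier matching.

```python
# Hypothetical in-memory store: fingerprint -> pages seen for that browser.
history: dict[str, list[str]] = {}


def record_visit(fingerprint: str, page: str) -> bool:
    """Store the visit and report whether the fingerprint was already known."""
    known = fingerprint in history
    history.setdefault(fingerprint, []).append(page)
    return known


print(record_visit("3f2a9c", "/home"))     # first visit: not known yet
print(record_visit("3f2a9c", "/pricing"))  # same fingerprint: recognised
print(history["3f2a9c"])                   # browsing habits build up over time
```

No cookie is involved at any point: the browser is recognised purely from the values it sends on each visit.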

Many tools exist to make it harder for websites to extract your information, and some basic steps, such as disabling Flash if your browser still supports it, make your fingerprint more generic and thus harder to track.

Sources