Personal tracking devices

A Journey Into The True Dark Net

Silvia Puglisi - [email protected] / @nopressure

Hello World

My name is Silvia Puglisi. I am a software engineer and Ph.D. candidate at UPC Barcelona Tech.

I do research in privacy and web science.

I am here to talk to you about the real dark web.

What is this all about?

  • Marketing
  • Privacy
  • User tracking
  • Online footprint
  • Identity
  • Control

What does marketing have to do with this?

Think of the so-called new economy without online marketing and advertising...

Ahem... You probably can't.

Most of the successful online companies of the last... make it 20 years... use advertising to sustain their business, at least in part.

The source of advertising

Advertising wants you to buy products.

So advertising companies do all they can to get to know you, so that they can recommend products that you are more likely to buy.

The source of advertising is data.

Data about you.

How does Online Marketing work?

eMarketing follows users in their online and sometimes offline activities.

Information about what users do and are interested in is collected by websites, applications and devices.

This information is crawled, analysed and categorised.

What about online privacy?


If our actions on the web are constantly collected and analysed, do we have online privacy? Have we lost our right to be anonymous?

In an online context, the right to privacy has commonly been interpreted as a right to “information self-determination”.

Acts typically claimed to breach online privacy concern the collection of personal information without consent, the selling of personal information and the further processing of that information.

Is Privacy the right to be forgotten?

In 2011, the amount of digital information created and replicated globally exceeded 1.8 zettabytes (1.8 trillion gigabytes).

75% of this information is created by individuals through new media fora such as blogs and via social networks.

By the end of 2011, Facebook had 845 million monthly active users, sharing over 30 billion pieces of content.

Library Briefing - Library of the European Parliament - 01/03/2012

A few years ago on the internet nobody knew you were a dog.


Now they probably also know the colour of your fur.

This is the Dark Web


Or at least this is how "they" pictured it.

The dark web is the web that cannot be crawled.

In a way, it is the web that companies cannot reach and control.

Let us switch this myth around

What if things were actually a bit different...

The Dark Web of Marketing

The Dark Web of Marketing is the idea that we use software and hardware that we do not control.

  • We do not know how these objects are made.
  • We cannot make modifications.
  • They collect a huge amount of data about ourselves.
  • We are perfectly content with it.

The age of the metadata

Metadata about where, when and by whom a particular piece of online content was created and accessed is collected and stored by public and private organisations.

Websites have embedded structured data for a few years now.


Structured data is used to describe products, services and events, and it makes user information available directly in HTML pages through markup standards such as Microformats, Microdata and RDFa.
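To make this concrete, here is a minimal sketch of how easily a crawler can read Microdata out of a page. The HTML fragment, the class name `ItempropCollector` and the function `extract_itemprops` are illustrative assumptions, not part of any real site or library. Only Python's standard `html.parser` module is used, and nested items are deliberately ignored.

```python
from html.parser import HTMLParser

# Hypothetical product page fragment marked up with Microdata (assumption).
SAMPLE_HTML = """
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Fitness band</span>
  <span itemprop="price">49.00</span>
</div>
"""


class ItempropCollector(HTMLParser):
    """Collects the text of every element that carries an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop name whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "itemprop" in attributes:
            self._current = attributes["itemprop"]

    def handle_data(self, data):
        # Store the first non-blank text that follows an itemprop start tag.
        if self._current and data.strip():
            self.properties[self._current] = data.strip()
            self._current = None


def extract_itemprops(html):
    """Return a dict mapping itemprop names to their text content."""
    collector = ItempropCollector()
    collector.feed(html)
    return collector.properties
```

Calling `extract_itemprops(SAMPLE_HTML)` yields the product name and price as a plain dictionary, which is all a tracker needs to categorise what a page is about.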

How does conversion tracking work?

Let’s say your Google search ads send people to your website where they research and learn more about your business. Website call conversions dynamically inserts a Google forwarding number on your website that measures the calls made by these customers. Whether they click on the number or dial it directly from their phone, you can attribute the call conversion and conversion value back to the keyword and ad that drove the customer.

Mobile app data

Wearables for wellbeing

Wearables for productivity


Wearables for ...



Every HTTP connection in the browser has different selectors associated with it.

Hyperdata && Hypermedia

Hyperdata indicates data objects linked to other data objects in other places, as hypertext indicates text linked to other text in other places.

Hyperdata enables formation of a web of data, evolving from the "data on the Web" that is not inter-related (or at least, not linked).

Hypermedia, an extension of the term hypertext, is a nonlinear medium of information which includes graphics, audio, video, plain text and hyperlinks.

- Source: Wikipedia

RESTful architectures

REST is an architectural style introduced by Roy Thomas Fielding in 2000, which has been at the core of web design and development.

REST represents an abstraction over the actual architecture of the web.

In REST, identification, representation and format are independent concepts.


  • A URI can identify a resource without knowing what formats the resource uses to exchange representations.
  • Likewise the protocols and representations used by the resource to communicate can be modified independently from the URI identifying the resource.

REST interfaces

The uniformity of REST interfaces is built upon four guiding principles:

  • The identification of resources through the URI mechanism.
  • The manipulation of resources through their representations.
  • The use of self-descriptive messages.
  • The use of hypermedia as the engine of application state (HATEOAS).
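The last principle can be sketched in a few lines. This is an illustration under stated assumptions: the `RESOURCES` dictionary stands in for a server, the URIs are made up, and the `_links` key is just one common convention for embedding links in a representation. The point is that the client starts from one URI and discovers everything else by following links.

```python
# Hypothetical in-memory "server": URI -> representation (assumption).
RESOURCES = {
    "/users/42": {
        "name": "alice",
        "_links": {"self": "/users/42", "devices": "/users/42/devices"},
    },
    "/users/42/devices": {
        "items": ["phone", "fitness band"],
        "_links": {"self": "/users/42/devices", "owner": "/users/42"},
    },
}


def get(uri):
    """Stand-in for an HTTP GET: return the representation stored at uri."""
    return RESOURCES[uri]


def follow(uri, *relations):
    """Start at uri and follow the named link relations one after another."""
    representation = get(uri)
    for relation in relations:
        representation = get(representation["_links"][relation])
    return representation
```

For example, `follow("/users/42", "devices")` reaches the device list without the client ever hard-coding its URI. The same mechanism lets anyone who holds one representation walk to the linked ones.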

Why hypermedia matters for privacy protection

  • Information self-determination is not even possible if users have no control over their online footprint.
  • Hypermedia provides context over unstructured footprint information.
  • Users and applications use REST interfaces to interact with one another exchanging resource representations.
  • The web follows REST principles and so do users’ online traces.

Each selector contributes to building a user profile.

Each selector makes your profile more unique, because it adds more information.

The Identity Hypergraph

In mathematics, a hypergraph is a generalization of a graph in which an edge can connect any number of vertices. Formally, a hypergraph H is a pair H = (X, E) where X is a set of elements called nodes or vertices, and E is a set of non-empty subsets of X called hyperedges or edges. Therefore, E is a subset of P(X) \ {∅}, where P(X) is the power set of X.
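A minimal sketch of this definition, assuming selectors are the nodes and each hyperedge is a set of selectors observed together. The selector names and the helper functions `make_hypergraph` and `edges_containing` are hypothetical and chosen only for illustration.

```python
def make_hypergraph(nodes, hyperedges):
    """Return (X, E) after checking every hyperedge is a non-empty subset of X."""
    X = frozenset(nodes)
    E = {frozenset(edge) for edge in hyperedges}
    for edge in E:
        if not edge or not edge <= X:
            raise ValueError(f"invalid hyperedge: {set(edge)}")
    return X, E


def edges_containing(hypergraph, node):
    """Return all hyperedges that include the given node."""
    _, E = hypergraph
    return {edge for edge in E if node in edge}


# Hypothetical selectors seen together in two different contexts (assumption).
H = make_hypergraph(
    nodes={"cookie_id", "device_id", "account_id", "screen_resolution"},
    hyperedges=[
        {"cookie_id", "screen_resolution"},
        {"cookie_id", "device_id", "account_id"},
    ],
)
```

Here `edges_containing(H, "cookie_id")` returns both hyperedges: a single shared selector is enough to tie two otherwise separate sets of observations to the same identity.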


Unique footprints

While you surf the web you carry a unique footprint:

  • HTTP cookies, often set by third-party analytics and advertising domains.
  • Sessions and storage.
  • Browser or device personalised settings: resolution, fonts, plugins.
  • Account IDs.
  • Device IDs.
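As a rough sketch of why these traces matter together, the features above can be combined into a single stable identifier. The function name `footprint_id` and the feature names in the example are assumptions for illustration; real trackers use their own, often far richer, feature sets.

```python
import hashlib
import json


def footprint_id(features):
    """Hash a dict of observed features into a stable hexadecimal identifier."""
    # Sorting keys makes the result independent of the order features were seen in.
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two visits that expose the same resolution, fonts and plugins produce the same identifier, even if no cookie is set. Changing any single feature produces a different one.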


How unique is a footprint?

To answer this question we can do several things:

  • We can profile our activity within a set of categories.
  • We can calculate how many bits of information are introduced every time we add a feature to our profile.
  • We can analyse how many unique features we are sharing across the network.
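The second point can be sketched with the standard notion of surprisal: a feature value observed with probability p reveals -log2(p) bits. The function names are illustrative, and adding the bits of several features is only valid under the assumption that the features are statistically independent, which real browser features usually are not.

```python
import math


def surprisal_bits(probability):
    """Bits of identifying information revealed by a value seen with this probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def total_bits(probabilities):
    """Sum of surprisals; assumes the features are statistically independent."""
    return sum(surprisal_bits(p) for p in probabilities)
```

A value shared by one user in 1024 carries 10 bits. About 33 bits are enough to single out one person among roughly eight billion, since 2^33 is about 8.6 billion.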

Profiling over a set of categories

Calculating how many bits of information are introduced by unique features


Analysing how many unique traces we leave around the web

What is the cloud really?

Who owns the cloud?

What if we go beyond cloud providers?


What about mobile communication providers?

Because the telecommunication industry depends on economies of scale, sharing infrastructure among telecom service providers is becoming a standard business practice: competitors are becoming partners in order to lower their growing investment costs.



I grew up with the understanding that the world I lived in was one where people enjoyed a sort of freedom to communicate with each other in privacy, without it being monitored, without it being measured or analyzed or sort of judged by these shadowy figures or systems, any time they mention anything that travels across public lines.

- Edward Snowden

  • Research in open infrastructure
  • Collaborate with researchers outside of computer/telecom science/eng
  • Be mindful about your online footprint


This presentation is available on: