Domains per web users

This page collects the open datasets used in the papers:

Luca Vassio, Danilo Giordano, Martino Trevisan, Marco Mellia, Ana Paula Couto da Silva, Users’ Fingerprinting Techniques from TCP Traffic, ACM SIGCOMM Workshop on Big Data Analytics and Machine Learning for Data Communication Networks, Los Angeles, USA, August 2017

The anonymized visited domains and the list of core domains used to perform the experiments are reported.

The dataset of visited domains can be downloaded from here and is composed of 4 columns:

  1. The Client IP address anonymized
  2. The timestamp at which the flow was generated, in seconds
  3. The Domain anonymized as a number between 000001 and 500k
  4. A flag stating whether the Domain is a Core Domain (True) or a Support Domain (False)
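
As a rough illustration of how the four columns above could be read, the sketch below parses rows and counts the distinct domains seen per client. The file layout is not specified on this page, so the comma delimiter, the absence of a header row, and the names `parse_visits` and `domains_per_client` are all assumptions made for the example; the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def parse_visits(lines, delimiter=","):
    """Yield (client, timestamp, domain, is_core) tuples from text lines.

    Assumes one flow per line, no header row, and the column order
    listed above. The delimiter is a guess; change it to match the file.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        client, timestamp, domain, flag = (field.strip() for field in row)
        # Keep the domain as a string so leading zeros (e.g. 000001) survive.
        yield client, float(timestamp), domain, flag == "True"


def domains_per_client(visits, core_only=False):
    """Map each anonymized client to the set of domains it visited."""
    result = defaultdict(set)
    for client, _timestamp, domain, is_core in visits:
        if is_core or not core_only:
            result[client].add(domain)
    return dict(result)


# Made-up sample rows, only to show the assumed shape of the input.
sample = io.StringIO(
    "c1,1490000000,000001,True\n"
    "c1,1490000005,000042,False\n"
    "c2,1490000010,000001,True\n"
)
counts = {c: len(d) for c, d in domains_per_client(parse_visits(sample)).items()}
print(counts)
```

Passing `core_only=True` restricts the per-client sets to rows whose flag is True, which may be useful when only Core Domains are of interest.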

The list of 1000 Core Domains can be downloaded from here and is composed of 2 columns:

  1. The domain
  2. Whether the domain is a Core Domain (Core) or a Support Domain (Support)
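
A minimal sketch of reading this two-column list into a set of Core Domains follows. As before, the comma delimiter and the function name `load_core_domains` are assumptions, and the two sample entries are placeholders rather than real rows from the list.

```python
import csv


def load_core_domains(lines, delimiter=","):
    """Return the set of domains labelled "Core" in the two-column list."""
    core = set()
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        domain, label = (field.strip() for field in row)
        if label == "Core":
            core.add(domain)
    return core


# Placeholder entries, not taken from the actual list.
sample = ["example.org,Core", "cdn.example.net,Support"]
core = load_core_domains(sample)
print("example.org" in core, "cdn.example.net" in core)
```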

For more details, please check the paper or contact us.


Luca Vassio, Flavio Figueiredo, Ana Paula Couto da Silva, Marco Mellia, Jussara Almeida, Mining and Modeling Web Trajectories from Passive Traces, IEEE BigData 2017 DS4N, Boston, MA, December 2017

Anonymized trajectories of domains and their TribeFlow models are reported.

The dataset of visited domains can be downloaded from here and is composed of 4 columns:

  1. Timestamp in seconds
  2. The Client IP address anonymized
  3. The original Domain anonymized as an integer number
  4. The landing Domain anonymized as an integer number
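
One way to turn these rows into per-client trajectories is sketched below: rows are grouped by client and ordered by timestamp. It assumes the rows have already been split into the four fields in the order listed above; the function name `build_trajectories` and the sample values are hypothetical.

```python
from collections import defaultdict


def build_trajectories(rows):
    """Group (timestamp, client, origin, landing) rows by client, ordered by time."""
    per_client = defaultdict(list)
    for timestamp, client, origin, landing in rows:
        per_client[client].append((float(timestamp), int(origin), int(landing)))
    # Tuples sort by their first element, so each list ends up time-ordered.
    return {client: sorted(steps) for client, steps in per_client.items()}


# Made-up rows, deliberately out of order to show the sorting step.
rows = [
    ("1490000020", "c1", "7", "9"),
    ("1490000000", "c1", "3", "7"),
    ("1490000010", "c2", "3", "5"),
]
trajectories = build_trajectories(rows)
print(trajectories["c1"])
```

Each client then maps to a time-ordered list of (timestamp, original domain, landing domain) steps, which is a convenient starting point for sequence analysis.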

The TribeFlow models:

  1. Campus model (download here)
