Real-Time Classification of Real-Time Communications

This webpage contains supplementary material for the paper:

Real-Time Classification of Real-Time Communications

(currently under revision)

Year: 2021

Authors: Gianluca Perna, Dena Markudova, Martino Trevisan, Michela Meo, Paolo Garza, Maurizio Munafò, Giovanna Carofiglio

The data

You can find the dataset and the classifiers at this link:

Data and classifiers

The link contains:

  • webex_dataset.csv – a CSV file with all 96 features described in the paper, plus the timestamps, label, and video quality, calculated for all Webex traffic
  • webex_classifier_trained.pkl – a pickle file of the trained classifier for Webex
  • jitsi_dataset.csv – a CSV file with all 96 features described in the paper, plus the timestamps, label, and video quality, calculated for all Jitsi traffic
  • jitsi_classifier_trained.pkl – a pickle file of the trained classifier for Jitsi

Note that each classifier is trained on a subset of its respective dataset (a training set), using only 8 features for Webex and 4 features for Jitsi, as explained in the paper. The time aggregation in the provided CSV files is 1 s.

Features used for Webex:

  • interarrival_len_unique_percent
  • len_udp_p25
  • len_udp_p70
  • len_udp_p75
  • len_udp_len_unique_percent
  • rtp_interarrival_p30
  • rtp_interarrival_len_unique_percent
  • rtp_marker_sum_check

Features used for Jitsi:

  • len_udp_mean
  • len_udp_p25
  • len_udp_len_unique_percent
  • rtp_interarrival_len_unique_percent

The code

The tool used to compute the features from raw pcap files and log files, called Retina, is available on GitHub:

Retina


All the features

For those interested in the full list of features before feature selection, here is a table:

Group Features Support Support Error Description
label [0,1,2,3,4,5,6,7] -1 Class label
Interarrival Difference between the current packet time and the previous one
interarrival_std [0, +inf) -1 Interarrival standard deviation (Jitter)
interarrival_mean [0, +inf) -1 Interarrival mean
interarrival_min [0, +inf) -1 Interarrival min
interarrival_max [0, +inf) -1 Interarrival max
interarrival_count [0, +inf) -1 Number of interarrival values in a second of flow (should match num_packets)
interarrival_kurtosis [0, +inf) -1 Kurtosis
interarrival_skew [0, +inf) -1 Skewness
interarrival_moment3 [0, +inf) -1 Third moment
interarrival_moment4 [0, +inf) -1 Fourth moment
interarrival_max_min_diff [0, +inf) -1 Difference between max and min value in a second of flow
interarrival_max_min_R [0.5, 1] -1 max/(max+min)
interarrival_min_max_R [0, 0.5] -1 min/(min+max)
interarrival_len_unique_percent % -1 % of values that are distinct in a second of flow
interarrival_max_value_count_percent % -1 % of the times that the maximum value appears in a second of flow
Length UDP Statistics about the UDP length of packets
kbps [0, +inf) -1 bitrate
len_udp_std [0, 1500^2] -1 Standard deviation of length of udp packets in a second of flow
len_udp_mean [0, 1500] -1 Mean of length of udp packets in a second of flow
len_udp_min [0, 1500] -1 minimum value of length of udp in a second of flow
len_udp_max [0, 1500] -1 maximum value of length of udp in a second of flow
num_packets [0, +inf) -1 number of packets in a second of flow
len_udp_kurtosis [0, 1500] -1
len_udp_skew [0, 1500] -1
len_udp_moment3 [0, 1500] -1
len_udp_moment4 [0, 1500] -1
len_udp_max_min_diff [0, 1500] -1
len_udp_max_min_R [0.5, 1] -1
len_udp_min_max_R [0, 0.5] -1
len_udp_len_unique_percent % -1
len_udp_max_value_count_percent % -1
Interlength Statistics about difference between length of current packet and the previous one
interlength_udp_std (-inf, +inf)
interlength_udp_mean (-inf, +inf)
interlength_udp_min (-inf, +inf)
interlength_udp_max (-inf, +inf)
interlength_udp_count (-inf, +inf)
interlength_udp_kurtosis (-inf, +inf)
interlength_udp_skew (-inf, +inf)
interlength_udp_moment3 (-inf, +inf)
interlength_udp_moment4 (-inf, +inf)
interlength_udp_max_min_diff (-inf, +inf)
interlength_udp_max_min_R [0.5, 1] -1
interlength_udp_min_max_R [0, 0.5] -1
interlength_udp_len_unique_percent % -1
interlength_udp_max_value_count_percent % -1
RTP inter timestamp Difference between the current packet RTP timestamp and the previous one
rtp_inter_timestamp_num_zeros [0, +inf) -1
rtp_inter_timestamp_std [0, 2^64] -1
rtp_inter_timestamp_mean [-2^32, 2^32] -1
rtp_interarrival_min [-2^32, 2^32] -1
rtp_interarrival_max [-2^32, 2^32] -1
rtp_interarrival_count [-2^32, 2^32] -1
rtp_interarrival_kurtosis (-inf, +inf)
rtp_interarrival_skew (-inf, +inf)
rtp_interarrival_moment3 (-inf, +inf)
rtp_interarrival_moment4 (-inf, +inf)
rtp_interarrival_max_min_diff (-inf, +inf)
rtp_interarrival_max_min_R [0.5, 1] -1
rtp_interarrival_min_max_R [0, 0.5] -1
rtp_interarrival_len_unique_percent % -1
rtp_interarrival_max_value_count_percent % -1
rtp_marker_sum_check [0, +inf) -1
rtp_seq_num_packet_loss % -1
rtp_csrc_csrc_agg -1
Inter time sequence Difference between the sequence number and rtp timestamp of the current packet
inter_time_sequence_std (-inf, +inf)
inter_time_sequence_mean (-inf, +inf)
inter_time_sequence_max (-inf, +inf)
inter_time_sequence_count (-inf, +inf)
inter_time_sequence_kurtosis (-inf, +inf)
inter_time_sequence_skew (-inf, +inf)
inter_time_sequence_moment3 (-inf, +inf)
inter_time_sequence_moment4 (-inf, +inf)
inter_time_sequence_max_min_diff (-inf, +inf)
inter_time_sequence_max_min_R [0.5, 1] -1
inter_time_sequence_min_max_R [0, 0.5] -1
inter_time_sequence_len_unique_percent % -1
inter_time_sequence_max_value_count_percent % -1
inter_time_sequence_min (-inf, +inf)
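
To make the descriptions in the table more concrete, the following sketch computes a few of the per-second statistics for one window of a flow. It is not taken from Retina and only follows the table's wording. The assumptions are: percentages are on a 0 to 100 scale, "len_unique_percent" is the number of distinct values divided by the number of values, the standard deviation is the sample standard deviation, and the -1 "support error" is returned when a statistic cannot be computed. The function names and the STAT_NAMES tuple are hypothetical.

```python
import statistics
from collections import Counter

STAT_NAMES = (
    "mean", "std", "min", "max", "count", "max_min_diff",
    "max_min_R", "min_max_R",
    "len_unique_percent", "max_value_count_percent",
)


def interarrivals(timestamps):
    """Difference between each packet time and the previous one."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def window_stats(values, error_value=-1):
    """Summarise the values observed in one second of a flow.

    Assumption: with fewer than two values the statistics are undefined,
    so every entry is set to the error value (-1 in the table).
    """
    if len(values) < 2:
        return {name: error_value for name in STAT_NAMES}
    lo, hi = min(values), max(values)
    counts = Counter(values)
    total = hi + lo
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample std (assumption)
        "min": lo,
        "max": hi,
        "count": len(values),
        "max_min_diff": hi - lo,
        # max/(max+min) and min/(min+max), as described in the table
        "max_min_R": hi / total if total else error_value,
        "min_max_R": lo / total if total else error_value,
        # share of distinct values, on a 0-100 scale (assumption)
        "len_unique_percent": 100 * len(counts) / len(values),
        # how often the maximum value appears, on a 0-100 scale (assumption)
        "max_value_count_percent": 100 * counts[hi] / len(values),
    }
```

For example, window_stats(interarrivals(packet_times)) applied to the packet times of one second would give values comparable to the interarrival_* columns, and the same function applied to packet lengths would mirror the len_udp_* columns, subject to the assumptions above.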

Hope you enjoyed this post and the paper itself! For more info you can always contact us by email.