Real-Time Classification of Real-Time Communications

This webpage contains additional material for the paper:

“Real-Time Classification of Real-Time Communications”

Published in IEEE Transactions on Network and Service Management, July 2022

Authors: Gianluca Perna, Dena Markudova, Martino Trevisan, Michela Meo, Paolo Garza, Maurizio Munafò, Giovanna Carofiglio

The data

You can find the dataset and the classifiers at this link:

Data and classifiers

The link contains:

webex_dataset.csv – a CSV file with all 96 features described in the paper, plus the timestamps, label and video quality, calculated for all Webex traffic
webex_classifier_trained.pkl – a pickle file of the trained classifier for Webex
jitsi_dataset.csv – a CSV file with all 96 features described in the paper, plus the timestamps, label and video quality, calculated for all Jitsi traffic
jitsi_classifier_trained.pkl – a pickle file of the trained classifier for Jitsi

Note that each classifier is trained on a subset of its respective dataset (a training set) using only 8 features for Webex and 4 features for Jitsi, as explained in the paper. The time aggregation in the provided CSV files is 1 s.

Features used for Webex	Features used for Jitsi
interarrival_len_unique_percent	len_udp_mean
len_udp_p25	len_udp_p25
len_udp_p70	len_udp_len_unique_percent
len_udp_p75	rtp_interarrival_len_unique_percent
len_udp_len_unique_percent	
rtp_interarrival_p30	
rtp_interarrival_len_unique_percent	
rtp_marker_sum_check	
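As a rough sketch of how the files could be used together, the snippet below reads a dataset CSV and keeps only the columns a classifier was trained on. The helper names `read_feature_rows` and `load_classifier` are hypothetical, the sample values are made up, and the column order expected by each pickled classifier is an assumption (this page does not state it).

```python
import csv
import io
import pickle

# Feature subsets as listed in the table above.
WEBEX_FEATURES = [
    "interarrival_len_unique_percent",
    "len_udp_p25",
    "len_udp_p70",
    "len_udp_p75",
    "len_udp_len_unique_percent",
    "rtp_interarrival_p30",
    "rtp_interarrival_len_unique_percent",
    "rtp_marker_sum_check",
]
JITSI_FEATURES = [
    "len_udp_mean",
    "len_udp_p25",
    "len_udp_len_unique_percent",
    "rtp_interarrival_len_unique_percent",
]


def read_feature_rows(csv_file, feature_names):
    """Return one list of floats per CSV row, keeping only feature_names."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in feature_names if name not in header]
    if missing:
        raise KeyError(f"columns not found in CSV: {missing}")
    return [[float(row[name]) for name in feature_names] for row in reader]


def load_classifier(path):
    """Unpickle a trained classifier (only do this for files you trust)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Tiny in-memory stand-in for jitsi_dataset.csv (made-up values).
sample = io.StringIO(
    "len_udp_mean,len_udp_p25,len_udp_len_unique_percent,"
    "rtp_interarrival_len_unique_percent,label\n"
    "950.0,800.0,35.5,12.0,1\n"
    "120.0,110.0,4.0,2.5,0\n"
)
rows = read_feature_rows(sample, JITSI_FEATURES)
print(rows)
```

With the real files, the same function can be called on `open("jitsi_dataset.csv", newline="")`. If the pickled object exposes a scikit-learn-style `predict` method (an assumption), `load_classifier("jitsi_classifier_trained.pkl").predict(rows)` would return one predicted label per row. Unpickling typically requires the library that created the object to be installed.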

The code

The tool used to calculate the features from raw pcap files and log files, called Retina, is available on GitHub:

Retina

All the features

For those interested in the full list of features, before feature selection, here is a table:

Group	Features	Support	Support Error	Description
	label	[0,1,2,3,4,5,6,7]	-1	Class label
Interarrival				Difference between the arrival time of the current packet and that of the previous one
	interarrival_std	[0, +inf)	-1	Interarrival standard deviation (Jitter)
	interarrival_mean	[0, +inf)	-1	Interarrival mean
	interarrival_min	[0, +inf)	-1	Interarrival min
	interarrival_max	[0, +inf)	-1	Interarrival max
	interarrival_count	[0, +inf)	-1	Number of interarrival values in a second (should be the same as num_packets)
	interarrival_kurtosis	[0, +inf)	-1	Kurtosis
	interarrival_skew	[0, +inf)	-1	Skewness
	interarrival_moment3	[0, +inf)	-1	Third moment
	interarrival_moment4	[0, +inf)	-1	Fourth moment
	interarrival_max_min_diff	[0, +inf)	-1	Difference between max and min value in a second of flow
	interarrival_max_min_R	[0.5, 1]	-1	max/(max+min)
	interarrival_min_max_R	[0, 0.5]	-1	min/(min+max)
	interarrival_len_unique_percent	%	-1	Percentage of distinct values in a second of flow
	interarrival_max_value_count_percent	%	-1	Percentage of times the maximum value appears in a second of flow
Length UDP				Statistics about the UDP length of packets
	kbps	[0, +inf)	-1	bitrate
	len_udp_std	[0, 1500^2]	-1	Standard deviation of length of udp packets in a second of flow
	len_udp_mean	[0, 1500]	-1	Mean of length of udp packets in a second of flow
	len_udp_min	[0, 1500]	-1	minimum value of length of udp in a second of flow
	len_udp_max	[0, 1500]	-1	maximum value of length of udp in a second of flow
	num_packets	[0, +inf)	-1	number of packets in a second of flow
	len_udp_kurtosis	[0, 1500]	-1
	len_udp_skew	[0, 1500]	-1
	len_udp_moment3	[0, 1500]	-1
	len_udp_moment4	[0, 1500]	-1
	len_udp_max_min_diff	[0, 1500]	-1
	len_udp_max_min_R	[0.5, 1]	-1
	len_udp_min_max_R	[0, 0.5]	-1
	len_udp_len_unique_percent	%	-1
	len_udp_max_value_count_percent	%	-1
Interlength				Statistics about the difference between the length of the current packet and that of the previous one
	interlength_udp_std	(-inf, +inf)	–
	interlength_udp_mean	(-inf, +inf)	–
	interlength_udp_min	(-inf, +inf)	–
	interlength_udp_max	(-inf, +inf)	–
	interlength_udp_count	(-inf, +inf)	–
	interlength_udp_kurtosis	(-inf, +inf)	–
	interlength_udp_skew	(-inf, +inf)	–
	interlength_udp_moment3	(-inf, +inf)	–
	interlength_udp_moment4	(-inf, +inf)	–
	interlength_udp_max_min_diff	(-inf, +inf)	–
	interlength_udp_max_min_R	[0.5, 1]	-1
	interlength_udp_min_max_R	[0, 0.5]	-1
	interlength_udp_len_unique_percent	%	-1
	interlength_udp_max_value_count_percent	%	-1
RTP inter timestamp				Difference between the RTP timestamp of the current packet and that of the previous one
	rtp_inter_timestamp_num_zeros	[0, +inf)	-1
	rtp_inter_timestamp_std	[0, 2^64]	-1
	rtp_inter_timestamp_mean	[-2^32, 2^32]	-1
	rtp_interarrival_min	[-2^32, 2^32]	-1
	rtp_interarrival_max	[-2^32, 2^32]	-1
	rtp_interarrival_count	[-2^32, 2^32]	-1
	rtp_interarrival_kurtosis	(-inf, +inf)	–
	rtp_interarrival_skew	(-inf, +inf)	–
	rtp_interarrival_moment3	(-inf, +inf)	–
	rtp_interarrival_moment4	(-inf, +inf)	–
	rtp_interarrival_max_min_diff	(-inf, +inf)	–
	rtp_interarrival_max_min_R	[0.5, 1]	-1
	rtp_interarrival_min_max_R	[0, 0.5]	-1
	rtp_interarrival_len_unique_percent	%	-1
	rtp_interarrival_max_value_count_percent	%	-1
	rtp_marker_sum_check	[0, +inf)	-1
	rtp_seq_num_packet_loss	%	-1
	rtp_csrc_csrc_agg	–	-1
Inter time sequence				Difference between the sequence number and RTP timestamp of the current packet
	inter_time_sequence_std	(-inf, +inf)	–
	inter_time_sequence_mean	(-inf, +inf)	–
	inter_time_sequence_max	(-inf, +inf)	–
	inter_time_sequence_count	(-inf, +inf)	–
	inter_time_sequence_kurtosis	(-inf, +inf)	–
	inter_time_sequence_skew	(-inf, +inf)	–
	inter_time_sequence_moment3	(-inf, +inf)	–
	inter_time_sequence_moment4	(-inf, +inf)	–
	inter_time_sequence_max_min_diff	(-inf, +inf)	–
	inter_time_sequence_max_min_R	[0.5, 1]	-1
	inter_time_sequence_min_max_R	[0, 0.5]	-1
	inter_time_sequence_len_unique_percent	%	-1
	inter_time_sequence_max_value_count_percent	%	-1
	inter_time_sequence_min	(-inf, +inf)	–
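To make the descriptions in the table more concrete, here is a minimal sketch of how a few of the per-second statistics could be computed from packet arrival times. It is not the Retina implementation: the function names are hypothetical, and several details are assumptions, namely that an interarrival value is assigned to the 1 s bin of the later packet, that the standard deviation is the population one, and that percentages are expressed on a 0 to 100 scale.

```python
import statistics
from collections import Counter, defaultdict


def interarrivals_per_second(timestamps):
    """Group interarrival times (seconds) by the 1 s bin of the later packet."""
    bins = defaultdict(list)
    ordered = sorted(timestamps)
    for previous, current in zip(ordered, ordered[1:]):
        bins[int(current)].append(current - previous)
    return dict(bins)


def window_stats(values):
    """Compute a few of the table's statistics for one non-empty window."""
    low, high = min(values), max(values)
    total = low + high
    counts = Counter(values)
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # population std (assumption)
        "min": low,
        "max": high,
        "count": len(values),
        "max_min_diff": high - low,
        # The ratios are undefined when max + min == 0; None is this sketch's choice.
        "max_min_R": high / total if total else None,
        "min_max_R": low / total if total else None,
        # Share of distinct values, and share of samples equal to the maximum.
        "len_unique_percent": 100 * len(counts) / len(values),
        "max_value_count_percent": 100 * counts[high] / len(values),
    }


# Made-up arrival times (seconds) covering two one-second bins.
arrivals = [0.0, 0.25, 0.5, 1.5, 1.75]
for second, values in interarrivals_per_second(arrivals).items():
    print(second, window_stats(values))
```

For non-negative inputs, `max_min_R` falls in [0.5, 1] and `min_max_R` in [0, 0.5], matching the supports listed in the table. The same `window_stats` function can be applied to UDP lengths or to differences between consecutive lengths to mimic the other feature groups.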

Hope you enjoyed this post and the paper itself! For more info you can always contact us by email.