5th SmartData@PoliTO Workshop (Internal Only) – Checking where we are

The Fifth Workshop for the Interdepartmental Center SmartData@PoliTO will be held on September 26th and 27th, 2019 in Langhe (Piedmont).

Date: Thursday September 26th and Friday September 27th
Location: Hotel Barolo, Via Lomondo, 2, 12060 Barolo (CN)

Chairs: Daniele Apiletti & Martino Trevisan

Thursday, September 26th

Coffee break: 10:00 – 10:30

Session 1: 10:30 – 12:30

Marco Mellia: Introduction
PhD presentations:
  • Thomas Favale: Privacy compliant network monitoring probe
  • Flavio Giobergia: Mining Sensor Data for Predictive Maintenance in the Automotive Industry
  • Vittorio Mazzia: Improvement in land cover and crop classification based on CNN in combination with RNN
  • Marco Guerra: Principled Homological Scaffold for the Brain Functional Connectome via Minimal Bases
  • Antonio Mastropietro and Alessandro De Gregorio: Interpretability of a Neural Network classifier with SHAP and Topological Data Analysis for psilocybin effect on human brain
  • Francesco Della Santa: Discontinuous Neural Networks
  • Ulderico Fugacci: Topology-Based Tools for Data Classification

Lunch: 13:00 – 14:00

Session 2: 14:00 – 15:30

Pietro Michiardi: Approximate Bayesian Inference for Deep Learning

Drawing meaningful conclusions on complex real-life phenomena and being able to predict the behaviour of systems of interest requires developing accurate and highly interpretable mathematical models whose parameters need to be estimated from observations. Deep learning techniques have become extremely popular to tackle such challenges in an effective way, but they do not offer satisfactory performance in applications where quantification of uncertainty is of primary interest. Bayesian Deep Learning techniques have been proposed to combine the representational power of deep learning with the ability to accurately quantify uncertainty thanks to their probabilistic treatment. While attractive from a theoretical standpoint, the application of Bayesian Deep Learning techniques poses huge computational challenges that arguably hinder their wide adoption.
In this talk, I will cover recent trends in practical and scalable Bayesian inference techniques, and discuss some of their benefits and challenges. We will focus on stochastic variational inference techniques, and conclude by illustrating their connection to stochastic optimisation. In particular, we will see that a simple, constant-rate stochastic gradient descent algorithm can be used as an approximate Bayesian posterior inference algorithm.

PhD presentations:
  • Francesco Ventura: Explaining Black-Box models for unstructured data
  • Mirko Pieropan: Expectation Propagation for the diluted Bayesian perceptron classifier
Michela Meo: Greener network operation through machine learning

The talk focuses on the support that machine learning can provide to green network operation. Green networks combine resource and energy management decisions, where resource management includes the use of sleep modes and resource-on-demand decisions, and energy management involves the choice of which energy source to use in a power supply system that integrates renewable energy generation into a traditional supply. The case of a radio access network is made, and the impacts of different traffic prediction models on the consumed energy mix and on QoS are evaluated. The results show that a widespread implementation of energy saving strategies without the support of ML would require a careful tuning that cannot be performed autonomously and that needs continuous updates to follow traffic pattern variations. On the contrary, ML approaches provide a versatile framework for the implementation of the desired trade-off that naturally adapts the network operation to the traffic characteristics typical of each area and to its evolution.

Tasting: 16:00 – 17:00

Session 3: 17:30 – 19:00

SmartData collaborations with enterprises:
  • Dena Markudova: Activities with Tierra S.p.A.
  • Tania Cerquitelli: Visualising high-resolution energy maps through the exploratory analysis of energy performance certificates
  • Luca Vassio: How to design an electric free floating car sharing service?
  • Daniela Renga: Transparently mining data from a medium-voltage distribution network: a prognostic-diagnostic analysis
  • Fabrizio Lamberti: Digital transformation in the insurance sector: case studies from Reale Group
  • Danilo Giordano: Anatomy of a Predictive Maintenance Pipeline: A Powertrain use case
PhD school experiences

Dinner: 20:00 – 21:30

Friday, September 27th

Breakfast: 8:00 – 9:00

Session 4: 9:00 – 10:30

Mauro Gasparini and Lidia Sacchetto: Model-based binary classification based on binary data

I will review model-based classification, both with hard and soft assignment of the units, in the case of binary features (predictors). I will explore the properties of the ROC curve in this special case and provide some examples. This is joint work with my PhD student Lidia Sacchetto.
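As a rough illustration of why the ROC curve is special with binary features, the sketch below is not taken from the talk. It assumes independent Bernoulli features with known, hypothetical per-class parameters (`p_pos`, `p_neg`, all strictly between 0 and 1). Since p binary features can only produce 2^p distinct patterns, the ROC curve of a likelihood-ratio classifier is piecewise linear with at most 2^p segments.

```python
from itertools import product


def roc_points(p_pos, p_neg):
    """ROC vertices of a likelihood-ratio classifier on independent binary features.

    p_pos[i] and p_neg[i] are the (assumed known) probabilities that feature i
    equals 1 in the positive and negative class; both must lie in (0, 1).
    """
    cells = []
    for pattern in product([0, 1], repeat=len(p_pos)):
        prob_pos = 1.0
        prob_neg = 1.0
        for bit, a, b in zip(pattern, p_pos, p_neg):
            prob_pos *= a if bit else 1 - a
            prob_neg *= b if bit else 1 - b
        cells.append((prob_pos, prob_neg))

    # Patterns with the largest likelihood ratio are labelled positive first.
    cells.sort(key=lambda cell: cell[0] / cell[1], reverse=True)

    points = [(0.0, 0.0)]
    fpr = tpr = 0.0
    for prob_pos, prob_neg in cells:
        tpr += prob_pos  # mass of the positive class captured so far
        fpr += prob_neg  # mass of the negative class captured so far
        points.append((fpr, tpr))
    return points


# Hypothetical parameters for three binary features.
points = roc_points(p_pos=[0.8, 0.6, 0.7], p_neg=[0.3, 0.5, 0.2])
```

Each threshold on the likelihood ratio corresponds to one vertex in `points` (a hard assignment); points on the segments between vertices can only be reached by randomising between two neighbouring thresholds.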

PhD presentations:
  • Andrea Pasini: Geological Pore Clustering: a Semi-supervised Hierarchical Approach
  • Eliana Pastor: A Density-based Preprocessing Technique to Scale Out Clustering
  • Marilisa Montemurro: Silhouette score to assess biological sample dissimilarity
  • Moreno La Quatra: Using Regression Models to Pinpoint Relevant Content in Research Papers
  • Elena Daraio: Characterizing air-quality data through unsupervised analytics methods

Coffee break: 10:30 – 11:00

Session 5: 11:00 – 12:45

Sandra Di Rocco: Algebraic modelling and sampling

Many problems in science can be described by polynomial equations. The solution set of the corresponding polynomial system is referred to as an algebraic geometrical model for the problem. When the solution set consists of isolated points, the model is easy to describe and even to visualise. For higher dimensional solution spaces, deeper and more sophisticated geometrical and numerical techniques are required. Some ideas for algebraic sampling and for estimating its density will be presented. The key challenge is to estimate the right density to recover the topological signature of the model.

PhD presentations:
  • Andrea Morichetta: Unsupervised learning for network traffic analysis
  • Michele Cocca: Carsharing usage correlation with weather and socio-demographic data
Alberto Pisoni: The paradigm shift of automotive in the IoT world. A practical example
Gianluca Setti: Improving Compressive Sensing at the Edge via Adaptation and Machine Learning

Compressive Sensing (CS) is an acquisition technique which relies on the sparsity of the underlying signals to enable sampling below the classical Nyquist rate, with the promise of saving significant energy with respect to classical A/D conversion. Despite this promise, CS techniques are still only seldom considered in practical implementations of smart sensing nodes operating at the edge of the cloud, which need to be capable of acquiring, encoding, and performing early processing of the information on a very tight energy budget. One of the motivations for this is the gap between the general theoretical performance results of CS and the optimized embodiments that can be achieved in practice.
The aim of this talk is to show that several options are viable and effective in optimizing CS-based acquisition. The first relies on adaptive CS, where the statistical properties of the acquisition sequences are designed to maximize, on average, the energy acquired by every sample when a specific family of signals is acquired. This effectively reduces the minimum number of samples, and thus the energy necessary for acquisition, by about 50% with respect to classical CS. The second shows how a deep neural network based oracle can be trained and used to successfully estimate the signal support during the reconstruction phase, reducing it de facto to a simple pseudo-inversion operation. The use of the oracle allows the definition of an encoding-decoding scheme which also works in the presence of acquisition windows shorter than the minimum allowed by classic CS theory, thus further reducing the energy during sampling. As an additional feature, oracle-based recovery is able to self-assess, by indicating with remarkable accuracy chunks of signals that may have been reconstructed with a non-satisfactory quality, which may prove very useful when reconstruction is performed in a gateway situated at the edge of the cloud.
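The claim that a known support reduces reconstruction to a pseudo-inversion can be sketched as follows. This is a minimal illustration and not the speaker's method: the neural-network oracle is replaced by the true support, the sensing matrix is a generic Gaussian matrix, the measurements are noise-free, and the sizes `n`, `m`, `k` and the helper `recover_with_known_support` are hypothetical.

```python
import numpy as np


def recover_with_known_support(A, y, support):
    """Least-squares recovery restricted to the columns of A listed in `support`."""
    x_hat = np.zeros(A.shape[1])
    # With the support known, only a small m-by-k system has to be pseudo-inverted.
    x_hat[support] = np.linalg.pinv(A[:, support]) @ y
    return x_hat


rng = np.random.default_rng(0)
n, m, k = 64, 12, 4  # signal length, number of measurements, sparsity (assumed)

A = rng.standard_normal((m, n)) / np.sqrt(m)   # generic Gaussian sensing matrix
support = np.sort(rng.choice(n, size=k, replace=False))
x = np.zeros(n)
x[support] = rng.standard_normal(k)            # k-sparse test signal
y = A @ x                                      # noise-free compressed measurements

# Stand-in for the oracle: here the true support is simply handed over.
x_hat = recover_with_known_support(A, y, support)
max_error = float(np.max(np.abs(x_hat - x)))
```

In this noise-free setting, recovery only needs the selected columns of `A` to be linearly independent, which is why far fewer measurements than a support-agnostic solver would require can suffice. A real oracle can mis-estimate the support, a case this sketch does not cover.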

Lunch: 13:00 – 14:00

Session 6: 14:00 – 16:00

Andrea Calimera: ConvNet on Tiny Cores

The promise of the IoT is to improve the quality of services using the information inferred from data collected across distributed platforms. The next generation of smart objects will be able to distill such information at the edge, where data are generated, bypassing (or at least limiting) the access to the cloud. The success of this strategy involves the deployment of complex data-analytics algorithms, like deep neural models, on low-power devices. The objective of this talk is to introduce practical optimization techniques aimed at improving the energy efficiency of deep neural networks made to run on tiny embedded computers.

PhD presentations:
  • Andrea Bordone Molini: Deep neural network for Super-resolution of Unregistered Multitemporal images
  • Giuseppe Attanasio: Quantitative cryptocurrency trading: exploring the use of machine learning techniques
  • Nicola Prette: Video Compression using Neural Networks
  • Sina Famouri: Towards robust training of Faster RCNN for breast mass detection and classification
Marco Mellia: Final Remarks

Coffee break: 16:00 – 16:30