6th SmartData@PoliTO Workshop

The Sixth Workshop for the Interdepartmental Center SmartData@PoliTO will be held on January 30th and 31st, 2020 in Torino at Castello del Valentino.

Date: Thursday, January 30th and Friday, January 31st.

Location: Castello del Valentino, Viale Mattioli, 39, 10125 Torino (TO).


Thursday, January 30th

Morning Session: 9:00 – 12:40

Physics-Informed AI

Chair: Andrea Pagnani

9:00 – 9:50 Luca Biferale (Dipartimento di Fisica Università Roma Tor Vergata)
Assimilation & classification of turbulence data: Physics-informed and Machine Learning tools

Abstract: We provide a side-by-side comparison of modern techniques to
assimilate, classify and control turbulent data by using equation-based
tools, such as Nudging, or big-data Machine Learning approaches. We discuss
applications to rotating and fully developed isotropic turbulence. Both
Eulerian and Lagrangian cases will be discussed, including optimal
navigation problems in complex and chaotic flows.

Contributed talks
  • Guido Uguzzoni (Politecnico di Torino) – 9:50 – 10:10
    Prediction of protein fitness landscape from screening experiments
    Abstract: The recent technological advances underlying the screening of large combinatorial libraries of protein variants deepen our understanding of adaptive protein evolution and boost its applications in protein design. Nevertheless, the large number of possible genotypes imposes a critical limit for the fully experimental approach and requires suitable computational methods for data analysis, the prediction of mutational effects and the generation of optimized sequences. I describe a computational method that, trained on sequencing samples of a screening experiment, provides an accurate model of the genotype-fitness relationship.

  • Indaco Biazzo (Politecnico di Torino) – 10:10 – 10:30
    Learning probability distributions of models with autoregressive neural networks
    Abstract: Autoregressive neural networks (a class of generative autoencoders) have shown, in recent years, very good performance in learning from data. In 2019, new results showed the ability of these neural networks to learn probability distributions describing theoretical models. In this seminar, we present theoretical results and a practical application of these approaches to inference problems in the spreading of epidemics in contact networks.

Coffee Break 10:40 – 11:10

Higher Methods in Data Science and ML

Chair: Francesco Vaccarino

11:10 – 12:00 Théo Lacombe – INRIA Saclay
PersLay: a Neural Network for persistence diagrams and related topics

Abstract: Persistence diagrams, the most common descriptors of Topological Data Analysis, encode topological properties of data and have found applications in different areas of data science. However, since the space of persistence diagrams does not have a linear structure, they end up being difficult inputs for most machine learning techniques. To address this concern, several vectorization methods have been put forward that embed persistence diagrams into vector spaces. In this talk, I will present different approaches to using diagrams in learning pipelines, notably PersLay, a neural network architecture we designed to encompass most of the vectorization techniques used in the literature.

Contributed talks
  • Iacopo Iacopini – Queen Mary University of London – 12:00 – 12:20
    Social contagion models beyond pairwise interactions
    Abstract: Complex networks have been successfully used to describe the spread of diseases in populations of interacting individuals. However, pairwise interactions are often not enough to characterize social contagion processes such as opinion formation or the adoption of novelties, where complex mechanisms of influence and reinforcement are at work. In this talk, I will introduce a higher-order model of social contagion in which a social system is represented by a simplicial complex and contagion can occur through interactions in groups of different sizes. Numerical simulations of the model on both empirical and synthetic simplicial complexes highlight the emergence of novel phenomena such as a discontinuous transition induced by higher-order interactions. I will show analytically that the transition is discontinuous and that a bistable region appears where healthy and endemic states co-exist. These results help explain why critical masses are required to initiate social changes and contribute to the understanding of higher-order interactions in complex systems.

  • Ulderico Fugacci – SmartData@PoliTO – 12:20 – 12:40
    On the encoding of large, high-dimensional and unorganized datasets
    Abstract: Topological Data Analysis (TDA) aims at providing mathematical tools for extracting the core information from large, high-dimensional and unorganized datasets. A fundamental but often neglected aspect of the TDA pipeline is the design and implementation of compact and efficient structures for encoding such datasets, which are typically represented as simplicial complexes. In this talk, we address this topic, focusing on the advantages offered by top-based representations: a specific class of data structures for arbitrary simplicial complexes that explicitly store just a fraction of the entities the complex consists of.

Lunch: 13:00 – 14:30

Afternoon Session

Big Data and Online Privacy

Chair: Marco Mellia

14:30 – 15:20 Nikolaos Laoutaris – IMDEA Networks – Madrid
Measuring Online Behavioural Advertising and Other Adventures in Data Protection & Data Economics

Abstract: In this talk I will present our results on detecting behavioural targeting in online advertising and will cover the two families of methods that we have developed for this problem, based on 1) content-based analysis for identifying correlations between visited web-pages and obtained ads, and 2) frequency-based analysis using only impression counts to detect ads that follow a user across domains. I’ll also cover my initial work on establishing transparency around online price discrimination, as well as how this work led to the creation of the Data Transparency Lab while at Telefonica. I will complete my talk by introducing my new line of work around Human-Centric Data Economies and the related research group I am building at IMDEA Networks.

Contributed talks
  • Martino Trevisan – SmartData@PoliTO – 15:20 – 15:40
    Web Privacy in the Age of Big Data
    Abstract: Today we live in an always-connected world. To protect our privacy, end-to-end encryption is advertised as the means to prevent intruders from eavesdropping on and modifying the content we access online. In this talk, I will present a study on the visibility that even encrypted communications allow. I will show how machine learning-based attacks can retrieve sensitive information about people's web usage, making it possible, for instance, to re-identify them and still understand their interests.

  • Daniele Canavese – Politecnico di Torino – 15:40 – 16:00
    Encryption-agnostic traffic classification via machine-learning techniques
    Abstract: DPI (Deep Packet Inspection) is a family of techniques used to detect a wide array of malicious traffic by inspecting the packet payloads. However, the increasing usage of encrypted channels and the adoption of privacy regulations (e.g. GDPR) are hindering the applicability of such techniques. In this context, the TorSec group is actively performing research on identifying malicious traffic connections by solely analyzing the packet headers, without looking at the payloads. We make use of state-of-the-art machine-learning and deep learning techniques to recognize anomalous patterns in the traffic (e.g. crypto-jacking channels and botnets).

Coffee Break: 16:00 – 16:30

  • Carlos Henrique Gomes Ferreira – Universidade Federal de Minas Gerais – 16:30 – 16:50
    How following election debate in Instagram
    Abstract: Online social networks are today the means for online debate, often very crude, with fake news and hate speech invading them. In this talk, I will present our study on Instagram. After collecting all posts, comments and reactions to Instagram top influencers in Italy and Brazil, I define a methodology to identify unexpected coordinated efforts of commenters that animate the debate after an influencer’s post. By studying the periods before, during and after political elections, my methodology makes it possible to highlight interesting phenomena when applied to popular profiles of politicians on Instagram.

Social and Economics of Digital Transformation

17:00 – 17:30 Danilo Pesce
The digital transformation of search and recombination mechanisms

Abstract: In this study, we develop a systematic integrative framework that predicts the likely scope of search and recombination mechanisms vis-à-vis digitalization. Our framework predicts that, depending on the relative balance of the changes enacted by digitalization, these might lead firms to more incremental innovation in core or peripheral components, or to digital business innovation, or to no innovation at all.

Social Dinner – 20:00

The social dinner will be held at Porto di Savona

Friday, January 31st

Torino Wireless: 9:00 – 13:00

[Event in Italian]

Event “HPC4AI – Artificial Intelligence and Big Data: opportunities and services at the Politecnico”

Presentation of the Competence Center for High Performance Computing and Artificial Intelligence HPC4AI@PoliTo, part of the joint Politecnico di Torino-Università di Torino infrastructure HPC4AI, co-funded by Regione Piemonte with European ERDF (FESR) funds.

The event can also be followed via streaming in room 8V.

A day to discover the expertise and infrastructure that the university makes available to companies in the fields of Artificial Intelligence, Big Data Analytics, Machine Learning and High Performance Computing (HPC). It is also a chance to learn about funding opportunities through the VIR call for small and medium-sized enterprises in Piedmont.

Details on the event page

Lunch: 12:20 – 14:00

Afternoon Session: 14:00 – 16:10

Industrial data analytics

Chair: Tania Cerquitelli

Contributed talk
  • Davide Tricarico (Technology System Engineer at GM Global Propulsion Systems S.r.l) – 14:00 – 14:30
    Automotive innovation: Data analytics and its challenges
    Bio: Davide Tricarico graduated in Computer Engineering from the Polytechnic of Turin in 2010. Since then he has held various positions in software development at General Motors in the Engine Controls department; he is now involved in Machine Learning application projects in the innovation area.

14:30 – 15:10 Massimo Ippolito (Head of Digital Innovation & Infrastructures at COMAU S.p.A)
Predictive analytics in Manufacturing

Bio: Massimo Ippolito currently holds the position of Head of Digital Innovation & Infrastructures at Comau, where he has been working since 2012. He has also been a member of the Board of Directors of the European Factories of the Future Research Association (EFFRA) since 2015. Ippolito holds a Ph.D. in “Industrial production engineering” from Parma University and an M.Sc. in “Computer science” from the University of Milan. He has extensive experience in methodologies and tools for product and production system design. Since 2000, he has been involved in various international research projects in the product design and manufacturing area, ranging from methodologies for product design for manufacturing to process design for energy efficiency. From 2007 to 2012, within Centro Ricerche Fiat, he was responsible for an innovation research program related to manufacturing topics.

Contributed talks
  • Andrea Virgilio (Head of Coffee Machine Prod. & Tech. Serv. at Lavazza S.p.A.) – 15:10 – 15:40
    Digitalizing the pleasure of a good espresso Lavazza coffee.

    Bio: Andrea Virgilio is currently Coffee Machine Prod. & Tech. Service Director at Lavazza. He graduated in Electronic Engineering at Università di Palermo, Italy, in 2006. Prior to joining Lavazza, he worked as Quality Manager at FCA, as Plant Manager at Guala Closures, and as Plant Director at Luxottica, also spending one year in China.

  • Elena Liore (Data science analyst, Accenture S.p.A.) – 15:40 – 16:10
    Data-driven estimation of heavy trucks’ value at buy-back
    Bio: Elena Liore has been working at Accenture S.p.A. since March 2019: from March to October as an intern and from November as a data science analyst. Her main activities concern business intelligence for Big Data, data analytics and data visualization in the field of Mobility X.0. She obtained her degree in Physical Engineering in 2017 and her master's degree in Mathematical Engineering in 2019; she took part in the Erasmus program at TU/e, the Netherlands, in the Data Science course.