Widening our horizons!
Big Data technologies & Data Science applications
It is an open workshop where people from companies and research centers present their activities related to Big Data, Machine Learning and Data Science, to stimulate discussion on these topics.
At the end, the SmartData@PoliTO Center offers a light banquet to all participants.
Date: Thursday, February 28th – 14.00 / 19.00
Location: Aula Magna, Corso Duca degli Abruzzi 24, Politecnico di Torino
REGISTRATION is free but STRICTLY REQUIRED to access the event, as seats are limited: please use the following link.
Program in brief
|Guido Saracco (Rector)||Opening|
|Marco Mellia (SmartData Coordinator)||Introduction|
|Enrico Busto (Addfor)||Why join the navy if you can be a Pirate?|
|Marco Fiore (CNR)||Urban Vibes and Rural Charms: Analysis of Geographic Diversity in Mobile Service Usage at National Scale|
|Tania Cerquitelli (POLITO)||Visualizing high-resolution exploratory energy maps by analyzing energy-performance certificates|
|Marco Aldinucci (UNITO)||The evolution of high-performance systems: from HPC to Big Data to Deep Learning|
|Antonio Vetrò (NEXA & FULL)||Fairness in automated decisions: motivations and preliminary researches|
|Carlo Baldassi (Bocconi)||Julia: a programming language for data science|
|Riccardo Loti (Tierra Telematics)||Operating and evolving a distributed IoT platform|
First Session: 14:00 – 16:00
Opening by prof. GUIDO SARACCO, Rector of Politecnico di Torino
Introduction (slides) by MARCO MELLIA, coordinator of the SmartData@PoliTO center
One year after SmartData@PoliTO started, we are now more than 50 people, working on both fundamentals and applications of Big Data, Machine Learning and Data Science. We have started several collaborations with companies, designed a new data center, organised a physical space, and participated in projects. What’s next?
Marco Mellia graduated from the Politecnico di Torino with a Ph.D. in Electronic and Telecommunication Engineering in 2001. He has co-authored over 200 papers published in international journals and presented at leading international conferences, and holds 9 patents. He was awarded the IRTF Applied Networking Research Prize in 2013, and the best paper award at ACM CoNEXT 2013, IEEE TRAC 2015, IEEE ICDCS 2015 and ITC 2015. He has participated in the program committees of several conferences including ACM SIGCOMM, ACM CoNEXT, ACM IMC, IEEE Infocom, IEEE Globecom and IEEE ICC. He is Area Editor of ACM CCR, and part of the Editorial Board of IEEE/ACM Transactions on Networking and of IEEE Transactions on Network and Service Management. He is part of the steering committee of ACM CoNEXT, and of IEEE/IFIP TMA. He is the co-founder of Ermes Cyber Security, a startup providing B2C solutions to protect companies from web-tracking. He now coordinates the SmartData@PoliTO interdepartmental center at the Politecnico, where more than 30 students, professors, and researchers are working on big data and data science applications.
Enrico Busto (Addfor) – Why join the navy if you can be a Pirate? (slides)
How did we create Addfor, the first Italian company that develops Artificial Intelligence applications?
How did the company develop? What were the applications, from the first systems based on Recurrent Neural Networks to the most recent activities on Reinforcement Learning?
Enrico Busto graduated in Aeronautical Engineering from the Politecnico di Torino and Imperial College London. He has been working on Cognitive Computing since 1995, when he developed an optimization algorithm for STOL/VTOL vehicles. After eleven years at MathWorks as a Senior Application Engineer, he co-founded Addfor S.p.A. Today he leads all the activities related to Cognitive Computing, Numerical Simulation and 3D Rendering.
Marco Fiore (CNR) – Urban Vibes and Rural Charms: Analysis of Geographic Diversity in Mobile Service Usage at National Scale
We investigate spatial patterns in mobile service consumption that emerge at national scale. Our investigation focuses on a representative case study, i.e., France, where we find that: (i) the demand for popular mobile services is fairly uniform across the whole country, and only a reduced set of peculiar services (mainly operating system updates and long-lived video streaming) yields geographic diversity; (ii) even for such distinguishing services, the spatial heterogeneity of demands is limited, and a small set of consumption behaviors is sufficient to characterize most of the mobile service usage across the country; (iii) the spatial distribution of these behaviors correlates well with the urbanization level, ultimately suggesting that the adoption of geographically-diverse mobile applications is linked to a dichotomy of cities and rural areas. We derive our results through the analysis of substantial measurement data collected by a major mobile network operator, leveraging an approach rooted in information theory that can be readily applied to other scenarios.
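As a rough illustration of what an information-theoretic comparison of service usage could look like, the sketch below computes the Jensen-Shannon divergence between the service-demand mix of two areas. This is only an assumption for illustration: the abstract does not say which measure the authors use, and the area names, service names, and traffic counts are invented.

```python
import math

def normalize(counts):
    """Turn raw per-service traffic counts into a probability distribution."""
    total = sum(counts.values())
    return {service: value / total for service, value in counts.items()}

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p[k] == 0 contribute nothing; q[k] must be > 0 wherever p[k] > 0.
    """
    return sum(pk * math.log2(pk / q[k]) for k, pk in p.items() if pk > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical mixes, 1 for disjoint ones."""
    keys = set(p) | set(q)
    p = {k: p.get(k, 0.0) for k in keys}
    q = {k: q.get(k, 0.0) for k in keys}
    mixture = {k: 0.5 * (p[k] + q[k]) for k in keys}
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)

# Hypothetical traffic counts per service in two areas (invented numbers).
urban = normalize({"video": 50, "social": 30, "os_update": 20})
rural = normalize({"video": 30, "social": 30, "os_update": 40})

print(f"JS divergence urban vs rural: {js_divergence(urban, rural):.4f} bits")
```

A value close to 0 would indicate that the two areas consume services in nearly the same proportions; larger values indicate more geographic diversity.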
Marco Fiore is a permanent researcher at CNR-IEIIT, Italy, EU Marie Curie fellow, and Royal Society visiting research fellow. Marco received MSc degrees from the University of Illinois at Chicago, IL, USA, and Politecnico di Torino, Italy, a PhD degree from Politecnico di Torino, Italy, and a Habilitation à Diriger des Recherches (HDR) from Université de Lyon, France. He held previous positions as Maître de Conférences (Associate Professor) at Institut National des Sciences Appliquées (INSA) de Lyon, France, Associate Researcher at Inria, France, visiting research fellow at Rice University, TX, USA, Universitat Politècnica de Catalunya (UPC), Spain, and University College London (UCL), UK. Marco is a recipient of the French national Scientific Excellence Award (PES), EU Marie Curie Career Reintegration Grant, Royal Society International Exchange Fellowship, Data Transparency Lab grant, and a Finalist at the Telecom Italia Big Data Challenge. He is a senior member of IEEE, and a member of ACM.
Tania Cerquitelli (POLITO) – Visualizing high-resolution exploratory energy maps by analyzing energy-performance certificates (slides)
In the talk I will present the current research project with EDISON SpA on the topic “City building mapping: using public data to find energy efficiency of buildings”, which focuses on harvesting open data to characterize the energy efficiency and consumption of residential buildings. We mainly focus on extracting actionable and human-readable knowledge items from a real and open collection of Energy Performance Certificates (EPCs), which provide interesting information on the energy performance and the thermo-physical and geometrical properties of buildings in the city of Turin. A two-level data analytics methodology based on exploratory algorithms and knowledge visualization has been designed. Specifically, an unsupervised algorithm divides EPCs into homogeneous groups of buildings with similar thermo-physical characteristics. Each group is locally characterized through interesting patterns. Then, the extracted knowledge items are visualized and can be navigated on high-resolution geo-located maps that summarize the main relationships among the variables affecting the energy efficiency of buildings at different spatial granularity levels. These high-resolution energy maps allow different stakeholders to quickly identify potential sites and start assessing their energy efficiency. Unlike state-of-the-art algorithms, the proposed methodology visualizes high-resolution energy maps empowered by the clustering step, which (a) jointly considers (in each cluster) the effect of multiple variables affecting the energy efficiency of buildings, (b) graphically presents the information by means of powerful abstractions, and (c) provides a zooming capability to explore data at both coarse and fine levels of detail.
Tania Cerquitelli has been an Associate Professor at the Department of Control and Computer Engineering (DAUIN) of the Politecnico di Torino, Italy, since March 2018. Her research interests include self-learning methodologies, transparent data analytics algorithms providing human-readable data models, the design of innovative algorithms to perform large-scale data mining, novel and efficient data mining techniques for sensor readings, and algorithms to extract high-level abstractions of the mined knowledge (e.g., generalized patterns). Tania has been involved in many European and Italian research projects addressing different topics (e.g., energy efficiency, network traffic analysis, Internet platform, Industry 4.0) in the data mining and machine learning research area. She has taught the Introduction to Databases and Business Intelligence for Big Data courses at the Politecnico di Torino since 2011. She is the research supervisor of many graduate and PhD students. She received a master's degree in Computer Engineering (in 2003) and a PhD degree (in 2007) from the Politecnico di Torino, Italy, and a master's degree in Computer Science (in 2003) from the Universidad De Las Américas Puebla.
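The unsupervised grouping step described in the abstract can be pictured with a minimal sketch like the one below. The abstract does not name the clustering algorithm, so plain k-means is used here purely as a stand-in; the two features (a wall U-value and a normalized energy demand) and all the numbers are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic start (the first k points as centroids).

    Returns (assignment, centroids), where assignment[i] is the cluster index
    of points[i].
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical EPC records: (wall U-value, normalized energy demand).
epcs = [
    (0.30, 0.20), (1.40, 0.90), (0.35, 0.25),
    (1.50, 1.00), (0.32, 0.22), (1.45, 0.95),
]

assignment, centroids = kmeans(epcs, k=2)
for record, cluster in zip(epcs, assignment):
    print(record, "-> group", cluster)
```

On this toy data the well-insulated and poorly insulated buildings end up in separate groups; in a real analysis the features would first need to be scaled to comparable ranges, and each group would then be characterized and placed on a map.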
Second Session: 16:20 – 18:00
Marco Aldinucci (UNITO) – The evolution of high-performance systems: from HPC to Big Data to Deep Learning
Computer science evolves through successive abstractions. Today, after 30 years of lethargy, high-performance computing (HPC) is extending beyond its traditional fields of application. For years HPC systems have been fed with differential equations; the ability to calculate many mathematical operations per second (FLOPS) was the key to solving ever larger problems and finding ever more precise solutions. The explosion of data resulting from digital transformation has shifted the demand for high performance from traditional applications (equations, simulations, etc.) to methods for the analysis of large amounts of data (Big Data, Deep Learning, etc.). Under this impulse, the programming and usage models of HPC systems are evolving towards much more abstract models, able to satisfy different application needs and to simplify the development of new applications. The challenges for designers are renewed: from FLOPS to the efficient management of data in memory; from mathematics in double precision to that in small but efficient precision for deep neural networks. A breath of fresh air for high-performance systems researchers: experimenting with new workloads, platforms, programming models, provisioning models. To meet these challenges, the University of Turin and Polytechnic University of Turin have joined forces to create a federated competence centre on High-Performance Computing (HPC), Artificial Intelligence (AI) and Big Data Analytics (BDA). A centre capable of collaborating with entrepreneurs to boost their ability to innovate on data-driven technologies and applications.
Marco Aldinucci has been an associate professor at the Computer Science Department of the University of Torino (UNITO) since 2014. Previously, he was a postdoc at the University of Pisa, and a researcher at the Italian National Research Agency (ISTI-CNR) and the University of Torino. He is the author of over a hundred papers in international journals and conference proceedings. He has participated in over 20 national and international research projects concerning parallel and autonomic computing. He is the recipient of the HPC Advisory Council University Award 2011, the NVidia Research award 2013, and the IBM Faculty Award 2015. He is the P.I. of the parallel computing group alpha@UNITO, the director of the “data-centric computing” laboratory at ICxT@UNITO innovation centre, and vice-president of the C3S@UNITO competency centre. Since Nov. 2018, he has been a member of the Governing Board of the EuroHPC Joint Undertaking. He co-designed, together with Massimo Torquati, the FastFlow programming framework and several other programming frameworks and libraries for parallel computing. His research is focused on parallel and distributed computing.
Antonio Vetrò (NEXA & FULL) – Fairness in automated decisions: motivations and preliminary researches (slides)
Automated decision systems and recommenders (ranging from automated resume screening to credit score systems to criminal justice support systems) are profoundly changing our lives, substituting experts in an increasing number of decisions and fields. They influence our vision of the world and mediate our interactions with it. In such a context, an increasing number of scientific studies and journalistic investigations have shown that such automated decision systems may behave in discriminatory ways and amplify inequalities in society: being “evidence-based” or “data-driven” by no means ensures that a system will lead to accurate, reliable, or fair decisions. Scientific communities are therefore devoting considerable effort to understanding whether it is possible to achieve fairness in these systems, and how to measure it. In this talk we give an overview of the problem and present preliminary research approaches and results.
Antonio Vetrò is a post-doctoral research fellow at the Nexa Center for Internet & Society and at the Future Urban Legacy Lab at Politecnico di Torino. He is also a member of the UNI Technical Commission on Software Engineering. Currently, Antonio is conducting research on how to detect and mitigate potential discrimination deriving from biases in the data and in the algorithms of decision systems. Antonio specializes in empirical methodologies and statistical analyses, applying such an empirical epistemological approach to study the impact of technology on society.
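To make the question of how to measure fairness concrete, here is a minimal sketch of one commonly used metric, the statistical parity difference (the gap in positive-decision rates between two groups). It is offered only as an example of such a measure, not as the approach presented in the talk; the decisions and group labels are invented.

```python
def selection_rate(decisions):
    """Fraction of positive (1) decisions in a list of 0/1 outcomes."""
    return sum(decisions) / len(decisions)

def statistical_parity_difference(decisions, groups, protected, reference):
    """Selection rate of the protected group minus that of the reference group.

    0 means both groups receive positive decisions at the same rate; a negative
    value means the protected group is selected less often.
    """
    protected_decisions = [d for d, g in zip(decisions, groups) if g == protected]
    reference_decisions = [d for d, g in zip(decisions, groups) if g == reference]
    return selection_rate(protected_decisions) - selection_rate(reference_decisions)

# Hypothetical outcomes of an automated screening system (1 = accepted).
decisions = [1, 0, 0, 1, 1, 1, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

spd = statistical_parity_difference(decisions, groups, protected="A", reference="B")
print(f"Statistical parity difference (A vs B): {spd:+.2f}")
```

In this toy data group A is accepted half of the time and group B three quarters of the time, so the metric is negative. Statistical parity is only one of several fairness definitions, and different definitions can conflict with each other.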
Carlo Baldassi (Bocconi) – Julia: a programming language for data science (slides)
Julia is a young programming language particularly well-suited to technical computing. Its main strength lies in being a very high-level language, comparable to Python or Matlab for example, with performance on par with low-level languages like C or Fortran. It is therefore well suited to all sorts of “number-crunching” applications. Moreover, the general philosophy of the language (and in particular its sophisticated type system) is geared towards obtaining excellent performance from very generic and reusable code, which greatly helps the interoperability of disparate software packages and user code. I will discuss these basic aspects of the language and in particular how it proves effective in data science applications, with examples.
Carlo Baldassi is an Assistant Professor at the Department of Decision Sciences at Bocconi University, Milan, where he teaches Computer Science courses. His background is in Theoretical Physics, computational biology and computational neuroscience, and his current research activities focus on the study of non-convex optimization and inference problems and the properties of state-of-the-art machine learning schemes. He is an early adopter of Julia and has been a very active early contributor in the initial phases of the development of the language, and has written and maintains several Julia packages.
Riccardo Loti (Tierra Telematics) – Operating and evolving a distributed IoT platform (slides)
Tierra Telematics provides telematics services to Original Equipment Manufacturer (OEM) and After Market (AM) customers.
We describe how a telematics solution is designed, built, and deployed, starting from the ground up (hardware design, cost analysis, firmware development), to the software layers (database, cloud architecture, application layers) to the data analytics (business intelligence, data exploration). We also describe the growing need for intelligent data analytics, and the challenges involved.
Riccardo Loti is the head of Applied Research at Tierra S.p.A. Coming from an academic background, he received a PhD in Computer Science in 2014 from the Università degli Studi di Torino and the Université de Nice-Sophia Antipolis. He has a background in Networking and Artificial Intelligence, experience working abroad and in different fields, and a curiosity for everything interesting.
18:00 – 19:00 Light banquet