Modelling cancer evolution through the development of artificial intelligence-based techniques on temporal and spatial molecular data
Cancer is an evolving entity, and the evolutionary properties of each tumor are likely to play a critical role in shaping its natural behavior and its response to therapy. However, effective tools and metrics to measure and categorize tumors based on their evolutionary characteristics have yet to be identified. We plan to combine mathematical modelling and AI-based approaches to develop a new generation of cancer classifiers based on tumor evolutionary properties and proxy data. The project will be developed in collaboration with the Department of Oncology at the University of Torino. The proposed research activity fits in the SmartData@PoliTo interdepartmental centre, which brings together competences from different fields, ranging from modelling to computer programming, from communications to statistics. The candidate will join this interdisciplinary team of experts and collaborate with them.
Machine Learning algorithms and their embedded implementation for service robotics applications in precision agriculture
Several studies have demonstrated the need to significantly increase the world's food production by 2050. Although technology could help farmers, its adoption is limited because farms usually do not have power or Internet connectivity, and farmers are typically not technology savvy. We are working towards an end-to-end approach, from sensors to the cloud, to solve the problem. Our goal is to enable data-driven precision farming. We believe that data, coupled with the farmer's knowledge and intuition about his or her farm, can help increase farm productivity and also help reduce costs. However, getting data from the farm is extremely difficult, since there is often no power in the field or Internet on the farms. As part of the PIC4SeR project, we are developing several unique solutions to these problems using low-cost sensors, drones, rovers, vision analysis and machine learning algorithms. The research activity fits in the SmartData@PoliTo interdepartmental centre, which brings together competences from different fields, ranging from modelling to computer programming, from communications to statistics. The candidate will join this interdisciplinary team of experts and collaborate with them.
Cybersecurity is one of the biggest problems in the information society and impacts all modern communication networks. Increasingly sophisticated threats are found on a daily basis, making it more and more difficult to identify them and design countermeasures. Machine learning and Big Data offer scalable solutions to learn from labelled datasets and build models that can be used to detect attacks. Unfortunately, in the cybersecurity context we lack the ability to obtain large datasets of labelled attacks, since threats continue to evolve over time. This calls for novel solutions to face the problem. Recently, generative adversarial networks (GANs) have been proposed as a means to generalize a small labelled dataset and create artificially richer datasets. They involve two neural networks contesting with each other in a zero-sum game framework: one generates candidates, while the second learns to discriminate real instances from synthesized ones. The generative network's training objective is to increase the error rate of the discriminative network (i.e., to "fool" the discriminator by producing novel synthesized instances that appear to have come from the true data distribution). The research activity fits in the SmartData@PoliTo interdepartmental centre, which brings together competences from different fields, ranging from modelling to computer programming, from communications to statistics. The candidate will join this interdisciplinary team of experts and collaborate with them.
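The adversarial setup described above can be sketched in a few lines. The following is a minimal illustrative 1-D GAN, not an implementation from the project: the generator is affine, g(z) = a*z + b, the discriminator is a logistic regressor d(x) = sigmoid(w*x + c), and the target distribution N(3, 1), parameter names and learning rate are all assumptions chosen for this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator and discriminator parameters (toy choices).
a, b = 1.0, 0.0          # generator: g(z) = a*z + b
w, c = 0.1, 0.0          # discriminator: d(x) = sigmoid(w*x + c)
lr = 0.05                # learning rate for both players

for step in range(2000):
    # Real samples from the "true" distribution, fake ones from the generator.
    x_real = rng.normal(3.0, 1.0, size=32)
    z = rng.normal(size=32)
    x_fake = a * z + b

    # Discriminator gradient ascent on log d(real) + log(1 - d(fake)).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator gradient ascent on log d(fake) (non-saturating loss):
    # it tries to raise the discriminator's error on synthesized samples.
    d_fake = sigmoid(w * x_fake + c)
    grad_x = (1 - d_fake) * w          # d log d(x) / dx
    a += lr * np.mean(grad_x * z)
    b += lr * np.mean(grad_x)

samples = a * rng.normal(size=1000) + b
print(round(float(samples.mean()), 2))  # mean should drift toward the true mean of 3
```

Real uses in intrusion detection would replace the affine generator and logistic discriminator with deep networks over traffic features, but the two opposing updates are the same.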
Big data techniques for assessing the impact of web distribution strategies on performance in the hospitality industry
The objective of the research is twofold. First, the research will investigate the usage of big data techniques to gather data from the Internet about hotels' visibility, reputation, pricing and distribution strategies on the main online channels. Second, the research will complement big data algorithms for analysing the gathered data with econometric analyses, with the aim of investigating how Internet distribution strategies impact operational and economic performance, on both a daily and a yearly basis. In this regard, a collaboration with companies operating in channel management can be envisaged to access their proprietary data on hotels' pricing and distribution strategies in the online world.
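The econometric step could take a shape like the following toy sketch, which regresses a performance measure on online distribution variables. The variables (a visibility score, an average online price, daily revenue) and the coefficients of the data-generating process are invented for illustration; real observations would come from the channels and proprietary data mentioned above.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days = 365

# Hypothetical daily observations for one hotel.
visibility = rng.uniform(0, 1, n_days)      # e.g., share of channels listing the hotel
price = rng.uniform(80, 200, n_days)        # average online price (EUR)
noise = rng.normal(0, 5, n_days)

# Assumed data-generating process: revenue rises with visibility,
# falls mildly with price.
revenue = 50 + 120 * visibility - 0.2 * price + noise

# Ordinary least squares fit of revenue on an intercept, visibility, price.
X = np.column_stack([np.ones(n_days), visibility, price])
beta, *_ = np.linalg.lstsq(X, revenue, rcond=None)
print(np.round(beta, 2))  # estimates close to [50, 120, -0.2], up to noise
```

In practice one would add controls (seasonality, events, hotel fixed effects in a panel), but the estimation logic is the same.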
This Ph.D. position will be devoted to theoretical aspects of reconstruction problems, with special emphasis on the adaptive TAP method for the analysis of Bayesian problems with non-linear prior information coming from real datasets. This will involve successfully modelling distributions of real data in some subdomain (e.g., natural and tomographic images), and developing methods to solve the resulting Bayesian problem approximately.
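To make the flavor of such reconstruction problems concrete: the adaptive TAP method itself is beyond a few lines, so the sketch below uses a much simpler stand-in, the MAP estimate of a linear Gaussian model (equivalently ridge regression), which coincides with the exact posterior mean in the fully Gaussian case. The problem sizes, noise level and unit-variance prior are assumptions for this toy example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 200            # signal dimension, number of measurements

# Signal drawn from the N(0, I) prior; noisy linear measurements y = A x + noise.
x_true = rng.normal(0, 1, n)
A = rng.normal(0, 1 / np.sqrt(n), (m, n))
sigma2 = 0.01                          # measurement noise variance
y = A @ x_true + rng.normal(0, np.sqrt(sigma2), m)

# Posterior mean / MAP under the Gaussian prior:
# x_hat = argmin ||y - A x||^2 / sigma2 + ||x||^2
x_hat = np.linalg.solve(A.T @ A / sigma2 + np.eye(n), A.T @ y / sigma2)

mse = float(np.mean((x_hat - x_true) ** 2))
print(round(mse, 3))  # small reconstruction error
```

Non-linear priors (e.g., learned models of natural images) make this posterior intractable, which is exactly where approximate schemes such as adaptive TAP come in.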
The objective of the research activity is the definition of big data analytics approaches capable of extracting and managing knowledge of heterogeneous types (e.g., structured data, textual information, images).
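As a toy illustration of handling heterogeneous types, the sketch below fuses structured numeric fields and free text from the same records into one feature matrix, encoding the text with a minimal bag-of-words. The field names and records are invented for illustration (images would need a separate encoder, omitted here).

```python
import numpy as np

# Hypothetical heterogeneous records: numeric fields plus a free-text note.
records = [
    {"age": 34, "income": 40_000, "note": "late payment detected"},
    {"age": 51, "income": 72_000, "note": "regular payment"},
    {"age": 29, "income": 35_000, "note": "late payment"},
]

# Build a vocabulary over all textual notes.
vocab = sorted({w for r in records for w in r["note"].split()})

def encode(record):
    # Numeric fields pass through; text becomes word counts over the vocabulary.
    numeric = [record["age"], record["income"]]
    counts = [record["note"].split().count(w) for w in vocab]
    return numeric + counts

X = np.array([encode(r) for r in records], dtype=float)
print(X.shape)  # (3, 2 numeric columns + 4 vocabulary columns) = (3, 6)
```

Once everything lives in one matrix, standard analytics and mining algorithms apply uniformly across the heterogeneous sources.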
The novelty of TDA (Topological Data Analysis) is that it studies the shape of topological spaces at the mesoscopic scale, going beyond the standard measures defined on pairs of data points. This is done by moving from networks to simplicial complexes. The latter are obtained from elementary objects, called simplices, built from such simple polyhedra as points, line segments, triangles, tetrahedra and their higher-dimensional analogues, glued together along their faces.
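The passage from points to simplices can be sketched with a (Vietoris-Rips) construction: a simplex is added whenever all of its vertices lie pairwise within a distance threshold eps. The point cloud and eps below are arbitrary illustrative choices; real TDA pipelines vary eps to track which features persist.

```python
import numpy as np
from itertools import combinations

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.8], [4.0, 0.0]])
eps = 1.2

def rips_complex(points, eps, max_dim=2):
    n = len(points)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    simplices = [(i,) for i in range(n)]           # 0-simplices: the points
    for k in range(2, max_dim + 2):                # edges, then triangles, ...
        for combo in combinations(range(n), k):
            # Keep the simplex iff every pair of its vertices is close enough.
            if all(dist[i, j] <= eps for i, j in combinations(combo, 2)):
                simplices.append(combo)
    return simplices

cplx = rips_complex(points, eps)
print(cplx)  # the three nearby points yield edges and a filled triangle;
             # the far point stays an isolated vertex
```

The filled triangle (0, 1, 2) is exactly the kind of mesoscopic structure a pairwise network misses: a graph records the three edges, while the complex also records that they bound a 2-simplex.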
The candidate will focus their activity on studying the interplay between machine learning (but not only), computer simulations and statistical models: by analyzing with ML techniques the configuration space produced by a known mathematical/statistical model, we will try to identify the relevant parameters and to refine or simplify the model. The candidate will then use the knowledge acquired to infer, from big data, models whose configuration space approximates the given data, and to use these simplified models to turn correlation into causation, making our information finally "actionable". We will also use the developed framework to augment high-quality, low-quantity datasets and to work on model reduction, assessment and validation in the area of FEM.
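The first step, identifying the relevant parameters of a known model from its outputs, can be sketched as follows: sample parameter configurations, run the model, and rank parameters by the magnitude of standardized linear-surrogate coefficients. The model f and the choice of which parameters matter are invented for this toy example.

```python
import numpy as np

rng = np.random.default_rng(7)
n_runs, n_params = 500, 5

# Sample configurations of a hypothetical 5-parameter model;
# only parameters 0 and 2 actually influence the output here.
theta = rng.uniform(-1, 1, (n_runs, n_params))
output = 3.0 * theta[:, 0] - 2.0 * theta[:, 2] + 0.1 * rng.normal(size=n_runs)

# Standardize the inputs and fit a linear surrogate;
# coefficient magnitude then ranks parameter relevance.
Z = (theta - theta.mean(0)) / theta.std(0)
coef, *_ = np.linalg.lstsq(Z, output - output.mean(), rcond=None)
ranking = np.argsort(-np.abs(coef))
print(ranking[:2])  # the two relevant parameters: indices 0 and 2
```

For non-linear configuration spaces one would swap the linear surrogate for a more expressive learner, but the refine/simplify loop (sample, fit, rank, prune) stays the same.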
The objective of the research activity is the definition of big data analytics approaches to analyze IoT streams for a variety of applications (e.g., sensor data streams from instrumented cars).
The following steps (and milestones) are envisioned.
Data collection and exploration. The design of a framework to store relevant information in a data lake. Heterogeneous data streams, encompassing custom proprietary data and publicly available data, will be collected in a common data repository. Tools for exploratory analysis will be exploited to characterize the data and drive the subsequent analysis tasks.
Big data algorithm design and development. State-of-the-art tools will be applied, and novel algorithms tailored to the specific data analysis problem will be designed (e.g., to predict component failures).
Knowledge/model interpretation. Understanding a discovered behavior requires interaction with domain experts, who will enable the operational validation of the proposed approaches.
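A minimal sketch of the failure-prediction step above: a sliding-window z-score detector over an IoT sensor stream flags readings that deviate from recent behavior. The synthetic temperature-like stream, the injected fault at t = 150 and the thresholds are assumptions for illustration; real streams would come from the data lake.

```python
import numpy as np

rng = np.random.default_rng(3)
stream = rng.normal(70.0, 1.0, 200)   # nominal sensor readings
stream[150:] += 8.0                   # abrupt shift, e.g. a component overheating

window, threshold = 30, 4.0
alerts = []
for t in range(window, len(stream)):
    # Compare each reading against the mean/std of the last `window` samples.
    recent = stream[t - window:t]
    z = (stream[t] - recent.mean()) / (recent.std() + 1e-9)
    if z > threshold:
        alerts.append(t)

print(alerts[0] if alerts else None)  # first alert should coincide with the fault onset
```

This is the kind of baseline a domain expert can immediately validate (does t = 150 match a known failure?), before moving to richer learned models over the full stream repository.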
The goal of the research activities is to: i) design smart data-collection crawlers that can automatically collect data from recommendation systems, i.e., smart policies to sample the enormous amount of data these systems potentially expose; ii) model the data using graph-based modeling, and design and test algorithms to automatically identify anomalies that may reflect pollution of the recommendation system; iii) correlate the reputation of a business entity with its economic performance in the real world, and build models to predict how performance may change following a change in reputation on online recommendation systems.
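Step ii) can be sketched as follows: represent reviews as a bipartite reviewer-business graph and flag reviewers whose connectivity pattern is anomalous, here accounts that post many reviews with no rating variety, a crude proxy for fake-review "pollution". The reviewer names, businesses and the flagging rule are all invented for this toy example.

```python
from collections import defaultdict

reviews = [            # (reviewer, business, stars)
    ("alice", "cafe_a", 4), ("alice", "cafe_b", 2), ("alice", "cafe_c", 5),
    ("bob",   "cafe_a", 3), ("bob",   "cafe_d", 5),
    ("spam1", "cafe_b", 5), ("spam1", "cafe_c", 5),
    ("spam1", "cafe_d", 5), ("spam1", "cafe_e", 5),
]

# Adjacency of the bipartite graph, keyed by reviewer.
graph = defaultdict(list)
for reviewer, business, stars in reviews:
    graph[reviewer].append((business, stars))

def suspicious(reviewer, min_degree=4):
    # Flag high-degree reviewers whose ratings show zero variety.
    edges = graph[reviewer]
    ratings = {stars for _, stars in edges}
    return len(edges) >= min_degree and len(ratings) == 1

flagged = [r for r in graph if suspicious(r)]
print(flagged)  # ['spam1']
```

Real detectors would combine many such graph signals (degree, temporal bursts, dense bipartite cores) rather than a single rule, but the graph representation is the common starting point.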