Machine learning for sentiment analysis

PhD program in Computer and control engineering

Supervisors

Elena Baralis – elena.baralis@polito.it
Marco Mellia – marco.mellia@polito.it

Context of the research activity

Sentiment analysis is the process of analysing unstructured data (typically text) to extract the attitude of the content's creator in a specific context. This process helps convert “sentiments” into a machine‐understandable format, which in turn enables a range of applications, e.g., understanding a crowd’s opinion about a specific topic, measuring customer satisfaction, and offering special care to individuals who are having a negative experience.
Despite the advances in the field, sentiment analysis still struggles in some aspects (e.g., sarcasm detection). The availability of further content types (e.g., audio, video) may significantly improve the performance of current techniques. Hence, a key issue in sentiment analysis will be the design of machine learning techniques capable of dealing with heterogeneous content and complex data relationships.
The research activity is carried out within the SmartData@PoliTo interdepartmental centre, which brings together competences from different fields, ranging from modelling to computer programming and from communications to statistics. The candidate will join this interdisciplinary team of experts and collaborate with them.

Objectives

The objective of the research activity is the definition of novel sentiment analysis approaches that improve detection performance by exploiting heterogeneous information sources.
The following steps (and milestones) are envisioned.
Data collection and exploration. Publicly available datasets will initially be considered as benchmarks. Subsequently, new data will be collected (e.g., by means of Twitter APIs) to cover different data formats and types (e.g., images, emoji). Exploratory analysis tools will be used to characterize the data and drive the subsequent analysis tasks.
Sentiment analysis algorithm design and development. Novel algorithms tailored to the specific data analysis problem will be designed. The algorithms will exploit content derived from different information sources (e.g., by performing object detection in images) to extend the context provided by textual information. Graph representations will be considered to model concepts that are positively or negatively correlated in terms of the sentiment expressed by different users.
Deployment in real-world applications. A variety of application scenarios will be considered as targets for the proposed sentiment analysis techniques, e.g., recommender systems, review management, and the creation of content tailored to a specific audience (in the marketing domain). Big data platforms will be considered as possible development frameworks, given the large data volumes involved.
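As a purely illustrative sketch of the graph representation mentioned in the algorithm design step, the Python snippet below builds a signed graph whose nodes are concepts and whose edge weights indicate whether users tend to express the same or opposite sentiment about two concepts. The function name build_signed_concept_graph, the score range [-1, 1], the averaging rule, and the toy data are all assumptions made for this example; they are not part of the research plan and the actual representation may differ.

```python
from collections import defaultdict
from itertools import combinations


def build_signed_concept_graph(user_sentiments):
    """Return {(concept_a, concept_b): weight} for every co-rated concept pair.

    user_sentiments maps each user to {concept: score}, with scores assumed
    to lie in [-1, 1]. The weight of an edge is the mean product of the two
    scores over the users who rated both concepts: positive when users tend
    to agree in sign on the two concepts, negative when they tend to differ.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in user_sentiments.values():
        # Sort concept names so each undirected pair has one canonical key.
        for a, b in combinations(sorted(scores), 2):
            totals[(a, b)] += scores[a] * scores[b]
            counts[(a, b)] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Hypothetical toy data: per-user sentiment scores for product aspects.
users = {
    "u1": {"battery": 1.0, "camera": 1.0, "price": -1.0},
    "u2": {"battery": -1.0, "camera": -1.0},
    "u3": {"camera": 1.0, "price": -1.0},
}
graph = build_signed_concept_graph(users)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b}: {weight:+.2f}")
```

In this toy setting, a positive edge links concepts on which users express the same polarity, and a negative edge links concepts on which they express opposite polarity. On realistic data volumes the same pairwise aggregation could be expressed on a big data platform rather than in memory.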

Skills and competencies for the development of the activity

The candidate should have excellent programming skills, programming experience in the Hadoop/Spark ecosystem, and good knowledge of machine learning algorithms.

Further information about the PhD program at Politecnico can be found here
