eXplainable Artificial Intelligence techniques for Natural Language Processing tasks

PhD program in Computer and Control Engineering

Supervisors

Tania Cerquitelli – tania.cerquitelli@polito.it
Claudio Canuto – elena.baralis@polito.it

Context of the research activity

Despite the high accuracy promised by state‐of‐the‐art deep learning models on classification tasks, their applicability in real‐life settings is still limited by their opaqueness, i.e. they behave as a black box to the end user. The eXplainable Artificial Intelligence (xAI) research field seeks new solutions to bridge the existing gap between accuracy and interpretability, but it faces many obstacles. The explainability of complex machine learning models applied to domains such as structured data and image classification has already been explored, and scientific milestones have been reached. However, Natural Language Processing (NLP) still lacks robust and specialized solutions. The main research goal of this proposal is to design innovative xAI solutions, tailored to NLP, that offer a greater level of transparency than existing methods. The vast majority of existing data analytics algorithms are opaque: their internal mechanics are not transparent, and they produce output without making clear how it was obtained. Innovative approaches will be devised to make NLP algorithms human‐readable and usable by both analysts and end users, significantly increasing explainability and user control in classification tasks.

Objectives

The research objectives address the following issues:

xAI solutions for NLP tasks. A huge amount of the data collected from people’s daily lives (e.g. web searches, social networks, e‐commerce) is textual. Black‐box predictive models tailored to NLP tasks increase the risk of inheriting human prejudices, racism, gender discrimination and other forms of bias. As NLP algorithms increasingly support different aspects of our lives, they may be misused and unknowingly support bias and discrimination if they are opaque (i.e., if their internal mechanics are not transparent and they produce output without making clear how it was obtained). Innovative xAI solutions, tailored to NLP tasks, will be designed to produce more credible and reliable information and services. They will play a key role in a large variety of application domains by making the results of the data analysis process and its models widely accessible.
Concept‐drift detection for xAI solutions. When dealing with large data collections or complex textual datasets, a model trained in the past may no longer be valid. Identifying concept drift can therefore become a key issue in the data analytics pipeline. Adaptive and interpretable strategies will be studied and tailored to NLP tasks to avoid the expensive and resource‐consuming procedure of model re‐training when it is not necessary, and to understand why and how data have changed over time.
Data and Knowledge visualization. Visualization techniques help humans to correctly interpret, interact with, and exploit data and its value. Innovative visual representations will be studied to enhance the interpretability of the internal algorithm mechanics. Keeping the user in the analytics loop by leveraging human visual perception on intermediate data analytics steps is an efficient and interpretable way to understand algorithms' decisions.

Skills and competencies for the development of the activity

The successful candidate should have good skills in computer engineering, with previous experience in data science activities. A master's thesis on these subjects and previous publications on research topics related to the project are desirable.

Further information about the PhD program at Politecnico can be found here
