Multi-task learning using Natural Language Processing to classify online content produced by young people in a mental health forum

Shujing Zhao, Gabriela Ferraro, Md Zakir Hossain, Brendan Loo Gee, Luis Salvador-Carulla

Background: Professional monitoring of users in an online forum is critical to ensure that individuals who are at risk of psychological distress are supported. There is an increasing need for trained moderators, but this can be costly for those who manage online communities. Automated text classification systems using Natural Language Processing (NLP) and single-task learning have been shown to be promising solutions for moderators. However, the limited availability of labelled datasets may restrict the performance of machine learning models trained on only one task.

Aims: This paper aims to compare the performance of multi-task learning (MTL) and single-task learning for the classification of posts in an online youth mental health forum.

Methods: We developed an MTL framework based on a deep learning model, with sentiment analysis of tweets as an auxiliary task. The auxiliary task was used to classify tweets into two discrete classes: positive or negative.
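
The shared-representation idea behind this setup can be illustrated with a minimal sketch. It is not the model used in this paper: the toy averaging encoder, the vocabulary, the number of forum classes (N_FORUM_CLASSES), the auxiliary loss weight (AUX_WEIGHT) and all class and function names are assumptions made for illustration only. The sketch shows one encoder shared by two task-specific heads and a joint loss that adds a weighted auxiliary loss to the main loss.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SharedEncoder:
    """Toy stand-in for a deep encoder: averages token embeddings."""

    def __init__(self, vocab, dim, rng):
        self.dim = dim
        self.emb = {t: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for t in vocab}

    def encode(self, tokens):
        vecs = [self.emb[t] for t in tokens if t in self.emb]
        if not vecs:
            return [0.0] * self.dim
        return [sum(col) / len(vecs) for col in zip(*vecs)]


class TaskHead:
    """Task-specific linear layer followed by softmax."""

    def __init__(self, dim, n_classes, rng):
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def predict_proba(self, hidden):
        scores = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w, self.b)]
        return softmax(scores)


def cross_entropy(probs, gold):
    """Negative log-probability of the gold class index."""
    return -math.log(probs[gold])


N_FORUM_CLASSES = 4  # assumption: the abstract does not state the number of labels
AUX_WEIGHT = 0.5     # assumption: hypothetical weight on the auxiliary loss

rng = random.Random(0)
vocab = ["i", "feel", "alone", "great", "day", "help", "love", "hate"]
encoder = SharedEncoder(vocab, dim=4, rng=rng)            # shared by both tasks
forum_head = TaskHead(4, N_FORUM_CLASSES, rng)            # main task: forum post labels
tweet_head = TaskHead(4, 2, rng)                          # auxiliary task: positive/negative

forum_probs = forum_head.predict_proba(encoder.encode(["i", "feel", "alone"]))
tweet_probs = tweet_head.predict_proba(encoder.encode(["great", "day"]))

main_loss = cross_entropy(forum_probs, 2)  # hypothetical gold labels
aux_loss = cross_entropy(tweet_probs, 1)
joint_loss = main_loss + AUX_WEIGHT * aux_loss
```

Because both heads read from the same encoder, minimising the joint loss lets the auxiliary tweet data shape the representation used for forum posts, which is the mechanism MTL relies on when labelled data for the main task are scarce.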

Results: MTL (F-score = 0.39) significantly outperformed single-task learning (F-score = 0.34) across all classes. MTL (F-score = 0.44) also outperformed single-task learning (F-score = 0.03) on posts that required urgent attention, and MTL (F-score = 0.38) outperformed single-task learning (F-score = 0.05) on the red label (posts that required an immediate response). MTL showed no improvement over single-task learning for flagged posts.
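
For readers unfamiliar with the metric, the following sketch shows how per-class and averaged F-scores of this kind can be computed. The abstract does not state which averaging scheme was used, so macro-averaging is an assumption here. The label names other than "red" and the toy gold and predicted labels are hypothetical and are not data from this study.

```python
def f1_per_class(gold, pred, labels):
    """Return a dict mapping each label to its F1 score."""
    scores = {}
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0
    return scores


def macro_f1(gold, pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    per_class = f1_per_class(gold, pred, labels)
    return sum(per_class.values()) / len(labels)


# Hypothetical labels and predictions, for illustration only.
labels = ["green", "amber", "red", "crisis"]
gold = ["green", "green", "amber", "red", "red", "crisis"]
pred = ["green", "amber", "amber", "red", "green", "green"]

per_class = f1_per_class(gold, pred, labels)
overall = macro_f1(gold, pred, labels)
```

A class that the model never predicts correctly scores 0 and pulls the macro average down, which is why per-class scores for rare, high-priority labels are reported separately from the score across all classes.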

Conclusions: The results indicate that MTL is an effective method for developing prediction models with limited data.