My Past NLP Projects

Natural Language Processing

Zhenghao Wu

Status: Finished

Post Details

This post is part 3 of 6 in the Project series.

Goto: First Post | Prev: Web Design | Next: [Record] Smart Door Lock - Introduction to Robotics Course Project

I started learning NLP-related topics in mid-2018 and gradually moved on to more serious research. My focus is mainly on Natural Language Generation (NLG).

Beginning: SemEval2019

By the end of 2018, members of the AI learning group led by Prof. Weifeng Su decided to take part in SemEval 2019 in order to study and learn more about NLP.

Our group chose Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval), in which the input texts are users' tweets and the goal is to classify them.

There are three subtasks: Task A identifies whether a tweet is offensive, Task B categorizes the type of offense, and Task C identifies the target of the offense.

We used the BERT model, which was the state of the art for text classification at that time. Our experiments were limited to changing the preprocessing methods and some simple (not even proper) ensemble methods, but it was still fun to learn such an advanced model and some NLP techniques.
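For reference, here is a minimal sketch of what fine-tuning BERT for Subtask A (offensive vs. not offensive) can look like with the Hugging Face transformers library. The tweets, labels, and preprocessing step are placeholders for illustration, not our actual pipeline or the OLID data.

```python
# Minimal sketch: fine-tuning BERT for binary offensive-language classification.
# The data and preprocessing below are placeholders, not the real OffensEval setup.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tweets = ["@USER this is fine", "@USER you are awful"]   # placeholder tweets
labels = [0, 1]                                           # 0 = NOT offensive, 1 = OFF

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

def preprocess(text: str) -> str:
    # Illustrative preprocessing step (e.g. normalising user mentions).
    return text.replace("@USER", "[USER]")

enc = tokenizer([preprocess(t) for t in tweets], padding=True, truncation=True,
                max_length=128, return_tensors="pt")
batch = {**enc, "labels": torch.tensor(labels)}

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):                 # a few illustrative training steps
    out = model(**batch)           # returns a loss when labels are supplied
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```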

Our work was published as a paper in the NAACL 2019 Workshop: ACL Anthology

Final Year Project: LenAtten

We proposed a novel attention-based module that can be applied to RNN-based sequence-to-sequence models for the fixed-length summarization task. The proposed method achieved good results on the CNN/Daily Mail and English Gigaword datasets.

This work has been published at Findings of ACL 2021: ACL Anthology; Code.
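The full implementation is in the linked repository; the snippet below is only a heavily simplified PyTorch sketch of the underlying idea of length control, namely feeding the decoder an embedding of the remaining output length at every step. It is not the published LenAtten unit.

```python
# Simplified sketch: a decoder step that is told how many tokens remain,
# so generation can be steered toward an exact target length.
import torch
import torch.nn as nn

class LengthAwareDecoderStep(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 256, max_len: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.len_embed = nn.Embedding(max_len + 1, hidden)   # remaining-length embedding
        self.rnn = nn.GRUCell(hidden * 2, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, prev_token, remaining_len, state):
        # Concatenate the previous token embedding with the remaining-length
        # embedding, so the decoder always knows how many tokens are left.
        x = torch.cat([self.embed(prev_token), self.len_embed(remaining_len)], dim=-1)
        state = self.rnn(x, state)
        return self.out(state), state

# Tiny usage example with random inputs: greedily generate exactly 10 tokens.
dec = LengthAwareDecoderStep(vocab_size=1000)
state = torch.zeros(1, 256)
token = torch.tensor([2])             # e.g. a <bos> token id
for remaining in range(10, 0, -1):
    logits, state = dec(token, torch.tensor([remaining]), state)
    token = logits.argmax(dim=-1)
```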

MSc Independent Project: Cross-modal Dialogue Pre-training

Emotion-aware Multimodal Pre-training for Image-grounded Emotional Response Generation

In this work, we consider what naturally happens when two people converse: besides the content expressed through spoken language, factors such as facial expression and posture also play a role, and such non-verbal cues usually convey richer and more abstract information like emotions. Based on this observation, we proposed a method that pre-trains the language model to capture emotions from other modalities and incorporates those emotions into text generation for dialogue.

This work has been accepted at DASFAA 2022; the article can be accessed from Springer Link.
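As a rough illustration of the general idea only (not the architecture from the paper), the sketch below conditions a toy text generator on an image-derived vector and adds an auxiliary emotion prediction head. All module names, sizes, and the number of emotion classes are assumptions.

```python
# Illustrative sketch: inject an image/emotion representation into a text generator.
# The structure and sizes here are assumptions, not the model from the DASFAA paper.
import torch
import torch.nn as nn

class EmotionConditionedGenerator(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256, n_emotions: int = 7):
        super().__init__()
        self.img_proj = nn.Linear(2048, d_model)            # project CNN image features
        self.emotion_head = nn.Linear(d_model, n_emotions)  # auxiliary emotion prediction
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, image_feats, response_ids):
        img = self.img_proj(image_feats)          # (B, d_model) image/emotion vector
        emotion_logits = self.emotion_head(img)   # auxiliary emotion signal
        h0 = img.unsqueeze(0)                     # condition generation via the initial state
        x = self.tok_embed(response_ids)          # (B, T, d_model)
        out, _ = self.decoder(x, h0)
        return self.lm_head(out), emotion_logits

# Usage with random tensors standing in for real image features and token ids.
model = EmotionConditionedGenerator(vocab_size=1000)
image_feats = torch.randn(2, 2048)
response_ids = torch.randint(0, 1000, (2, 12))
logits, emotion_logits = model(image_feats, response_ids)
```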

Article Card

For "My Past NLP Projects"

Author: Zhenghao Wu
Publish & Update Date: 2021-05-07 - 2022-01-23
Tags: NLP, Natural Language Processing, Classification, Machine Learning, Offensive Detection, Summarization, Text Generation, Controllable Text Generation, Cross Modal, Dialogue, Transformer, Pre-trained Language Model
Extra Materials
  • [ACL Anthology] BNU-HKBU UIC NLP Team 2 at SemEval-2019 Task 6: Detecting Offensive Language Using BERT model
  • [ACL Anthology] LenAtten: An Effective Length Controlling Unit For Text Summarization
  • [GitHub] LenAtten
