News Release

A new and simple approach extending data expansion-based implicit discourse relation classification

Peer-Reviewed Publication

Higher Education Press

The processing flow of the proposed method


Credit: Wei SONG, Hongfei HAN, Xu HAN, Miaomiao CHENG, Jiefu GONG, Shijin WANG, Ting LIU

Discourse relation classification is a fundamental task in discourse analysis and is essential for understanding how texts are structured and connected. Implicit discourse relation classification aims to determine the relation between adjacent sentences. It is the most challenging form of discourse relation classification because there are no explicit discourse connectives to serve as linguistic cues and annotated training data is scarce. A promising remedy is to expand the training data for implicit discourse relations using explicit discourse relations, which are easy to collect. However, the expanded data often contains noise introduced during both argument pair selection and discourse relation sense assignment, which limits the improvements.

To address these problems, a research team led by Wei Song published new research on 15 August 2024 in Frontiers of Computer Science, a journal co-published by Higher Education Press and Springer Nature.

The team proposed a novel method for explicit data expansion to address these challenges. To obtain suitable argument pairs, the team introduced an argument pair type classification (APTC) task. The APTC classifier distinguishes between explicit and implicit argument pairs and selects only those explicit argument pairs that resemble natural implicit argument pairs, filtering out noisy and unsuitable pairs before data expansion. To annotate the senses of the expanded argument pairs, the team proposed a simple label-smoothing strategy. Instead of assigning a single dominant sense to a discourse connective, the method derives a smoothed sense from the distribution of senses observed for that connective. This reduces the impact of noisy sense labels that may not match the actual relation between the arguments.
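
The two steps described above can be illustrated with a minimal Python sketch. It is not the team's implementation: it assumes the smoothed sense of a connective is simply the empirical distribution of senses observed for that connective, and it stands in for the APTC classifier with a hypothetical scoring function. The function names, the toy data, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def smoothed_sense_labels(annotated_explicit):
    """Map each connective to a soft label: the share of each sense it was seen with.

    Assumption: the smoothed label is the plain empirical distribution;
    the published method may derive it differently.
    """
    counts = defaultdict(Counter)
    for connective, sense in annotated_explicit:
        counts[connective][sense] += 1
    return {
        connective: {sense: n / sum(senses.values()) for sense, n in senses.items()}
        for connective, senses in counts.items()
    }


def select_implicit_like(explicit_pairs, implicit_score, threshold=0.5):
    """Keep explicit argument pairs that a classifier scores as implicit-like.

    `implicit_score` is a hypothetical stand-in for an APTC classifier: it
    returns the estimated probability that a pair looks like a natural
    implicit pair. The threshold value is illustrative only.
    """
    return [pair for pair in explicit_pairs if implicit_score(pair) >= threshold]


# Toy (connective, sense) annotations, invented for this example.
annotated = [
    ("but", "Comparison"),
    ("but", "Comparison"),
    ("but", "Comparison"),
    ("but", "Expansion"),
    ("because", "Contingency"),
]
soft_labels = smoothed_sense_labels(annotated)

# Toy explicit pairs with made-up classifier scores.
candidates = [
    {"arg1": "It rained.", "arg2": "The match went ahead.", "connective": "but", "score": 0.8},
    {"arg1": "He left.", "arg2": "He was tired.", "connective": "because", "score": 0.2},
]
kept = select_implicit_like(candidates, implicit_score=lambda pair: pair["score"])

# Each kept pair is matched with the soft label of its connective.
expanded = [(p["arg1"], p["arg2"], soft_labels[p["connective"]]) for p in kept]
```

In this toy data, "but" receives a soft label of 0.75 Comparison and 0.25 Expansion rather than a hard Comparison label, and only the first candidate passes the filter.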

Despite its simplicity, the method proves effective in evaluations on PDTB 2.0 and PDTB 3.0. It consistently improves on previous data expansion methods and achieves performance competitive with state-of-the-art models across both datasets, for both top-level class senses and second-level type senses. The discriminative explicit argument pair selection and the label-smoothing strategy complement each other, and the best performance is achieved when both are used together. The results and analysis confirm that the proposed method extends data expansion-based implicit discourse relation classification.

DOI: 10.1007/s11704-023-3058-2
 

