
Identifying implicitly abusive remarks about identity groups using a linguistically informed approach

  • We address the task of distinguishing implicitly abusive sentences about identity groups (“Muslims contaminate our planet”) from other group-related negative polar sentences (“Muslims despise terrorism”). Implicitly abusive language comprises utterances whose abusiveness is not conveyed by abusive words (e.g. “bimbo” or “scum”). So far, the detection of such utterances could not be properly addressed, since existing datasets with a high degree of implicit abuse are fairly biased. Following the recently proposed strategy of tackling implicit abuse by separately addressing its different subtypes, we present a new, focused and less biased dataset that consists of the subtype of atomic negative sentences about identity groups. For that task, we model components that each address one facet of such implicit abuse, i.e. depiction as perpetrators, aspectual classification and non-conformist views. The approach generalizes across different identity groups and languages.
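
The abstract above describes three linguistically informed components (depiction as perpetrators, aspectual classification, non-conformist views) whose signals are combined to flag implicit abuse. As a minimal illustration of how such component signals might be combined, the sketch below uses hypothetical placeholder predicates and a toy decision rule; it is an assumption for illustration only, not the authors' implementation.

```python
# Illustrative sketch only: the component predicates and the decision rule
# below are hypothetical placeholders, not the system described in the paper.
from dataclasses import dataclass


@dataclass
class ComponentScores:
    perpetrator: bool      # group depicted as perpetrator ("Muslims contaminate our planet")
    characterizing: bool   # aspectual class indicating a characterizing (stative) claim
    non_conformist: bool   # sentence expresses a non-conformist view about the group


def classify(scores: ComponentScores) -> str:
    """Toy combination rule: flag a sentence as implicitly abusive if any
    component fires; a real system would learn or weight this combination."""
    if scores.perpetrator or scores.characterizing or scores.non_conformist:
        return "implicitly abusive"
    return "non-abusive negative sentence"


# Example: a sentence like "Muslims despise terrorism" would leave all
# components False and be classified as non-abusive.
print(classify(ComponentScores(False, False, False)))
```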

Metadata
Author:Michael Wiegand, Elisabeth Eder, Josef Ruppenhofer
URN:urn:nbn:de:bsz:mh39-112614
DOI:https://doi.org/10.18653/v1/2022.naacl-main.410
ISBN:978-1-955917-71-1
Parent Title (English):Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. July 10-15, 2022.
Publisher:Association for Computational Linguistics
Place of publication:Stroudsburg
Editor:Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Document Type:Conference Proceeding
Language:English
Year of first Publication:2022
Date of Publication (online):2022/10/07
Publishing Institution:Leibniz-Institut für Deutsche Sprache (IDS)
Publication state:Published version
Review state:Peer-reviewed
Tag:abusive language; abusive remarks; identity groups
GND Keyword:Beleidigung; Beschimpfung; Computerlinguistik; Datensatz
First Page:5600
Last Page:5612
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Computational linguistics
Program areas:P2: Spoken Corpora
Licence (English):Creative Commons - Attribution 4.0 International