NoHateS: A Transformers-based Approach for Real-Time Hate Speech Detection in Spanish

Alessandro Carhuancho-Bazan, Sergio Nunez-Lazo, Willy Ugarte

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-reviewed

Abstract

Hate speech detection is a challenging task, especially for real-time monitoring of online content. Manual detection is exhausting and impractical given the volume and frequency of online data. This paper proposes NoHateS, a system made of several components. The main one is BETO-CNN, a Transformers-based model trained on a Spanish corpus that classifies whether a text contains hate speech. The second component ensures accessibility: an API that allows the model to be integrated into other applications, and a Discord bot built on that API to help users detect hate speech in text channels. The paper also reports tests on imbalanced data and applies data augmentation to address the imbalance and produce more robust models. NoHateS achieves F1-scores of 72.63% and 72.94% on the non-augmented and augmented datasets, respectively, which demonstrates its effectiveness in detecting hate speech. The paper closes with recommendations for future research in this domain.
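
As an aside on the metric reported above, the following minimal sketch shows why F1-score is more informative than accuracy when classes are imbalanced. The labels are hypothetical toy data, not the paper's corpus, and the helper `f1_score` is written here for illustration only; the abstract does not state whether the reported F1 is computed for the positive class or averaged across classes, so the positive-class form below is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score for the `positive` class (0.0 when there are no true positives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: 1 = hate speech, 0 = not hate speech (8 of 10 are negative).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A degenerate classifier that never flags anything.
always_negative = [0] * 10

# A classifier that finds one of the two positives and raises one false alarm.
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

for name, y_pred in [("always_negative", always_negative), ("one_hit", one_hit)]:
    print(name, accuracy_score(y_true, y_pred), f1_score(y_true, y_pred))
```

Both toy classifiers reach the same accuracy, yet only the second one detects any hate speech, and F1 is the metric that separates them.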

Original language: English
Title of host publication: Proceedings of the 2023 IEEE 30th International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798350315578
DOI
Status: Published - 2023
Event: 30th IEEE International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2023 - Lima, Peru
Duration: 2 Nov 2023 to 4 Nov 2023

Conference

Conference: 30th IEEE International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2023
Country/Territory: Peru
City: Lima
Period: 2 Nov 2023 to 4 Nov 2023
