TY - GEN
T1 - NoHateS
T2 - 30th IEEE International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2023
AU - Carhuancho-Bazan, Alessandro
AU - Nunez-Lazo, Sergio
AU - Ugarte, Willy
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Hate speech detection is a challenging task, especially in the context of real-time monitoring on the internet. Manual detection is both exhausting and impractical due to the high volume and frequency of online data. This paper proposes a system called NoHateS, which is made up of multiple components. The main one is BETO-CNN, a Transformer-based model trained on a Spanish corpus and designed to detect whether a text contains hate speech. The second component is developed to ensure accessibility: it includes an API that allows the model to be integrated into various applications, and a Discord bot that uses this API to help users detect hate speech in text channels. The paper also includes tests with imbalanced data and applies data augmentation to address the imbalance and produce more robust models. The results demonstrate the effectiveness of NoHateS in detecting hate speech: it achieves F1-scores of 72.63% and 72.94% on the non-augmented and augmented datasets, respectively. The paper also provides recommendations for future research in this domain.
AB - Hate speech detection is a challenging task, especially in the context of real-time monitoring on the internet. Manual detection is both exhausting and impractical due to the high volume and frequency of online data. This paper proposes a system called NoHateS, which is made up of multiple components. The main one is BETO-CNN, a Transformer-based model trained on a Spanish corpus and designed to detect whether a text contains hate speech. The second component is developed to ensure accessibility: it includes an API that allows the model to be integrated into various applications, and a Discord bot that uses this API to help users detect hate speech in text channels. The paper also includes tests with imbalanced data and applies data augmentation to address the imbalance and produce more robust models. The results demonstrate the effectiveness of NoHateS in detecting hate speech: it achieves F1-scores of 72.63% and 72.94% on the non-augmented and augmented datasets, respectively. The paper also provides recommendations for future research in this domain.
KW - BERT
KW - BETO
KW - Hate speech
KW - Transformer
UR - https://www.scopus.com/pages/publications/85179894711
U2 - 10.1109/INTERCON59652.2023.10326033
DO - 10.1109/INTERCON59652.2023.10326033
M3 - Conference contribution
AN - SCOPUS:85179894711
T3 - Proceedings of the 2023 IEEE 30th International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2023
BT - Proceedings of the 2023 IEEE 30th International Conference on Electronics, Electrical Engineering and Computing, INTERCON 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 November 2023 through 4 November 2023
ER -