Papers:
- B. Evkoski, A. Pelicon, I. Mozetič, N. Ljubešić, P. Kralj Novak. Retweet communities reveal the main sources of hate speech. PLoS ONE 17(3): e0265602, 2022.
- B. Evkoski, N. Ljubešić, A. Pelicon, I. Mozetič, P. Kralj Novak. Evolution of topics and hate speech in retweet network communities, Applied Network Science 6: 96, 2021.
- M. Cinelli, A. Pelicon, I. Mozetič, W. Quattrociocchi, P. Kralj Novak, F. Zollo. Dynamics of online hate and misinformation, Scientific Reports 11: 22083, 2021.
- B. Evkoski, I. Mozetič, N. Ljubešić, P. Kralj Novak. Community evolution in retweet networks, PLoS ONE 16(9): e0256175, 2021.
- N. Ljubešić, D. Lauc. BERTić: A Transformer Model for Bosnian, Croatian, Serbian and Montenegrin. Proc. 8th Workshop on Balto-Slavic Natural Language Processing, pp. 37-42, ACL, 2021.
- N. Ljubešić, I. Markov, D. Fišer, W. Daelemans. The LiLaH Emotion Lexicon of Croatian, Dutch and Slovene. Proceedings of the Third Workshop on Computational Modeling of People’s Opinions, Personality, and Emotion’s in Social Media, pp. 153-157, ACL, 2020.
- M. Robnik-Šikonja, K. Reba, I. Mozetič. Cross-lingual transfer of sentiment classifiers, Slovenščina 2.0 9(1): 1-25, 2021.
- F. Baider. Accountability Issues, Online Covert Hate Speech, and the Efficacy of Counter-Speech. Politics and Governance 11(2), 2023.
Models:
- Slovenian Toxicity Target: https://huggingface.co/IMSyPP/hate_speech_targets_slo
- Dutch Toxicity Target: https://huggingface.co/IMSyPP/hate_speech_targets_nl
- Hate Speech Classifier for Social Media Content in English Language: https://huggingface.co/IMSyPP/hate_speech_en
- Hate Speech Classifier for Social Media Content in Italian Language: https://huggingface.co/IMSyPP/hate_speech_it
- Hate Speech Classifier for Social Media Content in Dutch: https://huggingface.co/IMSyPP/hate_speech_nl
- Hate Speech Classifier for Social Media Content in Slovenian Language: https://huggingface.co/IMSyPP/hate_speech_slo
Data:
- Slovenian Twitter hate speech dataset IMSyPP-sl http://hdl.handle.net/11356/1398
- Slovenian Twitter dataset 2018-2020 1.0 http://hdl.handle.net/11356/1423
- English YouTube Hate Speech Corpus http://hdl.handle.net/11356/1454
- Italian YouTube Hate Speech Corpus http://hdl.handle.net/11356/1450
- Dutch Social Media Dataset https://github.com/textgain/IMSyPP-DATA
Code:
The code implementing the Ensemble Louvain algorithm is available on GitHub: https://github.com/boevkoski/ensemble-louvain.git
Deliverables:
- IMSyPP D2.1: Multilingual Hate Speech Database
- IMSyPP D4.3: Journalism Observatory 1
- IMSyPP D4.4: Journalism Observatory 2