Embeddings-based techniques for Media Monitoring Applications
Publications:
- Jaya Caporusso, Damar Hoogland, Mojca Brglez, Boshko Koloski, Matthew Purver, and Senja Pollak. A Computational Analysis of the Dehumanisation of Migrants from Syria and Ukraine in Slovene News Media.
- Nikola Ivačič, Andraž Pelicon, Boshko Koloski, Senja Pollak, and Matthew Purver. News sentiment analysis datasets for Serbian, Bosnian, Macedonian, Albanian and Estonian SADEmma 1.0.
- Nikola Ivačič, Matthew Purver, Fabienne Lind, Senja Pollak, Hajo Boomgaarden, and Veronika Bajt. Comparing News Framing of Migration Crises using Zero-Shot Classification.
- Matej Klemen, Aleš Žagar, Jaka Čibej, and Marko Robnik-Šikonja. SI-NLI: A Slovene Natural Language Inference Dataset and Its Evaluation.
- Nikola Ljubešić and Taja Kuzman. CLASSLA-web: Comparable Web Corpora of South Slavic Languages Enriched with Linguistic and Genre Annotation.
- Michal Mochtak, Peter Rupnik, and Nikola Ljubešić. The ParlaSent Multilingual Training Dataset for Sentiment Identification in Parliamentary Proceedings.
- Jakub Piskorski, Nicolas Stefanovitch, Senja Pollak, Zoran Fijavž, Ana Zwitter Vitez, Giovanni Da San Martino, et al. Overview of the CLEF-2024 CheckThat! Lab Task 3 on persuasion techniques.
- Aleš Žagar, Matej Klemen, Iztok Kosem, and Marko Robnik-Šikonja. SENTA: Sentence Simplification System for Slovene.
HuggingFace:
We organized SLaLaM 2023, the first Slovenian Workshop on Large Language Models: Techniques and Applications. The proceedings are available here:
CAPORUSSO, Jaya, LAVRAČ, Nada (eds.) (2023). Proceedings of SLaLaM 2023, 1st Slovenian Workshop on Large Language Models: Techniques and Applications. Bernardin, Slovenia.
Project duration: 1 October 2023 to 30 September 2026
Funding: This work was supported by the Slovenian Research and Innovation Agency research project Embeddings-based techniques for Media Monitoring Applications (L2-50070, co-funded by the Kliping d.o.o. agency).