Results of the PolEval 2019 Shared Task 6 : first dataset and Open Shared Task for automatic cyberbullying detection in Polish Twitter

2019
book section
conference proceedings
dc.abstract.en
In this paper we describe the first dataset for the Polish language containing annotations of harmful and toxic language. The dataset was created to study harmful Internet phenomena such as cyberbullying and hate speech, which have recently grown dramatically in number on the Polish Internet as well as worldwide. The dataset was automatically collected from Polish Twitter accounts and annotated by layperson volunteers under the supervision of a cyberbullying and hate-speech expert. Together with the dataset we propose the first open shared task for Polish to utilize the dataset in classification of such harmful phenomena. In particular, we propose two subtasks: 1) binary classification of harmful and non-harmful tweets, and 2) multiclass classification between two types of harmful information (cyberbullying and hate-speech), and other. The first installment of the shared task was a success, attracting fourteen submissions overall and demonstrating a high demand for research applying such data.
dc.affiliation
Faculty of International and Political Studies : Institute of the Middle and Far East
dc.conference
PolEval 2019 Workshop
dc.conference.city
Warsaw
dc.conference.country
Poland
dc.conference.datefinish
2019-05-31
dc.conference.datestart
2019-05-31
dc.conference.weblink
http://2019.poleval.pl/index.php/publication/
dc.contributor.author
Ptaszynski, Michal
dc.contributor.author
Pieciukiewicz, Agata
dc.contributor.author
Dybała, Paweł - 242662
dc.contributor.editor
Ogrodniczuk, Maciej
dc.contributor.editor
Kobyliński, Łukasz
dc.date.accession
2020-03-23
dc.date.accessioned
2020-03-23T18:15:51Z
dc.date.available
2020-03-23T18:15:51Z
dc.date.issued
2019
dc.date.openaccess
0
dc.description.accesstime
at the time of publication
dc.description.additional
Footnotes. Bibliography pp. 108-110
dc.description.conftype
international
dc.description.physical
89-110
dc.description.publication
1.48
dc.description.version
publisher's final version
dc.identifier.isbn
978-83-63159-28-3
dc.identifier.project
ROD UJ / OP
dc.identifier.uri
https://ruj.uj.edu.pl/xmlui/handle/item/152265
dc.identifier.weblink
http://2019.poleval.pl/files/poleval2019.pdf
dc.language
eng
dc.language.container
eng
dc.pubinfo
Warsaw : Institute of Computer Sciences. Polish Academy of Sciences
dc.publisher.ministerial
Polska Akademia Nauk
dc.rights
I grant a license. Attribution 4.0 International
dc.rights.licence
CC-BY
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/legalcode.pl
dc.share.type
other
dc.sourceinfo
number of authors 32; number of pages 163; number of publisher's sheets 10;
dc.subject.en
cyberbullying
dc.subject.en
automatic cyberbullying detection
dc.subject.en
hate-speech
dc.subject.en
natural language processing
dc.subject.en
machine learning
dc.subtype
ConferenceProceedings
dc.title
Results of the PolEval 2019 Shared Task 6 : first dataset and Open Shared Task for automatic cyberbullying detection in Polish Twitter
dc.title.container
Proceedings of the PolEval 2019 Workshop
dc.type
BookSection
dspace.entity.type
Publication
Views: 143
Views per city
Wroclaw: 12
Warsaw: 10
Poznan: 9
Gdansk: 8
Krakow: 5
Chandler: 4
Los Angeles: 4
Bhubaneswar: 3
Ilford: 3
Lahore: 3
Downloads
ptaszynski_pieciukiewicz_dybala_results_of_the_poleval_2019.pdf: 199
ptaszynski_pieciukiewicz_dybala_results_of_the_poleval_2019.odt: 25