Results of the PolEval 2019 Shared Task 6 : first dataset and Open Shared Task for automatic cyberbullying detection in Polish Twitter
cyberbullying
automatic cyberbullying detection
hate-speech
natural language processing
machine learning
Footnotes. Bibliography pp. 108-110
In this paper we describe the first dataset for the Polish language containing annotations of harmful and toxic language. The dataset was created to study harmful Internet phenomena such as cyberbullying and hate speech, which have recently grown dramatically in number on the Polish Internet as well as worldwide. The dataset was automatically collected from Polish Twitter accounts and annotated by layperson volunteers under the supervision of a cyberbullying and hate-speech expert. Together with the dataset we propose the first open shared task for Polish, which uses the dataset for classification of such harmful phenomena. In particular, we propose two subtasks: 1) binary classification of harmful and non-harmful tweets, and 2) multiclass classification among two types of harmful information (cyberbullying and hate speech) and other. The first installment of the shared task was a success, attracting fourteen submissions overall and thus indicating a high demand for research applying such data.
dc.abstract.en | In this paper we describe the first dataset for the Polish language containing annotations of harmful and toxic language. The dataset was created to study harmful Internet phenomena such as cyberbullying and hate speech, which have recently grown dramatically in number on the Polish Internet as well as worldwide. The dataset was automatically collected from Polish Twitter accounts and annotated by layperson volunteers under the supervision of a cyberbullying and hate-speech expert. Together with the dataset we propose the first open shared task for Polish, which uses the dataset for classification of such harmful phenomena. In particular, we propose two subtasks: 1) binary classification of harmful and non-harmful tweets, and 2) multiclass classification among two types of harmful information (cyberbullying and hate speech) and other. The first installment of the shared task was a success, attracting fourteen submissions overall and thus indicating a high demand for research applying such data. | pl |
dc.affiliation | Wydział Studiów Międzynarodowych i Politycznych : Instytut Bliskiego i Dalekiego Wschodu | pl |
dc.conference | PolEval 2019 Workshop | |
dc.conference.city | Warszawa | |
dc.conference.country | Poland | |
dc.conference.datefinish | 2019-05-31 | |
dc.conference.datestart | 2019-05-31 | |
dc.conference.weblink | http://2019.poleval.pl/index.php/publication/ | pl |
dc.contributor.author | Ptaszynski, Michal | pl |
dc.contributor.author | Pieciukiewicz, Agata | pl |
dc.contributor.author | Dybała, Paweł - 242662 | pl |
dc.contributor.editor | Ogrodniczuk, Maciej | pl |
dc.contributor.editor | Kobyliński, Łukasz | pl |
dc.date.accession | 2020-03-23 | pl |
dc.date.accessioned | 2020-03-23T18:15:51Z | |
dc.date.available | 2020-03-23T18:15:51Z | |
dc.date.issued | 2019 | pl |
dc.date.openaccess | 0 | |
dc.description.accesstime | at the time of publication | |
dc.description.additional | Footnotes. Bibliography pp. 108-110 | pl |
dc.description.conftype | international | pl |
dc.description.physical | 89-110 | pl |
dc.description.publication | 1,48 | pl |
dc.description.version | publisher's final version | |
dc.identifier.isbn | 978-83-63159-28-3 | pl |
dc.identifier.project | ROD UJ / OP | pl |
dc.identifier.uri | https://ruj.uj.edu.pl/xmlui/handle/item/152265 | |
dc.identifier.weblink | http://2019.poleval.pl/files/poleval2019.pdf | pl |
dc.language | eng | pl |
dc.language.container | eng | pl |
dc.pubinfo | Warszawa : Institute of Computer Sciences. Polish Academy of Sciences | pl |
dc.publisher.ministerial | Polska Akademia Nauk | pl |
dc.rights | I grant a license. Attribution 4.0 International | * |
dc.rights.licence | CC-BY | |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/legalcode.pl | * |
dc.share.type | other | |
dc.sourceinfo | number of authors 32; number of pages 163; number of publisher's sheets 10; | pl |
dc.subject.en | cyberbullying | pl |
dc.subject.en | automatic cyberbullying detection | pl |
dc.subject.en | hate-speech | pl |
dc.subject.en | natural language processing | pl |
dc.subject.en | machine learning | pl |
dc.subtype | ConferenceProceedings | pl |
dc.title | Results of the PolEval 2019 Shared Task 6 : first dataset and Open Shared Task for automatic cyberbullying detection in Polish Twitter | pl |
dc.title.container | Proceedings of the PolEval 2019 Workshop | pl |
dc.type | BookSection | pl |
dspace.entity.type | Publication |