The use of electronic historical dictionary data in corpus design

Bronikowska, Renata; Gruszczyński, Włodzimierz; Ogrodniczuk, Maciej; Woliński, Marcin

doi:10.4467/23005920SPL.16.003.4818

2016

journal article

Journal

Studies in Polish Linguistics

A

13

Author

Bronikowska Renata

Gruszczyński Włodzimierz

Ogrodniczuk Maciej

Woliński Marcin

Volume

11

Number

2

Pages

47-56

ISSN

1732-8160

eISSN

2300-5920

Keywords in Polish

korpus tekstów

anotacja tekstu

słownik historyczny

korpus historyczny

język średniopolski

analiza gramatyczna

Keywords in English

text corpus

text annotation

historical dictionary

historical corpus

Middle Polish

inflectional analysis

Remarks

Bibliography on p. 56

Language

English

Journal language

English

Abstract in Polish

W Pracowni Historii Języka Polskiego XVII i XVIII w. Instytutu Języka Polskiego Polskiej Akademii Nauk powstają obecnie dwie obszerne bazy danych: Elektroniczny słownik języka polskiego XVII i XVIII w. oraz Elektroniczny korpus tekstów polskich z XVII i XVIII w. (do roku 1772) - ten ostatni we współpracy z Instytutem Podstaw Informatyki PAN. Połączenie tych dwóch zasobów może pomóc zrealizować cele obu projektów. Niniejszy artykuł przedstawia korzyści, jakie mogą odnieść twórcy korpusu, używając danych słownika, m.in. poprzez wykorzystanie informacji gramatycznej z haseł słownika do budowy narzędzi do automatycznej anotacji tekstu.

Abstract in English

The History of the 17th and 18th c. Polish Language Laboratory, Institute of Polish Language, Polish Academy of Sciences, is creating two large databases: The Electronic Dictionary of the 17th-18th c. Polish and The Electronic Corpus of the 17th and 18th c. Polish Texts (up to 1772), the latter in cooperation with the Institute of Computer Science, Polish Academy of Sciences. Combining these two resources is expected to help achieve the objectives of both projects. This article shows how the Corpus creators can benefit from the data gathered in the dictionary, with special emphasis on using the grammatical information in dictionary entries to design tools for automatic text annotation in the Corpus.

cris.lastimport.wos	2024-04-10T02:21:00Z
dc.abstract.en	The History of the 17th and 18th c. Polish Language Laboratory, Institute of Polish Language, Polish Academy of Sciences, is in the process of creating two large databases: The Electronic Dictionary of the 17th-18th c. Polish and The Electronic Corpus of the 17th and 18th c. Polish Texts (up to 1772), the latter in cooperation with the Institute of Computer Science, Polish Academy of Sciences. It is expected that combining these two sets of data will help to achieve the objectives established for both database projects. The present article shows the benefits that the Corpus creators can get from the data gathered in the dictionary, with special emphasis put on the use of grammatical information included in the dictionary entries to design tools for automatic text annotation in the Corpus.	pl
dc.abstract.pl	W Pracowni Historii Języka Polskiego XVII i XVIII w. Instytutu Języka Polskiego Polskiej Akademii Nauk powstają obecnie dwie obszerne bazy danych: Elektroniczny słownik języka polskiego XVII i XVIII w. oraz Elektroniczny korpus tekstów polskich z XVII i XVIII w. (do roku 1772) - ten ostatni we współpracy z Instytutem Podstaw Informatyki PAN. Połączenie tych dwóch zasobów może pomóc zrealizować cele obu projektów. Niniejszy artykuł przedstawia korzyści, jakie mogą odnieść twórcy korpusu, używając danych słownika, m.in. poprzez wykorzystanie informacji gramatycznej z haseł słownika do budowy narzędzi do automatycznej anotacji tekstu.	pl
dc.contributor.author	Bronikowska, Renata	pl
dc.contributor.author	Gruszczyński, Włodzimierz	pl
dc.contributor.author	Ogrodniczuk, Maciej	pl
dc.contributor.author	Woliński, Marcin	pl
dc.date.accessioned	2019-05-22T15:05:24Z
dc.date.available	2019-05-22T15:05:24Z
dc.date.issued	2016	pl
dc.date.openaccess	0
dc.description.accesstime	at the time of publication
dc.description.additional	Bibliography on p. 56	pl
dc.description.number	2	pl
dc.description.physical	47-56	pl
dc.description.version	final publisher's version
dc.description.volume	11	pl
dc.identifier.doi	10.4467/23005920SPL.16.003.4818	pl
dc.identifier.eissn	2300-5920	pl
dc.identifier.issn	1732-8160	pl
dc.identifier.project	ROD UJ / OP	pl
dc.identifier.uri	https://ruj.uj.edu.pl/xmlui/handle/item/75535
dc.language	eng	pl
dc.language.container	eng	pl
dc.rights	I grant a license. Attribution - NonCommercial - ShareAlike 4.0 International	*
dc.rights.licence	CC-BY-NC-SA
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.pl	*
dc.share.type	open journal
dc.source.integrator	false
dc.subject.en	text corpus	pl
dc.subject.en	text annotation	pl
dc.subject.en	historical dictionary	pl
dc.subject.en	historical corpus	pl
dc.subject.en	Middle Polish	pl
dc.subject.en	inflectional analysis	pl
dc.subject.pl	korpus tekstów	pl
dc.subject.pl	anotacja tekstu	pl
dc.subject.pl	słownik historyczny	pl
dc.subject.pl	korpus historyczny	pl
dc.subject.pl	język średniopolski	pl
dc.subject.pl	analiza gramatyczna	pl
dc.subtype	Article	pl
dc.title	The use of electronic historical dictionary data in corpus design	pl
dc.title.journal	Studies in Polish Linguistics	pl
dc.type	JournalArticle	pl
dspace.entity.type	Publication

Open Access

Files

bronikowska_gruszczynski_ogrodniczuk_wolinski_the_use_of_electronic_historical_dictionary_data_2016.pdf (PDF, 394.44 KB)

bronikowska_gruszczynski_ogrodniczuk_wolinski_the_use_of_electronic_historical_dictionary_data_2016.odt (ODT, 394.44 KB)

License

Except as otherwise noted, this item is licensed under: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Collections

2016, Vol. 11

Humanities

ROD UJ