Dynamics of language change : the case of Polish "barzo > bardzo"

Górski, Rafał

doi:10.4467/23005920SPL.21.007.14261

2021

journal article

article

10.4467/23005920SPL.21.007.14261

Journal

Studies in Polish Linguistics

70

Author

Górski Rafał

Volume

16

Number

3

Pages

145-162

ISSN

1732-8160

eISSN

2300-5920

Keywords in Polish

językoznawstwo historyczne

zmiana językowa

okres średniopolski

językoznawstwo korpusowe

prawo Piotrowskiego

regresja logistyczna

Keywords in English

historical linguistics

language change

Middle Polish

corpus linguistics

Piotrowski's law

logistic regression

Remarks

Bibliography: pp. 161-162

Language

English

Journal language

English

Abstract in Polish

W artykule omówiono korzyści płynące z modelowania zmiany językowej za pomocą regresji logistycznej, a także ograniczenia tej metody. Fakt, że zmiana taka powinna dać się opisać we wspomniany sposób, jest nazywany prawem Piotrowskiego-Altmanna. Ilustrujemy to przykładem izolowanej zmiany, jaka wystąpiła w języku średniopolskim, a mianowicie przejściem barzo > bardzo. Dane pozyskano z historycznego korpusu języka polskiego składającego się z kilkuset tekstów i liczącego około 12 milionów słów. Regresja logistyczna oparta na całym zbiorze danych wykazuje dobre dopasowanie, wciąż jednak istnieją pewne punkty, szczególnie pod koniec procesu, które są dość daleko od wyidealizowanej trajektorii. W artykule autor stara się odpowiedzieć na pytanie, w jakim stopniu jakość korpusu wpływa na model. W tym celu przeprowadzano eksperyment: z istniejącego korpusu usuwana jest losowo pewna liczba tekstów, tak aby stworzyć mniejsze korpusy zawierające 90%, 75% i 50% tekstów korpusu wyjściowego. Ponieważ taką procedurę powtarza się 200 razy, możliwe jest porównanie rozkładu wyników wskazujących na dopasowanie modelu. Wyniki wskazują, że im mniejszy korpus, tym większy rozrzut miary dobroci dopasowania, w skrajnych wypadkach nawet lepszy niż dla pełnego korpusu. Większe korpusy dają jednak na ogół lepsze wyniki dopasowania.

Abstract in English

The paper discusses the benefits and shortcomings of modelling a language change with logistic regression, an approach often called the Piotrowski-Altmann law. The approach is illustrated with an isolated change that occurred in Middle Polish, namely barzo > bardzo. The study is based on a historical corpus of Polish consisting of several hundred texts with over 12 million running words. Logistic regression based on the entire dataset shows relatively high goodness of fit, but some data points, especially close to the end of the process, lie quite far from the idealised trajectory. The author seeks to answer the question of how far the quality of the corpus affects the model. An experiment was conducted: a number of texts were randomly removed to create smaller corpora containing 90%, 75% and 50% of the texts of the entire set. Since this procedure was repeated 200 times, it is possible to compare the distributions of the scores indicating the goodness of fit of the model. It turns out that the smaller the corpus, the more widely the goodness of fit varies, and in some rare cases it is even better than for the full corpus. Still, larger corpora generally yield better goodness-of-fit scores.
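The subsampling experiment described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the author's code: the corpus is synthetic, the time axis is centred on an arbitrary midpoint, the curve is fitted by Newton's method on per-text counts, and an unweighted R² stands in for the goodness-of-fit measure, which the abstract does not name. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Clamp the argument so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(texts, iters=15):
    """Fit p(t) = sigmoid(b0 + b1*t) to binomial counts by Newton's method.

    texts: list of (t, new_form_count, total_count) tuples, one per text.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for t, k, n in texts:
            p = sigmoid(b0 + b1 * t)
            r = k - n * p              # gradient contribution of this text
            w = n * p * (1.0 - p)      # Hessian weight of this text
            g0 += r
            g1 += r * t
            h00 += w
            h01 += w * t
            h11 += w * t * t
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        # Newton step: beta += H^{-1} g for the 2x2 system.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def r_squared(texts, b0, b1):
    """Unweighted R^2 between per-text observed shares and the fitted curve
    (an assumed stand-in for the paper's goodness-of-fit score)."""
    obs = [k / n for _, k, n in texts]
    fit = [sigmoid(b0 + b1 * t) for t, _, _ in texts]
    mean = sum(obs) / len(obs)
    ss_res = sum((o - f) ** 2 for o, f in zip(obs, fit))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def synthetic_corpus(rng, n_texts=200):
    """Invented data: (centuries from an arbitrary midpoint, 'bardzo' count,
    total 'barzo' + 'bardzo' count) for each text."""
    texts = []
    for _ in range(n_texts):
        t = rng.uniform(-1.5, 1.0)
        n = rng.randint(5, 100)
        p = sigmoid(4.0 * t + rng.gauss(0.0, 1.0))  # text-level noise
        k = sum(rng.random() < p for _ in range(n))
        texts.append((t, k, n))
    return texts

def subsample_scores(texts, share, rng, repeats=200):
    """Refit the model on `repeats` random subsets holding `share` of the texts."""
    size = int(share * len(texts))
    scores = []
    for _ in range(repeats):
        sample = rng.sample(texts, size)
        scores.append(r_squared(sample, *fit_logistic(sample)))
    return scores

rng = random.Random(0)
corpus = synthetic_corpus(rng)
full_b0, full_b1 = fit_logistic(corpus)
full_score = r_squared(corpus, full_b0, full_b1)
spread = {}
for share in (0.9, 0.75, 0.5):
    scores = subsample_scores(corpus, share, rng)
    spread[share] = (min(scores), max(scores))
    print(f"{share:.0%} of texts: R^2 from {min(scores):.3f} to {max(scores):.3f}")
print(f"full corpus: R^2 = {full_score:.3f}")
```

Comparing the printed ranges across the three shares mirrors the comparison of score distributions that the abstract describes; with real data, each tuple would come from one corpus text.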

dc.contributor.author	Górski, Rafał - 199117	pl
dc.date.accessioned	2021-11-08T07:46:07Z
dc.date.available	2021-11-08T07:46:07Z
dc.date.issued	2021	pl
dc.date.openaccess	0
dc.description.accesstime	at the time of publication
dc.description.additional	Bibliography: pp. 161-162	pl
dc.description.number	3	pl
dc.description.physical	145-162	pl
dc.description.version	publisher's final version
dc.description.volume	16	pl
dc.identifier.doi	10.4467/23005920SPL.21.007.14261	pl
dc.identifier.eissn	2300-5920	pl
dc.identifier.issn	1732-8160	pl
dc.identifier.uri	https://ruj.uj.edu.pl/xmlui/handle/item/283046
dc.language	eng	pl
dc.language.container	eng	pl
dc.rights	I grant a licence. Attribution - NonCommercial - ShareAlike 4.0 International	*
dc.rights.licence	CC-BY-NC-ND
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/4.0/legalcode.pl	*
dc.share.type	open journal
dc.subject.en	historical linguistics	pl
dc.subject.en	language change	pl
dc.subject.en	Middle Polish	pl
dc.subject.en	corpus linguistics	pl
dc.subject.en	Piotrowski's law	pl
dc.subject.en	logistic regression	pl
dc.subject.pl	językoznawstwo historyczne	pl
dc.subject.pl	zmiana językowa	pl
dc.subject.pl	okres średniopolski	pl
dc.subject.pl	językoznawstwo korpusowe	pl
dc.subject.pl	prawo Piotrowskiego	pl
dc.subject.pl	regresja logistyczna	pl
dc.subtype	Article	pl
dc.title	Dynamics of language change : the case of Polish "barzo > bardzo"	pl
dc.title.journal	Studies in Polish Linguistics	pl
dc.type	JournalArticle	pl
dspace.entity.type	Publication

Affiliations

No affiliation

Górski, Rafał

Open Access

Files

gorski_dynamics_of_language_change_2021.pdf (PDF, 478.76 KB)

gorski_dynamics_of_language_change_2021.odt (ODT, 288.68 KB)

License

Except as otherwise noted, this item is licensed under: Attribution - NonCommercial - ShareAlike 4.0 International
