Towards learning word representation

2016
journal article
article
dc.abstract.en: Continuous vector representations, as distributed representations of words, have gained a lot of attention in the Natural Language Processing (NLP) field. Although they are considered valuable for modeling both semantic and syntactic features, they can still be improved. For instance, an open issue is how to develop strategies for introducing knowledge about the morphology of words. This is a core point both for dense languages, where many rare words appear, and for texts that contain numerous metaphors or similes. In this paper, we extend a recent approach to representing word information. The underlying idea of our technique is to present a word as a bag of syllable and letter n-grams. More specifically, we provide a vector representation for each extracted syllable-based and letter-based n-gram and perform concatenation. Moreover, in contrast to the previous method, we accept n-grams of varied length n. Experiments on tasks such as word similarity ranking and sentiment analysis show that our method is competitive with other state-of-the-art techniques and takes a step toward more informative word representations.
dc.contributor.author: Wiercioch, Magdalena - 208738
dc.date.accessioned: 2017-10-18T07:38:55Z
dc.date.available: 2017-10-18T07:38:55Z
dc.date.issued: 2016
dc.date.openaccess: 36
dc.description.accesstime: after publication
dc.description.additional: Publisher's website: https://www.wuj.pl
dc.description.physical: 103-115
dc.description.version: publisher's final version
dc.description.volume: 25
dc.identifier.doi: 10.4467/20838476SI.16.008.6189
dc.identifier.eissn: 2083-8476
dc.identifier.issn: 1732-3916
dc.identifier.project: ROD UJ / P
dc.identifier.uri: https://ruj.uj.edu.pl/xmlui/handle/item/45270
dc.language: eng
dc.language.container: eng
dc.rights: Permitted use of protected works
dc.rights.licence: Other open license
dc.rights.uri: http://ruj.uj.edu.pl/4dspace/License/copyright/licencja_copyright.pdf
dc.share.type: open repository
dc.source.integrator: false
dc.subject.en: representation learning
dc.subject.en: NLP
dc.subject.en: n-gram model
dc.subtype: Article
dc.title: Towards learning word representation
dc.title.journal: Schedae Informaticae
dc.type: JournalArticle
dspace.entity.type: Publication
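
The abstract describes presenting a word as a bag of syllable and letter n-grams of varied length n. A minimal sketch of that extraction step might look as follows. It is an illustration under stated assumptions, not the paper's implementation: the function names `ngrams` and `bag_of_ngrams`, the `n_min`/`n_max` range, and the "-" separator for syllable n-grams are all hypothetical, and the syllable segmentation is assumed to be supplied by the caller, since the record does not say how syllables are obtained. The vector lookup and concatenation steps are not shown.

```python
from collections import Counter
from typing import Sequence


def ngrams(units: Sequence[str], n_min: int, n_max: int, sep: str = "") -> list[str]:
    """Return all contiguous n-grams over `units` for n_min <= n <= n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        # No n-grams exist when n exceeds the number of units.
        for i in range(len(units) - n + 1):
            grams.append(sep.join(units[i:i + n]))
    return grams


def bag_of_ngrams(word: str, syllables: Sequence[str],
                  n_min: int = 1, n_max: int = 3) -> Counter:
    """Build a bag (multiset) of letter and syllable n-grams for one word.

    `syllables` is an assumed, externally provided segmentation of `word`.
    Keys are tagged with their kind so the two n-gram types stay distinct.
    """
    bag = Counter()
    for gram in ngrams(list(word), n_min, n_max):
        bag[("letter", gram)] += 1
    for gram in ngrams(list(syllables), n_min, n_max, sep="-"):
        bag[("syllable", gram)] += 1
    return bag


if __name__ == "__main__":
    # Hand-made segmentation, used only for illustration.
    for key, count in sorted(bag_of_ngrams("window", ["win", "dow"], 1, 2).items()):
        print(key, count)
```

Tagging each key as "letter" or "syllable" keeps a one-syllable n-gram from colliding with an identical letter n-gram, which matters if each n-gram is later mapped to its own vector.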