Analysis of compounds activity concept learned by SVM using robust Jaccard based low-dimensional embedding
support vector machines
locally sensitive hashing
jaccard similarity
Support Vector Machines (SVM) with an RBF kernel are among the most successful models in machine-learning-based prediction of the biological activity of compounds. Unfortunately, existing datasets are highly skewed and hard to analyze. In our research we try to answer the question of how deep the activity concept modeled by SVM is. We perform the analysis using a model which embeds compounds' representations in a low-dimensional real space using near-neighbour search with Jaccard similarity. As a result, we show that the concept learned by SVM is not much more complex than a slightly richer nearest-neighbour search. As an additional result, we propose a classification technique, based on locally sensitive hashing approximating the Jaccard similarity through the minhashing technique, which performs well on 80 tested datasets (consisting of 10 proteins with 8 different representations) while at the same time allowing fast classification and efficient online training.
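The abstract refers to approximating Jaccard similarity through minhashing. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each compound is represented as a non-empty set of non-negative integer feature indices (for example, the positions of set fingerprint bits), and all names (`jaccard`, `make_hash_params`, `minhash_signature`, `estimate_jaccard`, `PRIME`) are hypothetical. The random linear hash functions used here are a common practical stand-in for true random permutations.

```python
import random

# Assumption: feature indices are non-negative integers smaller than PRIME.
PRIME = 2_147_483_647  # the Mersenne prime 2**31 - 1


def jaccard(a, b):
    """Exact Jaccard similarity |A & B| / |A | B| of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs defining hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(features, params):
    """For each hash function, keep the minimum hash value over the set."""
    return [min((a * x + b) % PRIME for x in features) for a, b in params]


def estimate_jaccard(sig1, sig2):
    """Fraction of signature positions on which two signatures agree."""
    return sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)


params = make_hash_params(num_hashes=256, seed=42)
compound_a = set(range(0, 60))    # hypothetical fingerprint bit indices
compound_b = set(range(30, 90))   # shares 30 of 90 distinct indices with a

sig_a = minhash_signature(compound_a, params)
sig_b = minhash_signature(compound_b, params)

print("exact Jaccard:   ", jaccard(compound_a, compound_b))
print("minhash estimate:", estimate_jaccard(sig_a, sig_b))
```

The signature length (256 here) trades accuracy for speed: longer signatures give a lower-variance estimate, while short signatures are cheaper to compute and compare, which is what makes this kind of scheme attractive for fast classification.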
cris.lastimport.scopus | 2024-04-07T14:38:54Z | |
cris.lastimport.wos | 2024-04-10T02:13:29Z | |
dc.abstract.en | Support Vector Machines (SVM) with an RBF kernel are among the most successful models in machine-learning-based prediction of the biological activity of compounds. Unfortunately, existing datasets are highly skewed and hard to analyze. In our research we try to answer the question of how deep the activity concept modeled by SVM is. We perform the analysis using a model which embeds compounds' representations in a low-dimensional real space using near-neighbour search with Jaccard similarity. As a result, we show that the concept learned by SVM is not much more complex than a slightly richer nearest-neighbour search. As an additional result, we propose a classification technique, based on locally sensitive hashing approximating the Jaccard similarity through the minhashing technique, which performs well on 80 tested datasets (consisting of 10 proteins with 8 different representations) while at the same time allowing fast classification and efficient online training. | pl |
dc.affiliation | Faculty of Mathematics and Computer Science : Institute of Computer Science and Computational Mathematics | pl |
dc.contributor.author | Jastrzębski, Stanisław - 207335 | pl |
dc.contributor.author | Czarnecki, Wojciech - 115076 | pl |
dc.date.accessioned | 2016-06-16T12:21:52Z | |
dc.date.available | 2016-06-16T12:21:52Z | |
dc.date.issued | 2015 | pl |
dc.date.openaccess | 0 | |
dc.description.accesstime | at the time of publication | |
dc.description.physical | 9-19 | pl |
dc.description.version | final publisher's version | |
dc.description.volume | 24 | pl |
dc.identifier.doi | 10.4467/20838476SI.15.001.3023 | pl |
dc.identifier.eissn | 2083-8476 | pl |
dc.identifier.issn | 1732-3916 | pl |
dc.identifier.project | ROD UJ / P | pl |
dc.identifier.uri | http://ruj.uj.edu.pl/xmlui/handle/item/28039 | |
dc.language | eng | pl |
dc.language.container | eng | pl |
dc.rights | Bibliographic description only | * |
dc.rights.licence | OTHER | |
dc.rights.uri | http://ruj.uj.edu.pl/4dspace/License/copyright/licencja_copyright.pdf | * |
dc.share.type | open journal | |
dc.subject.en | support vector machines | pl |
dc.subject.en | locally sensitive hashing | pl |
dc.subject.en | jaccard similarity | pl |
dc.subtype | Article | pl |
dc.title | Analysis of compounds activity concept learned by SVM using robust Jaccard based low-dimensional embedding | pl |
dc.title.journal | Schedae Informaticae | pl |
dc.type | JournalArticle | pl |
dspace.entity.type | Publication |