HAPP : high-accuracy pipeline for processing deep metabarcoding data
Deep metabarcoding offers an efficient and reproducible approach to biodiversity monitoring, but noisy data and incomplete reference databases challenge accurate diversity estimation and taxonomic annotation. Here, we introduce a novel algorithm, NEEAT, for removing spurious operational taxonomic units (OTUs) originating from nuclear-embedded mitochondrial DNA sequences (NUMTs) or sequencing errors. It integrates ‘echo’ signals across samples with the identification of unusual evolutionary patterns among similar DNA sequences. We also extensively benchmark current tools for chimera removal, taxonomic annotation and OTU clustering of deep metabarcoding data. The best performing tools/parameter settings are integrated into HAPP, a high-accuracy pipeline for processing deep metabarcoding data. Tests using CO1 data from BOLD and large-scale metabarcoding data on insects demonstrate that HAPP significantly outperforms existing methods, while enabling efficient analysis of extensive datasets by parallelizing computations across taxonomic groups.
| dc.affiliation | Faculty of Biology : Institute of Environmental Sciences | |
| dc.contributor.author | Sundh, John | |
| dc.contributor.author | Granqvist, Emma | |
| dc.contributor.author | Iwaszkiewicz-Eggebrecht, Ela | |
| dc.contributor.author | Manoharan, Lokeshwaran | |
| dc.contributor.author | van Dijk, Laura J. A. | |
| dc.contributor.author | Goodsell, Robert | |
| dc.contributor.author | Godeiro, Nerivania N. | |
| dc.contributor.author | Bellini, Bruno C. | |
| dc.contributor.author | Orsholm, Johanna | |
| dc.contributor.author | Łukasik, Piotr - 398824 | |
| dc.contributor.author | Miraldo, Andreia | |
| dc.contributor.author | Roslin, Tomas | |
| dc.contributor.author | Tack, Ayco J. M. | |
| dc.contributor.author | Andersson, Anders F. | |
| dc.contributor.author | Ronquist, Fredrik | |
| dc.date.accessioned | 2025-11-21T15:40:51Z | |
| dc.date.available | 2025-11-21T15:40:51Z | |
| dc.date.createdat | 2025-11-19T09:38:49Z | en |
| dc.date.issued | 2025 | |
| dc.date.openaccess | 0 | |
| dc.description.accesstime | at the time of publication | |
| dc.description.number | 11 | |
| dc.description.version | final publisher's version | |
| dc.description.volume | 21 | |
| dc.identifier.articleid | e1013558 | |
| dc.identifier.doi | 10.1371/journal.pcbi.1013558 | |
| dc.identifier.eissn | 1553-7358 | |
| dc.identifier.issn | 1553-734X | |
| dc.identifier.project | DRC AI | |
| dc.identifier.uri | https://ruj.uj.edu.pl/handle/item/565862 | |
| dc.language | eng | |
| dc.language.container | eng | |
| dc.rights | I grant a license. Attribution 4.0 International | |
| dc.rights.licence | CC-BY | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/legalcode.pl | |
| dc.share.type | open-access journal | |
| dc.subtype | Article | |
| dc.title | HAPP : high-accuracy pipeline for processing deep metabarcoding data | |
| dc.title.journal | PLoS Computational Biology | |
| dc.type | JournalArticle | |
| dspace.entity.type | Publication | en |