Inferring hybridisation and introgression processes within Stipa (Poaceae)

A dissertation presented by Evgenii Baiakhmetov to The Faculty of Biology in partial fulfilment of the requirements for the degree of Doctor of Philosophy in the specialisation of Botany

Dissertation supervisor: prof. dr hab. Marcin Nobis, Jagiellonian University (Kraków, Poland)
Dissertation co-supervisor: dr Polina D. Gudkova, Tomsk State University (Tomsk, Russia)

Jagiellonian University
Kraków, Poland
2022

Table of Contents

Abstract
Introduction
The dissertation objectives
Methodological framework
Data acquisition
Data analysis
Chapter 1: Morphological and genome-wide evidence for natural hybridisation within the genus Stipa (Poaceae)
Article 1
Supplementary material
Chapter 2: The first draft genome of feather grasses using SMRT sequencing and its implications in molecular studies of Stipa
Article 2
Supplementary material
Chapter 3: Evidence for extensive hybridisation and past introgression events in feather grasses using genome-wide SNP genotyping
Article 3
Supplementary material
Final conclusions and perspectives
Acknowledgements
References

Abstract

Among non-model plants, the genus Stipa plays an important role. Many feather grass species are dominants and/or subdominants in steppe plant communities and are commonly used for their classification. Nonetheless, the genus has a complicated taxonomic history in which synonymisation and considerable confusion were commonplace. Only recently has the advent of sequencing technologies allowed systematists to reassess the genus. As a result, Stipa s.s. currently comprises over 150 species native to the Old World. Generally, only a few morphological characters are stable. Such a limited set of data complicates establishing reliable boundaries between species. Over the last 50 years several botanists have hypothesised that a potential explanation for the misleading species assignment is hybridisation, which may account for the origin of around 30% of feather grass species. The present thesis provides a methodological framework for taxa delimitation in the genus. Specifically, the integrative approach used here combines genome-wide SNP genotyping and classical morphometric data to infer hybridisation and introgression processes within Stipa. The work discusses the usefulness of applying integrative taxonomy to delimiting pure and admixed taxa in the face of extensive hybridisation. Importantly, the dissertation provides, for the first time, large-scale genomic data, including the first nuclear and mitochondrial genomes, for resolving phylogenetic and hybridisation issues within Stipa.
Additionally, the research establishes divergence times for a selected number of taxa within feather grasses. The results support the hypothesis that hybridisation is an important mechanism driving evolution in Stipa. As a consequence, this phenomenon complicates the identification of hybrid taxa in the field using morphological characters alone. Thus, integrative taxonomy seems to be the only reliable way to identify pure and admixed taxa and to properly resolve the phylogeny of Stipa. Moreover, the thesis argues that Stipa may be a suitable genus for studying hybridisation and introgression phenomena in nature.

Abstrakt (Polish abstract, translated)

Among non-model plants, the genus Stipa plays an important role. Many feather grasses are dominant species in steppe plant communities and/or are used to classify the communities in which they occur. The genus has a complicated taxonomic history. Only recently, with the advent of sequencing technologies, has a renewed taxonomic revision of the genus using integrative techniques become possible. As a result, Stipa s.s. comprises over 150 species native to the Old World. In general, only a few to a dozen or so morphological characters are conservative and can be used in the taxonomy of this genus. Such a limited data set unfortunately complicates establishing stable boundaries between species. Over the last 50 years several botanists have hypothesised that a potential explanation for the misleading species assignment is hybridisation, which characterises ca. 30% of feather grasses. This doctoral dissertation presents a new approach to delimiting selected feather grass taxa. An integrative approach, combining genome-wide SNP marker genotyping and classical morphometric data, was applied to detect hybridisation and introgression processes within the genus Stipa. The work discusses the usefulness of applying integrative taxonomy to distinguishing pure species from hybrids. Importantly, the dissertation provides, for the first time, large-scale genomic data comprising nuclear and mitochondrial genomes for resolving phylogenetic and hybridisation problems within Stipa. Additionally, the research establishes the divergence time of selected taxa within feather grasses. The results confirm the hypothesis that hybridisation is an important mechanism driving the evolution of feather grasses. This phenomenon complicates the identification of hybrid taxa in the field using morphological characters alone. Thus, integrative taxonomy appears to be the only reliable way to identify pure and admixed taxa and to properly resolve the phylogenetic problem of feather grasses. Moreover, the dissertation states that Stipa may be a suitable genus for studying hybridisation and introgression phenomena in nature.

Introduction

The Poaceae is indubitably one of the most important plant families to humankind because of its extraordinary role in the evolutionary history of numerous animal species, including Homo. A dramatic expansion of grasses in East Africa between 9 and 4 Mya (Jacobs et al., 1999) created novel selective pressures for early hominins. According to Rodman & McHenry (1980), human ancestors started to adapt to striding bipedalism in response to climate change. As forests shrank, new, more open grassland habitats forced the human lineage into a more energetically efficient way of walking. Over time, grasses provided another major benefit for the evolution of humans. Archaeological records demonstrate that the wild ancestors of domesticated cereals were used as a source of food well before agriculture began (Henry et al., 2011; Arranz-Otaegui et al., 2018; Dietrich et al., 2020). Some studies provide evidence of an early and fundamental shift in hominin dietary ecology that facilitated the exploitation of new habitats ca. 3–3.5 Mya (Sponheimer & Lee-Thorp, 1999; Lee-Thorp et al., 2012). Nowadays, ca. 70% of grasslands worldwide are croplands (Banwart et al., 2015).
Global use of cultivated grasses is projected to increase from 2.7 gigatonnes in 2018-2020 to 3 gigatonnes by 2030 (OECD/FAO, 2021), which is nearly 50% of the worldwide supply of food energy.

Besides the apparent economic aspects, many species of Poaceae play important ecological roles in nature. Owing to their root systems and their interactions with microbes, grasses increase soil organic matter and consequently soil fertility over time. Additionally, grasslands that are permanently covered by vegetation reduce erosion risk and the loss of topsoil particles; they improve water absorption by the substrate and decrease the risk of flooding; and they are essential habitats for biodiversity, including numerous species of birds, insects and other organisms (Bengtsson et al., 2019).

Furthermore, some non-crop grasses play significant roles in the understanding of plant biology. For instance, Brachypodium distachyon and Setaria viridis are immensely promising species for understanding and manipulating the genetics of Pooid and Panicoid grasses (Brutnell, 2015; Scholthof et al., 2018). Recently, there have also been numerous studies related to climate change in which a few non-model grasses were used, e.g., taxa from the genera Lolium (Castellanos-Frías et al., 2016), Agropyron (Zhang et al., 2019), Stipa (Schubert et al., 2019), Festuca (Farashi & Karimian, 2021) and Poa (Scrivanti & Anton, 2021).

Among non-model grasses, the genus Stipa plays an important role. Besides various climatic studies (Yang et al., 2015; Song et al., 2016; Lv et al., 2016; Lv et al., 2019; Hu et al., 2020), many species of Stipa are dominants and/or subdominants in steppe plant communities and are commonly used for their classification (Danzhalova et al., 2012; Zhao et al., 2018; Zhu et al., 2018). Nonetheless, the genus has a complicated taxonomic history in which synonymisation and considerable confusion were commonplace. The genus was described in Species Plantarum (Linnaeus, 1753) with three taxa: S.
pennata, S. juncea and S. avenacea (later Piptochaetium avenaceum). Over the last three centuries, the genus Stipa s.l. has been substantially enlarged and now accounts for over 600 taxa described from all parts of the globe except Antarctica. Numerous taxonomists contributed to the discovery and description of new species. Among the first was the German-born scientist Carl Bernhard von Trinius, who, in collaboration with the Austrian-born naturalist Franz Joseph Ruprecht, studied botanical questions at the Russian Academy of Sciences (Trinius, 1829; Trinius, 1831; Trinius & Ruprecht, 1842). In the Americas the genus was largely defined by Carlo Luigi Spegazzini (1901; 1925), Albert Spear Hitchcock (1925a; 1925b; 1951) and Mary Elizabeth Barkworth (Barkworth & Everett, 1987; Barkworth, 2007; Barkworth et al., 2008). In Africa and the Iberian Peninsula feather grasses were mainly described by René Maire (1953), Jan Otakar Martinovsky (1980), Hildemar Scholz (1989; 1991), and Francisco María Vázquez and Juan Antonio Devesa Alcaraz (1996; 1997). In Australia the taxonomic revision of Stipa was performed by Dorothy Kate Hughes (1921; 1922), Joyce Winifred Vickery, Surrey Wilfrid Laurance Jacobs and Joy Everett (Vickery, 1951; Everett & Jacobs, 1983; Vickery et al., 1986). In Eurasia essential contributions were made by, e.g., Roman Julievich Roshevitz (1916; 1920; 1929; 1934), Pavel Aleksandrovich Smirnov (1925; 1928; 1934), Norman Loftus Bor (1960; 1970), Nikolai Nikolaievich Tzvelev (1968; 1976), Jan Otakar Martinovsky (1980), Mikhail Vasilevich Klokov and Vitaliy Veniaminovich Osychnyuk (1976), Helmut Freitag (1985), Pung Chao Kuo and Yong-hua Sun (1987), Yuri Andreevich Kotukhov (2002), Zhen Lan Wu and Sylvia Mabel Phillips (2006), and Marcin Nobis and Polina Dmitrievna Gudkova (Nobis, 2009; Nobis & Gudkova, 2016; Nobis et al., 2020).

Over the past two decades, the advent of sequencing technologies has allowed systematists to reassess the genus.
As a result, some species were transferred to newly erected or to other previously described genera within the tribe Stipeae. For instance, the Mediterranean species Stipa gigantea Link, commonly called the Giant Golden Oat or the Giant Feather Grass, currently represents the monotypic genus Celtica F. M. Vázquez & Barkworth (Romaschenko et al., 2012; Govaerts et al., 2021). Species from the Americas were transferred to the genera Amelichloa Arriaga & Barkworth, Jarava Ruiz & Pav., Nassella (Trin.) E. Desv. and Piptochaetium J. Presl. Various North American taxa were assigned to the genera Hesperostipa (M. K. Elias) Barkworth, Piptatheropsis Romasch., P. M. Peterson & Soreng and Pseudoeriocoma Romasch., P. M. Peterson & Soreng, while several South American species were transferred to the genera Anatherostipa (Hack. ex Kuntze) Peñailillo and Pappostipa (Speg.) Romasch., P. M. Peterson & Soreng (Romaschenko et al., 2012; Peterson et al., 2019; Govaerts et al., 2021). The former Australian Stipa species were mainly placed in Austrostipa S. W. L. Jacobs & J. Everett, with the one exception of Stipa arundinacea (Hook.f.) Benth., which presently represents the monotypic genus Anemanthele and has the new name A. lessoniana (Steud.) Veldkamp (Romaschenko et al., 2012; Peterson et al., 2019; Govaerts et al., 2021). A few Eurasian taxa were assigned to Achnatherum P. Beauv., Neotrinia (Tzvelev) M. Nobis, P. D. Gudkova & A. Nowak, Patis Ohwi, Piptatherum P. Beauv., Ptilagrostis Griseb., Timouria Roshev., Trikeraia Bor and the monotypic genus Orthoraphium Nees with the species O. roylei Nees, formerly S. orthoraphium Steud. (Romaschenko et al., 2012; Peterson et al., 2019; Govaerts et al., 2021). Thus, the genus Stipa s.s. currently comprises over 150 cool-season species common in Eurasia and North Africa (Nobis, 2014; Nobis et al., 2020).
Nonetheless, although molecular barcodes redefined feather grasses as a monophyletic genus (Hamasha et al., 2012; Romaschenko et al., 2012), there is still no consensus on the infrageneric classification. Prior to the era of DNA barcoding, several prominent agrostologists proposed their views on the issue based exclusively on morphology (Roshevitz, 1934; Martinovsky, 1966, 1967, 1970, 1976, 1980; Bor, 1970; Tzvelev 1974, 1976, 1993; Klokov & Osychnyuk, 1976; Freitag, 1985; Moraldo, 1986; Vázquez & Gutiérrez, 2011). For instance, Tzvelev (1974, 1976, 1993) proposed nine sections within feather grasses, specifically: (1) Achnatheropsis Tzvelev, (2) Barbatae Junge, (3) Leiostipa Dumort., (4) Pseudoptilagrostis Tzvelev, (5) Regelia Tzvelev, (6) Smirnovia Tzvelev, (7) Stipa, (8) Stipella Tzvelev and (9) Subbarbatae Tzvelev. Freitag (1985), in contrast, also recognised nine sections but with a different composition: (1) Achnatheropsis, (2) Aristella (Trin.) Steud., (3) Barbatae, (4) Lasiagrostis (Link) Steud., (5) Orthoraphium (Nees) Steud., (6) Pseudoptilagrostis, (7) Ptilagrostis, (8) Stipa and (9) Stipella. More recently, however, Stipa s.s. has been divided into only six sections: (1) Barbatae, (2) Leiostipa, (3) Pseudoptilagrostis, (4) Regelia, (5) Stipa and (6) Smirnovia; nevertheless, subdivisions within the genus are still not consistently supported by available molecular data (Kellogg, 2015).

A fundamental problem in resolving taxonomic issues in feather grasses solely on the basis of phenotypic traits is the wide plasticity of those traits. Generally, only a few morphological characters are stable. Such a limited set of data complicates establishing reliable boundaries between species. Consequently, it makes delimiting infrageneric ranks challenging (Tzvelev, 1976; Scholz, 1985; Freitag, 1985; Strid, 1991; Gonzalo et al., 2013; Nobis et al., 2016; Gao et al., 2018).
In general, feather grasses are caespitose plants with C3 photosynthesis; leaves frequently have convolute or involute blades; flowers have three lodicules and two or three stigmas (rarely four); glumes are acuminate and much longer than the flower, with three to seven veins; the callus of the flower is long and sharp, with the tip oblique or slightly curved to a sharp point, generally with long hairs; lemmas have overlapping margins and an apex lacking lobes; paleae are as long as the lemmas or slightly longer, and the apex is partly wrinkled; awns can be unigeniculate or bigeniculate, scabrous to variously pilose, reaching from 2 to 50 cm in length; the column is twisted (Kellogg, 2015; Nobis et al., 2020); the vast majority of species have 44 chromosomes (2n = 4x = 44) and are presumed to be tetraploids (Romaschenko et al., 2012; Tkach et al., 2021).

Over the last 50 years several botanists have hypothesised that a potential explanation for the misleading species assignment in feather grasses is hybridisation (Smirnov, 1970; Tzvelev, 1976; Kotukhov, 2002; Nobis, 2013). Solely on the basis of morphology, a hybrid origin can be attributed to, e.g., S. × czerepanovii Kotukhov (= S. orientalis Trin. × S. richteriana Kar. & Kir.); S. × fallax M. Nobis & A. Nowak (= S. drobovii (Tzvel.) Czer. × S. macroglossa P. A. Smirn. subsp. macroglossa); S. × gegarkunii P. A. Smirn. (= S. pulcherrima K. Koch × S. caucasica Schmalh.); S. × hissarica M. Nobis (= S. lipskyi Roshev. × S. orientalis Trin.); S. × kamelinii Kotukhov (= S. orientalis × S. zaissanica Kotukhov); S. × manrakica Kotukhov (= S. caucasica × S. macroglossa P. A. Smirn. subsp. kazachstanica (Kotukhov) M. Nobis); S. × tadzhikistanica M. Nobis (= S. lipskyi × S. caucasica); S. × talassica Pazij (= S. macroglossa × S. caucasica); S. × tzveleviana Kotukhov (= S. orientalis × S. macroglossa subsp. kazachstanica) and S. × zaissanica Kotukhov (= S. orientalis × S. hohenackeriana Trin.
& Rupr.; Smirnov, 1970; Nobis, 2013; Nobis & Gudkova, 2016; Nobis et al., 2017; Nobis et al., 2020). Nevertheless, the assumption that hybridisation plays an important role in Stipa needed to be verified.

Heretofore, only one study has been performed to validate a putative hybrid origin in feather grasses based on morphological and molecular data (Nobis et al., 2019). This research demonstrated that Stipa × heptapotamica Golosk. is a hybrid taxon whose parental species are S. richteriana Kar. & Kir. and S. lessingiana Trin. & Rupr. Surprisingly, these taxa are morphologically distant and presently represent two different sections, Leiostipa and Subbarbatae (Tzvelev, 1974; 1976; 1993; 2012). Nevertheless, recent phylogenetic studies demonstrate that S. richteriana and S. lessingiana are genetically closely related (Krawczyk et al., 2017; Krawczyk et al., 2018). Furthermore, it was shown that Stipa × heptapotamica can produce fertile pollen grains and is therefore able to backcross with both parental species (Nobis et al., 2019). Nonetheless, the study also disclosed a substantial methodological drawback of using dominant ISSR (inter simple sequence repeat) markers (Zietkiewicz et al., 1994). On the one hand, it is a cost-efficient technique for the detection of DNA polymorphism. On the other hand, it is a time-consuming approach that occasionally raises problems with reproducibility. However, the main limitation of ISSR markers is the low number of polymorphic loci in comparison to more advanced approaches, e.g., the restriction-based high-throughput techniques RADseq (Baird et al., 2008), genotyping-by-sequencing (Elshire et al., 2011) and DArTseq (Kilian et al., 2012). Thus, since species boundaries among Stipa taxa may be vague owing to probable hybridisation and introgression events in the genus, new studies required a more reliable and time-efficient molecular approach to infer genetic structure within feather grasses.
Moreover, considering that over 30% of Stipa species may have a hybrid origin (Nobis et al., 2019), applying second-generation sequencing (better known as next-generation sequencing) has been suggested as a way to address phylogenetic and hybridisation issues in the genus.

The dissertation objectives

The main goal of the thesis is to provide new knowledge on the existence of hybridisation and introgression processes within feather grasses using cutting-edge technologies and to elaborate an appropriate methodological framework for taxa delimitation. Specifically, the following objectives were set:

1. Assess the usefulness of applying the restriction-based high-throughput technique DArTseq for genome-wide SNP genotyping and inferring hybridisation and introgression processes within Stipa, mainly from the section Leiostipa.
2. Assess the usefulness of applying integrative taxonomy to delimiting pure and admixed taxa.
3. Provide large-scale genomic data comprising the first nuclear and mitochondrial genomes for resolving phylogenetic and hybridisation issues within feather grasses.
4. Establish divergence times for a selected number of taxa within the genus.
5. Demonstrate that feather grasses may be a suitable genus for studying hybridisation and introgression events in nature.

Methodological framework

Several methods were applied to address the research objectives. They are outlined in the "Materials and methods" sections of the research articles in the respective chapters of the dissertation. Here I briefly itemise the main methods:

1. Light microscopy.
2. Scanning electron microscopy (SEM).
3. DNA isolation (a column-type extraction with a kit).
4. Diversity Arrays Technology (DArT) complexity reduction method.
5. Second-generation sequencing (Illumina platform).
6. Third-generation sequencing (PacBio platform).

Data acquisition

The plant material used in the dissertation was collected over the past 10 years mainly by my supervisors, prof. dr hab.
Marcin Nobis and dr Polina Gudkova. Other contributors are listed as collectors in Supplementary Tables S1 (Chapters 1 and 3) and in Supplementary Table S4 (Chapter 2). All morphological measurements were made under an SMZ800 light microscope (Nikon, Japan), and these data are available upon request. The raw sequence data related to the draft genome project were generated with the PacBio sequencing platform by SNPsaurus (Eugene, USA). All of these data, raw and processed files as well as the assembled genomes, are listed in the "Data availability" section of Chapter 2, whereas the Illumina data were produced by Macrogen Inc. (Seoul, South Korea) and are available upon request. The remaining articles (Chapters 1 and 3) include Illumina data processed by the proprietary analytical pipelines of DArT Pty Ltd. (Canberra, Australia), and the raw files cannot be requested. Nonetheless, the SNP data may be accessed in the genlight format upon request (Chapter 1) or via a link (Chapter 3, "Availability of data and materials"). Additionally, the peer review reports for the articles can be accessed upon request (Articles 1 and 2) and online for Article 3 via the link t.ly/nsLK.

Data analysis

All morphometric and genetic data were processed and analysed using free and open-source software. The only exceptions are (1) a trial version of Geneious Prime v. 2021.1.1 (https://www.geneious.com), which was used in Chapter 2 for the chloroplast and mitochondrial genome annotations and their subsequent submission to GenBank via a plugin implemented in the program, and (2) ArcGIS Pro 2.7.1 (ESRI, Redlands, USA), which was used to create maps in Chapter 3. The remaining programs, including R packages and their versions, are listed in each chapter.

Chapter 1: Morphological and genome-wide evidence for natural hybridisation within the genus Stipa (Poaceae)

The chapter includes the first published article on the use of integrative taxonomy as a tool for revealing hybridisation events in Stipa.
The novelty of the study lies in applying the restriction-based high-throughput technique DArTseq for genome-wide SNP genotyping. Previously, this approach had not been used in studies on feather grasses. Moreover, two types of data were tested here: (1) co-dominant SNP markers obtained with DArTseq and (2) dominant SilicoDArT markers, which merely represent the presence or absence of restriction fragments. Importantly, in the field, on the south shore of Lake Issyk-Kul (Kyrgyzstan), the putative hybrids selected for the analyses showed morphological characters intermediate between two parental species from the same locality, S. krylovii Roshev. and S. bungeana Trin. Thus, it was an appropriate set of taxa for testing the DArTseq technique. Additionally, for the first time in the genus, a factor analysis of mixed data, which combines quantitative and qualitative characters, was used.

Article 1

Title: Morphological and genome-wide evidence for natural hybridisation within the genus Stipa (Poaceae)
Journal: Scientific Reports, volume 10, article number: 13803 (2020)
2-year impact factor: 4.379
5-year impact factor: 5.133
The Ministry of Science and Higher Education of Poland: 140 points
DOI: https://doi.org/10.1038/s41598-020-70582-1

www.nature.com/scientificreports

Morphological and genome-wide evidence for natural hybridisation within the genus Stipa (Poaceae)

Evgenii Baiakhmetov1,2, Arkadiusz Nowak3,4, Polina D. Gudkova2,5 & Marcin Nobis1

Hybridisation in the wild between closely related species is a common mechanism of speciation in the plant kingdom and, in particular, in the grass family. Here we explore the potential for natural hybridisation in Stipa (one of the largest genera in Poaceae) between genetically distant species at their distribution edges in the Mountains of Central Asia using integrative taxonomy.
Our research highlights the applicability of classical morphological and genome reduction approaches in studies on wild plant species. The obtained results revealed a new nothospecies, Stipa × lazkovii, which exhibits characters intermediate between S. krylovii and S. bungeana. A high-density DArTseq assay disclosed that S. × lazkovii is an F1 hybrid, and established that the plastid and mitochondrial DNA was inherited from S. bungeana. In addition, molecular markers detected a hybridisation event between the morphologically and genetically distant species S. bungeana and probably S. glareosa. Moreover, our findings revealed uncertainty about the taxonomic status of S. bungeana, which currently belongs to the section Leiostipa but is genetically closer to S. breviflora from the section Barbatae. Finally, we noticed a discrepancy between the current molecular data and previous findings on S. capillata and S. sareptana.

Hybridisation in the wild between closely related species is a common mechanism of speciation in the plant kingdom [1–7]. Due to the prevalence of polyploidy found in angiosperms it has been estimated that around 11% of flowering plants may have arisen through hybridisation events [1]. In addition, speciation via hybridisation can lead to an equal ploidy number within parental and newly formed species. In general, hybridisation is often accompanied by introgression and causes gene transfer between species via repeated backcrossing [4, …]. On the one hand, it may have contributed to species diversity and speciation […, 12, 13]; on the other, deleterious consequences of hybridisation such as decreased fitness, genetic assimilation and gene swamping may drive populations toward the brink of extinction [14–16]. In the grass family (Poaceae) hybridisation and introgression are well studied mainly for economically important plants, such as wheats [17,18], maize [19,20], rice [21,22], barley [23,24], oats [25,26], rye [27,28], sugarcanes [29,30], and sorghums [31,32].
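The abstract above reports that S. × lazkovii is an F1 hybrid. One common way to reason about such a claim with co-dominant SNPs is to look at loci where the two parental species are fixed for different alleles: a first-generation hybrid is expected to be heterozygous at essentially all of them, whereas backcrosses are heterozygous at only part of them. The sketch below illustrates that logic only; the genotype coding (0, 1 or 2 copies of the allele fixed in parent B), the toy data and the function name are assumptions for illustration and are not the analysis performed in the article.

```python
from typing import Sequence, Tuple


def hybrid_summary(genotypes: Sequence[int]) -> Tuple[float, float]:
    """Return (hybrid index, interspecific heterozygosity) for one individual.

    genotypes: counts (0, 1 or 2) of the parent-B allele at diagnostic loci,
    i.e. loci assumed to be fixed for different alleles in the two parents.
    """
    if not genotypes:
        raise ValueError("at least one diagnostic locus is required")
    n_loci = len(genotypes)
    # Proportion of alleles derived from parent B (0 = pure A, 1 = pure B).
    hybrid_index = sum(genotypes) / (2 * n_loci)
    # Proportion of loci carrying one allele from each parent.
    heterozygosity = sum(1 for g in genotypes if g == 1) / n_loci
    return hybrid_index, heterozygosity


# Hypothetical individuals scored at eight diagnostic loci.
f1_like = [1, 1, 1, 1, 1, 1, 1, 1]
backcross_like = [0, 1, 0, 1, 1, 0, 0, 1]
parent_b_like = [2, 2, 2, 2, 2, 2, 2, 2]

print(hybrid_summary(f1_like))         # (0.5, 1.0)
print(hybrid_summary(backcross_like))  # (0.25, 0.5)
print(hybrid_summary(parent_b_like))   # (1.0, 0.0)
```

A hybrid index near 0.5 combined with heterozygosity near 1 is the pattern expected for an F1; real data would also need to account for genotyping error and loci that are not strictly diagnostic.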
Nowadays new molecular markers and technologies that first came to the field of agriculture are becoming widely used in studies of wild populations with little or no previous genomic information. For instance, genotyping-by-sequencing (GBS) and GBS-like approaches that were initially developed for maize and barley [33] help to detect hybridisation and introgression events in many wild plant genera [34–38].

The genus Stipa L. belongs to the subfamily Pooideae, which together with Bambusoideae (bamboos) and Oryzoideae (rices) forms the so-called BOP clade [39]. The BOP species are known as the "cool season" or "pooid" grasses; all are C3 and distributed in temperate climates [40]. Following Tzvelev (1974), the genus Stipa includes six main sections, Barbatae Junge, Leiostipa Dumort., Pseudoptilagrostis Tzvelev, Regelia Tzvelev, Stipa, and Smirnovia Tzvelev [41], and comprises over 150 species native to Asia, Europe and North Africa [42,43]. In its strict sense, the genus is monophyletic [44,45], but subdivisions within the genus are not consistently supported by available molecular data [43,48]. Species of the genus are dominants and/or subdominants in steppe plant communities [47–50], can be used for their classification [51], and are used in studies related to climate change [52–54]. Moreover, the species are of great economic importance mainly as pasture and fodder plants, especially in the early phases of development; they can be used for soil remediation processes [56] and as ornamental plants (e.g. S. capillata L., S. pulcherrima K. Koch, S. pennata L.).

For decades it has been hypothesised that some Stipa taxa arose via hybridisation. According to our observations, Stipa hybrids reproduce vegetatively and, less frequently, sexually [60]. It was recently shown that hybrids in Stipa can produce fertile pollen grains and are therefore able to backcross with both parental species [61]. In addition, based on morphology, a hybrid origin can be attributed to ca. 30% of Stipa species; in Middle Asia alone, 23 of 72 species are regarded as nothospecies [43]. Such taxa include, for instance, S. × czerepanovii Kotukhov (= S. orientalis Trin. × S. richteriana Kar. & Kir.); S. × fallax M. Nobis & A. Nowak (= S. drobovii (Tzvel.) Czer. × S. macroglossa P. A. Smirn. subsp. macroglossa); S. × gegarkunii P. A. Smirn. (= S. caucasica Schmalh. × S. pulcherrima K. Koch); S. × hissarica M. Nobis (= S. lipskyi Roshev. × S. orientalis Trin.); S. × tzveleviana Kotukhov (= S. orientalis × S. macroglossa subsp. kazachstanica); and S. × zaissanica Kotukhov (= S. orientalis × S. hohenackeriana Trin. & Rupr.) [43,60,62,63].

Figure 1. Distribution map showing (a) the general ranges of S. krylovii (green) and S. bungeana (red), with the dashed line indicating the hypothetical border, and (b) localities of the examined specimens used for the molecular analysis. The map is based on Google Maps.

1Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa 3, 30-387 Kraków, Poland. 2Research Laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave, 634050 Tomsk, Russia. 3Botanical Garden-Centre for Biological Diversity Conservation, Polish Academy of Sciences, Prawdziwka 2, 02-973 Warszawa, Poland. 4Institute of Biology, Opole University, Oleska 22, 45-052 Opole, Poland. 5Department of Biology, Altai State University, Lenin 61 Ave, 656049 Barnaul, Russia. email: evgenii.baiakhmetov@doctoral.uj.edu.pl; m.nobis@uj.edu.pl

Scientific Reports | (2020) 10:13803 | https://doi.org/10.1038/s41598-020-70582-1
Heretofore, all putative hybrid taxa within Stipa were described based exclusively on morphological comparison. The only exception is Stipa × heptapotamica Golosk., whose origin has been established using molecular methods [61]. Although its parental species, Stipa richteriana Kar. & Kir. and S. lessingiana Trin. & Rupr., are morphologically distant and assigned to the different sections Leiostipa and Subbarbatae Tzvelev [41,58,64], genetically they are closely related [65,66] and able to hybridise with each other [61].

During field studies in eastern Kyrgyzstan in 2015 and 2017, interesting specimens of Stipa, combining characters not observed in the previously described taxa, were found on the south shore of Lake Issyk-Kul (Fig. 1). Because these specimens appeared to be morphologically intermediate between two species from the same locality, we hypothesised that they could be hybrids between S. krylovii Roshev. and S. bungeana Trin. Although both putative parental taxa were traditionally assigned to the section Leiostipa [58], they are phylogenetically distant and belong to two different clades [61,65]. Both have wide distribution ranges: Stipa krylovii occurs in the Russian Far East and Southern Siberia, Mongolia, China, Northern Nepal, Southern Tajikistan, Eastern Kazakhstan, and Eastern Kyrgyzstan [43,67], whereas S. bungeana is distributed in Southern Mongolia, China, and Eastern Kyrgyzstan [68,69] (Fig. 1a).

Since hybrids between genetically distant Stipa species have not previously been observed in nature, in the current study, using integrative taxonomy based on morphology and high-density genome-wide genotyping-by-sequencing data, we aim to (1) obtain insight into the extent of hybridisation between S. krylovii and S. bungeana at the macro- and micromorphological levels; (2) assess levels of inter-species gene flow (if present) between the examined Stipa taxa; and (3) analyse the usefulness of SilicoDArT and SNP markers for genomic studies in Stipa.

Results

Numerical analysis.
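The numerical analysis reported in this section rests on a factor analysis of mixed data (FAMD). As a rough illustration of how FAMD places quantitative and qualitative characters on a common footing before a principal component step, the sketch below standardises each quantitative column and turns each qualitative column into indicator columns divided by the square root of the category proportion, then centred. The toy measurements, variable names and helper functions are hypothetical; the figures in the study were produced with R packages, and this is not the authors' code.

```python
from collections import Counter
from math import sqrt
from statistics import mean, pstdev, pvariance


def scale_quantitative(column):
    """Centre to zero mean and scale to unit (population) variance."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s for x in column]


def encode_qualitative(column):
    """One indicator column per category, divided by sqrt(category proportion), then centred."""
    n = len(column)
    encoded = {}
    for category, count in sorted(Counter(column).items()):
        proportion = count / n
        raw = [(1.0 if x == category else 0.0) / sqrt(proportion) for x in column]
        m = mean(raw)
        encoded[category] = [v - m for v in raw]
    return encoded


# Hypothetical toy data: two quantitative characters (mm) and one qualitative character.
anthecium_length = [5.1, 5.4, 9.8, 10.2, 7.5, 7.9]
callus_length = [1.2, 1.3, 2.9, 3.1, 2.0, 2.2]
leaf_surface = ["glabrous", "glabrous", "prickles", "prickles", "glabrous", "prickles"]

columns = [scale_quantitative(anthecium_length), scale_quantitative(callus_length)]
columns += list(encode_qualitative(leaf_surface).values())

# Each quantitative variable contributes inertia 1; a qualitative variable with
# K observed categories contributes K - 1, so here 1 + 1 + (2 - 1) = 3.
total_inertia = sum(pvariance(c) for c in columns)
print(round(total_inertia, 6))  # 3.0
```

A principal component analysis of the resulting matrix then yields the dimensions; the share of total inertia carried by each dimension corresponds to the "explained variability" percentages quoted for such analyses.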
The factor analysis of mixed data (FAMD) revealed six markedly differentiated groups of OTUs in accordance with the taxonomic classification of the examined taxa (Fig. 2). The first three dimensions explained 41.71%, 13.64%, and 10.14% of the total variability, respectively. The first dimension is composed, in order of descending contribution, of the quantitative variables AL, Col1L, CL, LG, and CvH (Supplementary Table S2; for character abbreviations see Table 1). The second dimension is composed, in order of descending contribution, of the quantitative variables DDL, LHTA, LigIV, WVS, LHD, and SL, and the qualitative variable HTTA (Supplementary Table S2). The third dimension is composed, in order of descending contribution, of the quantitative variables HLCol2, HLCol1, WCol1, and CBW, and the qualitative variable AdSVL (Supplementary Table S2). The two-dimensional plot revealed an overlap of OTUs belonging to S. breviflora and S. bungeana, whereas OTUs of S. sareptana slightly overlap with OTUs of S. krylovii and S. capillata (Fig. 2a).

Figure 2. Factor analysis of mixed data performed on 22 quantitative and three qualitative characters of the six examined species of Stipa. (a) Plot of the two principal axes, (b) Plot of the three principal axes. The figure was created using the R-packages factoextra v.1.0.6 (Fig. 2a), https://CRAN.R-project.org/package=factoextra/, and plotly v.4.9.2 (Fig. 2b), https://plotly.com/r/getting-started/.

Table 1. Morphological characters used in the present study.

Character | Abbreviation
Quantitative characters (mm) |
Width of blades of vegetative shoots | WVS
Length of ligules of the middle cauline leaves | LigC
Length of ligules of the internal vegetative shoots | LigIV
Length of lower glume | LG
Length of anthecium | AL
Width of anthecium | AW
Length of callus | CL
Length of hairs on the dorsal part of callus | CdH
Length of hairs on the ventral part of callus | CvH
Length of callus base | CBL
Width of callus base | CBW
Length of hairs on the dorsal line on lemma | LHD
Length of hairs on the ventral line on lemma | LHV
Distance from the end of dorsal line of hairs to the top of lemma | DDL
Distance from the end of ventral line of hairs to the top of lemma | DVL
Length of hairs on the top of lemma | LHTA
Length of lower segment of awn | Col1L
Length of middle segment of awn | Col2L
Length of seta | SL
Length of hairs on lower segment of awn | HLCol1
Length of hairs on middle segment of awn | HLCol2
Width of lower segment of awn | WCol1
Qualitative characters |
Character of abaxial surface of vegetative leaves (glabrous, with prickles) | AbSVL
Character of adaxial surface of vegetative leaves (short hairs, long hairs, mixed) | AdSVL
Type of hairs on the top of anthecium (glabrous, poorly developed, well developed) | HTTA

A clear separation of the OTUs can be seen in the three-dimensional plot, where differences between the studied species are explained by the third principal axis (Fig. 2b and the interactive three-dimensional plot available at https://plot.ly/~eugenebayahmetov/3/). In particular, the third axis separates S. breviflora and S. bungeana into clearly non-overlapping clouds of OTUs. In addition, the notch plots showed significant differences between means and strong evidence of differing medians among all taxa for CL; AL shows differences among all taxa except the pair S. krylovii and S. sareptana; Col1L shows differences among all taxa except the pair S. bungeana and S. breviflora; LG shows differences among all taxa except the pairs S.
capillata and S. sareptana, as well as S. breviflora and the putative hybrid (S. bungeana × S. krylovii), here and below named S. × lazkovii; the SL variable shows differences among all taxa except the pairs S. × lazkovii and S. capillata, and S. krylovii and S. sareptana (Supplementary Fig. S1).

Figure 3. Micromorphological patterns of Stipa krylovii (a-e), S. × lazkovii (f-j) and S. bungeana (k-o): top of lemma (a, f, k), lemma abaxial surface (b-c, g-h, l-m), adaxial surface of leaf blade (d, i, n), abaxial surface of leaf blade (e, j, o). Abbreviations: h - hooks; lc - long cells; mh - macrohairs; pr - prickles; sb - silica bodies.

Seven notch plots show significant differences between means and strong evidence of differing medians among S. bungeana, S. krylovii, and their putative hybrid: AL, CL, Col1L, SL, WCol1, LG, and WVS (Supplementary Fig. S1). At the same time, S. bungeana, S. krylovii and S. × lazkovii share six characters with no significant differences between their means: CvH, CBL, CBW, DVL, HLCol1, and HLCol2. Further, S. × lazkovii and S. krylovii share seven characters with no significant differences between their means but differ from S. bungeana: CdH, DDL, Col2L, LigC, LigIV, AW, and LHTA. Finally, only two characters, LHD and LHV, show no significant differences between means within the pairs S. × lazkovii and S. krylovii, and S. × lazkovii and S. bungeana, but differ significantly between S. krylovii and S. bungeana (Supplementary Fig. S1).

Micromorphology. The micromorphological examination of Stipa bungeana, S. krylovii and their putative hybrid revealed a lemma pattern that is typical for the genus Stipa (Fig. 3). In all three taxa, the fundamental long cells are rectangular to more or less square in shape. The side walls of long cells are raised and undulate.
Silica bodies are sparse or absent; if present, they are reniform to ovate. Cork cells are absent. Hooks are frequent and oriented towards the lemma apex, whereas prickles are present mostly near the lemma apex (Fig. 3). Macrohairs are straight or bent near the base, cylindrical and/or string-like, with a bulbous base and a needle-like apex. They are organised in seven lines. The lemma apex is scabrous due to abundant hooks, prickles and short macrohairs (present especially in S. bungeana and in the hybrid), surpassed by a ring of unequal macrohairs. The pattern of the lemma apex clearly shows the intermediate character of S. × lazkovii between the two putative parents (Figs. 3a, 3f, 3k).

Figure 4. Principal Coordinates Analysis plot based on genetic distances between samples. (a) Plot of the two principal axes based on SilicoDArT markers, (b) Plot of the three principal axes based on SilicoDArT markers, (c) Plot of the two principal axes based on SNP markers, (d) Plot of the three principal axes based on SNP markers. The figure was created using the R-packages ggplot2 v.3.3.0 (Figs a and c), https://ggplot2.tidyverse.org/, and plotly v.4.9.2 (Figs b and d), https://plotly.com/r/getting-started/.

DArTseq analysis. A total of 137,437 SilicoDArT and 125,850 SNP markers were obtained using a DArTseq high-density assay, of which 76,604 SilicoDArT and 19,133 SNP markers were kept after the filtering steps. The first two axes of the principal coordinates analysis (PCoA) explained 77% and 91% of the total genetic divergence within the studied taxa based on the SilicoDArT and SNP markers, respectively, whereas the third axes explained only 6.3% and 3% (Fig. 4). In general, based on genetic similarities, both marker types revealed six markedly differentiated groups (Fig. 4). Most of the specimens are grouped according to their taxonomic
classifications. However, one sample (ID0494394), which was morphologically somewhat similar to S. breviflora, grouped with the S. bungeana OTUs, far from the remaining OTUs of S. breviflora. All S. × lazkovii specimens occupy an intermediate position between S. bungeana and S. krylovii, suggesting an admixed origin. In addition, on the basis of two axes, neither marker type allowed S. capillata and S. sareptana to be differentiated. On the other hand, a difference can be seen in the three-dimensional plot based on SilicoDArT markers (Fig. 4b, interactive plot available at https://plot.ly/~eugenebayahmetov/5/), but not in the plot based on SNP markers (Fig. 4d, https://plot.ly/~eugenebayahmetov/7/).

Figure 5. FastSTRUCTURE results based on (a) SilicoDArT markers for K = 5 and (b) SNP markers for K = 4. The figure was created using an in-house R script in RStudio v.1.1.463, https://rstudio.com/products/rstudio/.

A fastSTRUCTURE analysis of the SilicoDArT markers revealed the most likely number of clusters at a K value of 5 (Fig. 5a). For the SNP markers, the 'best' K was inferred in fastSTRUCTURE as K = 4 (Fig. 5b). Both analyses defined S. breviflora, S. bungeana, and S. krylovii as distinct taxa, with the exception of the specimen ID0494394 (Fig. 5), which shares 73% of markers with S. bungeana and 27% probably with S. glareosa, suggesting that it is first-generation backcross progeny of these taxa (Fig. 5a). The last-mentioned taxon was not included in the analyses; however, it is common in the locality where the specimen ID0494394 was growing. In the case of the SNP markers, the specimen ID0494394 shares 75% of markers with S. bungeana, 19% with S. capillata/S. sareptana, and 6% with S. krylovii, suggesting possible hybridisation between these species followed by backcrossing with S. bungeana (Fig. 5b).
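The backcross interpretation of admixture proportions follows from simple expectations: an F1 hybrid is expected to carry about half of its markers from each parent, and each backcross to one parent halves the remaining share of the other. The following is a minimal sketch of that arithmetic in Python, not part of the original analysis. The function names are hypothetical, and the sketch assumes idealised Mendelian expectations; it ignores sampling error, marker ascertainment, and later-generation hybrids such as F2 individuals, which also average 50% per parent.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected genome fraction inherited from the recurrent parent.

    n_backcrosses = 0 corresponds to an F1 hybrid (0.5); every further
    backcross halves the remaining contribution of the other parent.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def closest_hybrid_class(q_major, max_backcrosses=3):
    """Return the hybrid class whose expected fraction is nearest to q_major."""
    expected = {
        ("F1" if n == 0 else f"BC{n}"): expected_recurrent_fraction(n)
        for n in range(max_backcrosses + 1)
    }
    return min(expected, key=lambda name: abs(expected[name] - q_major))


# Larger admixture share per specimen, as reported in the text.
reported = [
    ("ID0494394, SilicoDArT", 0.73),
    ("ID0494394, SNP", 0.75),
    ("S. x lazkovii, SilicoDArT", 0.55),
    ("S. x lazkovii, SNP", 0.50),
]
for label, q in reported:
    print(label, "->", closest_hybrid_class(q))
```

Under these assumptions, a share of 73-75% sits closest to the first backcross expectation (0.75), and a share of 50-55% sits closest to the F1 expectation (0.5). A formal assignment would require a dedicated method and confidence intervals.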
The fastSTRUCTURE analyses identified the S. × lazkovii samples as F1 hybrids between S. krylovii and S. bungeana, because they show admixture between these two clusters at proportions of 55% and 45% for the SilicoDArT markers (Fig. 5a), and 50% and 50% for the SNP markers (Fig. 5b). The fastSTRUCTURE output for the SilicoDArT markers shows no difference between S. capillata and S. sareptana, which cluster together (Fig. 5a), whereas the analysis of the SNPs shows admixture between the S. capillata/S. sareptana and S. krylovii clusters at proportions of 53% and 47%, respectively (Fig. 5b). The UPGMA cluster analyses revealed a clear division of samples into two major clades (Fig. 6). According to the clustering obtained with the SilicoDArT markers, the first clade is subdivided into four smaller clusters comprising samples of (1) S. × lazkovii; (2) S. krylovii; (3) S. sareptana; and (4) S. capillata (Fig. 6a). The first two taxa are genetically closely related to each other and distant from S. sareptana and S. capillata, which together form one sub-cluster. The second clade is composed of three clusters comprising samples of (1) S. breviflora; (2) S. bungeana; and (3) the sample ID0494394, which is genetically closer to S. bungeana than to S. breviflora. The UPGMA cluster analysis of the SNP markers divided the samples into the same number of clusters as were obtained for the SilicoDArT markers. However, in this case, specimens of S. × lazkovii are genetically closer to S. bungeana than to S. krylovii.

Figure 6. Unweighted Pair Group Method with Arithmetic Mean cluster analyses based on Jaccard's similarity coefficients generated from (a) SilicoDArT markers and (b) SNP markers. The figure was created using the R-package stats v.3.6.2, https://www.rdocumentation.org/packages/stats/versions/3.6.2/.
Genetic mapping onto chloroplast genomes of Stipa species and mitochondrial genomes of species from the Poaceae family (Supplementary Table S3) revealed 11 SilicoDArT markers assigned to chloroplast DNA and 27 loci assigned to mitochondrial DNA. The downstream neighbour-joining cluster analysis grouped the Stipa taxa into two main clades (Fig. 7). In the first clade three species could be defined: S. krylovii, S. capillata (bootstrap support 90%), and S. sareptana (bootstrap support 84%), with the exception of the specimen ID0494394, which is grouped with S. krylovii (Fig. 7). The second clade comprises a group of S. bungeana and S. × lazkovii (bootstrap support 79%), and the remaining S. breviflora specimens with a good bootstrap support of 87% (Fig. 7). All S. × lazkovii samples group alongside S. bungeana, and one S. bungeana specimen (ID0459867) is placed outside the main group of S. bungeana and S. × lazkovii with a bootstrap support of 79% (Fig. 7).

Figure 7. Neighbour-joining tree reconstructed based on the SilicoDArT markers derived from chloroplast and mitochondrial genomes. Bootstrap values > 50% obtained from 10,000 replicates are shown above the branches. The figure was created using FigTree v1.4.4, https://tree.bio.ed.ac.uk/software/figtree/.

Discussion

Although many interspecific hybrids have been described in the genus Stipa, so far only a single molecular investigation has been performed to verify the origin of one such species, Stipa × heptapotamica, which appeared to be a hybrid between genetically closely related species. The current study is the first report of hybridisation between two genetically distant Stipa species, S. krylovii and S. bungeana, at the edges of their distribution in the mountains of Central Asia (Fig. 1a).
Analyses of morphological variation resulted in a clear delimitation of the studied species (Fig. 2b). In particular, the main morphological characters (Table 1) show that S. capillata, S. sareptana, S. krylovii, S. bungeana, and S. × lazkovii, representing the section Leiostipa, are quite distant from S. breviflora, which has traditionally been affiliated to the section Barbatae. As expected, the hybrid specimens of S. × lazkovii were mostly characterised by morphological traits intermediate between the parental taxa S. krylovii and S. bungeana (Figs. 2 and 3, Supplementary Fig. S1). In addition, some OTUs of S. sareptana slightly overlapped with OTUs of S. krylovii and S. capillata. However, S. sareptana and S. krylovii are easy to distinguish based on the morphology of leaves (scabrous in S. sareptana and glabrous in S. krylovii) and the lemma apex (with a poorly developed ring of hairs in S. sareptana and a well-developed ring of hairs in S. krylovii). As for S. sareptana and S. capillata, these taxa can be delimited by characteristics of their vegetative leaves (scabrous in S. sareptana and glabrous in S. capillata) and characters of the lemma (hairs on the top in S. sareptana and glabrous in S. capillata). Both PCoA and fastSTRUCTURE analyses confirmed that S. × lazkovii is the F1 hybrid of S. krylovii and S. bungeana (Figs. 4 and 5). In addition, the neighbour-joining cluster analysis identified S. bungeana as the source of maternal DNA for all hybrid specimens, suggesting unidirectional hybridisation (Fig. 7). However, due to the small sample size, we cannot exclude either the opposite combination or interspecific gene flow through introgression in nature, especially since populations of both parental species are extremely large in this area of Lake Issyk-Kul. Although morphologically S. bungeana is considered a member of the section Leiostipa, molecular analyses demonstrated that it is quite distant from S. krylovii, S.
capillata, and S. sareptana from the same section (Figs. 4, 6 and 7). These findings support our previous molecular results for these taxa based on a nuclear region61,65. In addition, the distance-based clustering algorithms UPGMA and NJ revealed that S. bungeana is closer to S. breviflora than to the other Leiostipa taxa in the study (Figs. 6 and 7). This result calls for further investigation of S. bungeana to establish its proper taxonomic position in the genus Stipa. Analyses of molecular markers also revealed that the genetic relationships among some of the studied taxa are more complex than expected. Firstly, the sample ID0494394, which was morphologically somewhat similar to S. breviflora, appeared to be an introgressive hybrid that shares 73% of markers with S. bungeana and 27% most likely with S. glareosa (Fig. 5a). Here, we presume that hybridisation is occurring between S. bungeana and S. glareosa, because the introgressive hybrid was found on the north shore of Lake Issyk-Kul, where only three Stipa taxa were recorded (S. bungeana, S. breviflora, and S. glareosa). However, because S. glareosa was absent from the analyses, a new study focusing on hybridisation should be performed to verify whether gene flow is common among these taxa. Secondly, our research demonstrates discordance between the results of the fastSTRUCTURE and PCoA analyses on the one hand and the UPGMA and NJ analyses on the other. The first two show no or almost no difference between S. capillata and S. sareptana (Figs. 4 and 5). Nevertheless, the UPGMA and NJ dendrograms show that these taxa can be delimited genetically (Figs. 6 and 7), which supports our previous molecular investigations on these taxa61,65. However, in the current research S. capillata and S.
sareptana are grouped together, whereas based on the nuclear Intergenic Spacer (IGS) the latter taxon is closer to S. krylovii65. Due to the limited number of specimens analysed in the present and previous studies, we believe that a study with a larger sample size, combining genetics and traditional taxonomy, should be undertaken in order to better resolve the relationship between these species. Until now the DArTseq approach has been used mostly in commercially important plant species, and its application in genomic studies of wild species is still limited. Thus, the current study highlights the applicability of genome reduction approaches such as DArTseq to studies of natural hybridisation in wild species, and specifically in the grass genus Stipa. The high-density genome-wide genotyping-by-sequencing resulted in a total of 137,437 SilicoDArT and 125,850 SNP markers, of which 76,604 SilicoDArT and 19,133 SNP markers provided robust information on the Stipa genome in the absence of a reference sequence. This number of markers is several hundred-fold higher than was achieved in our previous study on natural hybridisation in Stipa. In particular, by using inter simple sequence repeat (ISSR) markers we were able to detect only 105 polymorphic bands for the S. × heptapotamica hybrid complex. In addition, dominant markers were used in several genomic studies in Stipa and resulted in 372 polymorphic ISSR bands for S. bungeana, 34 polymorphic ISSR bands for S. ucrainica and S. zalesskii, 212 polymorphic ISSR bands for S. tenacissima, 231 polymorphic random amplified polymorphic DNA (RAPD) bands for S. krylovii, 310 polymorphic RAPD bands for S. grandis, and 504 polymorphic sequence-related amplified polymorphism bands for S. bungeana. Thus, both SilicoDArT and SNP markers may be better suited to genetic diversity studies in Stipa.
Furthermore, the current study demonstrated the usefulness of SilicoDArT markers as a tool to detect chloroplast and mitochondrial loci, which may help to clarify the maternal inheritance of hybrid species.

Taxonomic treatment.

Stipa × lazkovii M. Nobis & A. Nowak, nothosp. nov. (Fig. 3f-j, Supplementary Figs S2 and S3).

TYPE: Kyrgyzstan, between Kongurlen and Kultor, 17 km SW from coast of Issyk-Kul, semidesert, N 42°5'47.07" / E 76°39'6.22", elev. 1940 m, wp. 930, 6 July 2017, M. Nobis, E. Klichowska, A. Wróbel, A. Nowak sn. (holotype KRA 495,093! (specimen in the middle part of the sheet); isotypes KRA 487,067!, 487,066!, 481,608!).

Diagnosis: Stipa × lazkovii differs from S. krylovii Roshev. by having a shorter anthecium (7.3-8.5 vs. 9.0-11.5 mm long), shorter callus (1.8-2.2 vs. 2.3-3.8 mm long), and shorter glumes (15-17 vs. 18-28 mm long), as well as by having long prickle-hairs below the top of the anthecium (Fig. 3). In having long prickle-hairs below the top of the anthecium, Stipa × lazkovii is also similar to S. bungeana; however, it differs from it by a longer anthecium (7.3-8.5 vs. 4.8-6.0 mm long), longer callus (over 1.8 vs. up to 1.3 mm long), longer glumes (over 15 vs. up to 15 mm long) and narrower leaves (0.5-0.6 vs. 0.6-1.0 mm wide).

Description: Plants perennial, densely tufted, with a few culms and numerous vegetative shoots; culms 35-55 cm tall, 3-noded, glabrous at and below the nodes. Leaves of vegetative shoots: sheaths glabrous, at margins ciliate; ligules truncate, up to 0.2 mm long, ciliate at margins; blades convolute, up to 25 cm long, 0.5-0.6(-0.7) mm in diameter, adaxial surface densely pubescent with up to 0.1 mm long hairs (prickles), abaxial surface glabrous, rarely very slightly scabrous. Cauline leaves: sheaths glabrous and with white edge, shorter than internodes; ligules 0.5-5 mm long, acute and glabrous; blades glabrous, up to 12 cm long.
Panicle up to 25 cm long, contracted, at base enclosed by sheath of uppermost leaf, branches erect, setulose, single or paired. Glumes subequal, 15-25 mm long, narrowly lanceolate, tapering into long hyaline apex. Anthecium 7.3-8.5 mm long and 0.7-0.9 mm wide. Callus 1.8-2.2 mm long, densely pilose on ventral and dorsal surfaces, callus base acute, cuneate, scar elliptic. Lemma pale green, on dorsal surface with abundant hooks and with 7 lines of ascending hairs, hairs up to 0.5 mm long, ventral line of hairs terminates at 1.3-1.7 mm below top of lemma and dorsal line terminates at 1.5-2.2 mm below top of lemma; top of lemma scabrous due to hooks and prickles and at apex with a ring of hairs up to 0.5 mm long. Palea equal to lemma in length. Awn 95-118 mm long, bigeniculate; lower segment of column 19-25 mm long, twisted, scabrous due to prickles and short hairs up to 0.15 mm long; upper segment of column 11.5-13 mm long, twisted, scabrous due to prickles and short hairs up to 0.2 mm long; seta flexuous, 65-80 mm long, hairs in the lower part of the seta 0.1-0.2 mm long, gradually decreasing in length towards apex. Anthers yellow, 4-5 mm long, glabrous.

Etymology: The name of the taxon honours prof. dr Georgy A. Lazkov (Academy of Sciences, Bishkek, Kyrgyzstan), the eminent botanist, taxonomist and expert on vascular plants of the Middle Asian mountains.

Other specimens studied (paratypes): Kyrgyzstan, western Tian-Shan, Kongurlen Valley, steppe grasslands near the road, 3 km E of Kongurlen settl., to the S of SW part of Issyk-Kul Lake, N 42°5'53.97" / E 76°38'37.28", elev. 1945 m, wp. 644, 10 July 2015, M. Nobis, A. Nowak sn. (KRA 476,871, 476,870!, 476,869!, WA!).

An identification key to central Asian species of Stipa that have scabrous awns or awns that are covered throughout by 0.1-0.3 mm long hairs is given in Supplementary S4.
Materials and methods

Plant material. Morphological examination is based on plant specimens deposited in the KRA herbarium (acronym following Thiers86). In total, 188 fully developed Stipa samples were studied under a light microscope SMZ800 (Nikon, Japan), including 40 specimens of S. krylovii, 40 of S. bungeana, 6 of S. × lazkovii, 22 of S. breviflora, 40 of S. capillata, and 40 of S. sareptana. For molecular analysis, we collected leaves of plants from localities where S. krylovii and S. bungeana grow together with their putative hybrid, as well as from areas where S. krylovii and S. bungeana grow separately from each other (Fig. 1b). Additionally, we included Stipa taxa that frequently occur in the area near Lake Issyk-Kul. In total, we selected 20 specimens of S. krylovii, 20 specimens of S. bungeana, 6 specimens of S. × lazkovii, 10 specimens of S. breviflora, 2 specimens of S. capillata, and 2 specimens of S. sareptana. Only one taxon, S. glareosa, is not represented in the study, because it was not found in the locality of S. × lazkovii. Moreover, S. glareosa belongs to the section Smirnovia41 and exhibits unique characters (e.g. long and pilose awns with a single geniculation) that were not observed in any other Stipa taxa in this region. All voucher specimens used in the molecular analysis are preserved at KRA (Supplementary Table S1). The names of plants were adopted from the WCSP87.

Macromorphological analyses. For the morphometric analyses, 188 specimens were used as operational taxonomic units (OTUs)88. As a first step, the Shapiro-Wilk test was used in the R-package MVN89 to assess the normality of the distribution of each character. The non-parametric Spearman's correlation coefficient was used in the R-package MVN to examine relations between the studied characters.
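To illustrate what the Spearman coefficient measures, the sketch below computes it from scratch in Python as the Pearson correlation of rank-transformed values, with tied values given the mean of their ranks. It is an illustration only and not the MVN implementation used in the study; the function names and the measurement values are hypothetical and were invented for the example.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Invented measurements (mm) for two characters on five specimens.
anthecium_length = [7.3, 8.1, 9.4, 10.2, 11.0]
callus_length = [1.8, 2.0, 2.6, 2.5, 3.4]
print(round(spearman_rho(anthecium_length, callus_length), 3))
```

Because the coefficient depends only on ranks, it makes no assumption of normality, which is why it suits characters that fail the Shapiro-Wilk test.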
The 22 most informative quantitative and three qualitative morphological characters commonly used in keys and taxonomic descriptions were chosen for the analyses (Table 1). A Factor Analysis of Mixed Data (FAMD)90 was performed in the R-package FactoMineR91 to characterise variation within and among groups of taxa without a priori taxonomic classification and to extract the variables that best identified them. The number of principal components included in the analysis was chosen based on the scree test92. The R-package factoextra93 was used to visualise the first two components, whereas the R-package plotly94 was chosen to illustrate the first three. Notch plots were created in the R-package ggplot295 to explore distributional relationships between each response variable and the studied taxa (Supplementary Fig. S1). The notched box plots display a confidence interval around the median, which is normally based on the median ± 1.57 × interquartile range / √n. According to this graphical method for data analysis, if the notches of two boxes do not overlap, there is "strong evidence" (95% confidence) that their medians differ. Additionally, to reveal significant differences between means of particular characters across all examined taxa, the non-parametric Kruskal-Wallis test was applied, followed by the Wilcoxon rank sum test for post hoc group comparisons. To address the multiplicity of comparisons, the Bonferroni method was applied to calculate corrected p-values.

Micromorphological examination. The lemma and lamina micromorphology of Stipa × lazkovii, S. krylovii, and S. bungeana were examined using scanning electron microscopy (SEM). The dried samples were coated with a gold layer using a Quorum Q150R S coater (Quorum, UK). The SEM images were obtained with an S-4700 scanning electron microscope (Hitachi, Japan).
Further, we examined the adaxial and abaxial surfaces of the lamina, and five sets of diagnostic characters of lemma micromorphology: (1) long cells, (2) silica bodies, (3) hooks, (4) prickles, and (5) macrohairs.

DNA extraction, amplification, and DArT sequencing. Isolation of genomic DNA was performed from dried leaf tissues using a Genomic Mini AX Plant Kit (A&A Biotechnology, Poland). Quality check, quantification and concentration adjustment for sequencing and genotyping were accomplished using a NanoDrop One (Thermo Scientific, USA) and agarose gel electrophoresis visualisation. The concentration of each sample was adjusted to 50 ng/µL. Purified DNA samples (1 µg for each sample) were sent to Diversity Arrays Technology Pty Ltd (Canberra, Australia) for sequencing and marker identification. DArTseq represents a combination of DArT complexity reduction methods and next generation sequencing platforms96-100. The technology is optimised for each organism and application in order to select the most appropriate complexity reduction method (both the size of the representation and the fraction of a genome selected for assays). Based on testing several enzyme combinations for complexity reduction, Diversity Arrays Technology Pty Ltd selected the PstI-MseI method for Stipa. DNA samples were processed in digestion/ligation reactions as described previously97, but replacing a single PstI-compatible adaptor with two different adaptors corresponding to two different Restriction Enzyme (RE) overhangs. The PstI-compatible adapter was designed to include the Illumina flowcell attachment sequence, a sequencing primer sequence and a "staggered", varying-length barcode region, similar to the sequence previously reported. The reverse adapter contained a flowcell attachment region and an MseI-compatible overhang sequence.
Only "mixed fragments" (PstI-MseI) were effectively amplified by PCR using an initial denaturation step of 94 °C for 1 min, followed by 30 cycles with the following temperature profile: denaturation at 94 °C for 20 s, annealing at 58 °C for 30 s and extension at 72 °C for 45 s, with an additional final extension at 72 °C for 7 min. After PCR, equimolar amounts of amplification products from each sample of the 96-well microtiter plate were bulked and applied to cBot (Illumina, USA) bridge PCR, followed by sequencing on a HiSeq 2500 (Illumina, USA). The sequencing (single read) was run for 77 cycles. Sequences generated from each lane were processed using proprietary DArT analytical pipelines. In the primary pipeline, the fastq files were first processed to filter away poor-quality sequences, applying more stringent selection criteria to the barcode region than to the rest of the sequence. In that way the assignments of the sequences to specific samples carried out in the "barcode split" step were very reliable. Approximately 2.5 million sequences per barcode/sample were identified and used in marker calling.

DArTseq data analysis. DArTseq produces two types of data: (1) co-dominant single nucleotide polymorphism (SNP) markers, and (2) dominant SilicoDArT markers that represent the presence or absence of restriction fragments. All molecular analyses of the DArTseq data sets (SNP and SilicoDArT) were performed after filtering steps in the R-package dartR101 with the following parameters: (1) a scoring reproducibility of 100%; (2) at least 95% of loci called (the respective DNA fragment had been identified (= called) in greater than 95% of all individuals); (3) monomorphic loci were removed; (4) SNPs that shared secondaries (had more than one sequence tag represented in the dataset) were randomly filtered out to keep only one random sequence tag.
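The four filtering criteria can be sketched in a few lines of Python. This is an illustration of the logic only, not the dartR code used in the study: the record layout (`tag`, `reproducibility`, `calls`), the function name and the toy data are hypothetical, missing calls are represented by `None`, and the call-rate threshold is applied as "at least 95%", which is one reading of the criterion stated above.

```python
import random


def filter_loci(loci, min_call_rate=0.95, min_reproducibility=1.0, seed=0):
    """Apply the four filters in order and return the retained loci.

    Each locus is a dict with keys:
      tag             - identifier of the sequence tag the marker sits on
      reproducibility - scoring reproducibility between 0 and 1
      calls           - one genotype call per individual, None if missing
    """
    rng = random.Random(seed)  # fixed seed keeps the random choice repeatable
    passed = []
    for locus in loci:
        called = [c for c in locus["calls"] if c is not None]
        if locus["reproducibility"] < min_reproducibility:
            continue  # (1) imperfect scoring reproducibility
        if len(called) / len(locus["calls"]) < min_call_rate:
            continue  # (2) too many individuals without a call
        if len(set(called)) < 2:
            continue  # (3) monomorphic across all called individuals
        passed.append(locus)

    # (4) keep one randomly chosen marker per sequence tag
    by_tag = {}
    for locus in passed:
        by_tag.setdefault(locus["tag"], []).append(locus)
    return [rng.choice(group) for group in by_tag.values()]


# Toy data for 20 individuals (invented for the example).
variable = [0, 1] * 10
toy = [
    {"tag": "t1", "reproducibility": 1.0, "calls": variable},
    {"tag": "t3", "reproducibility": 0.98, "calls": variable},
    {"tag": "t4", "reproducibility": 1.0, "calls": [None, None] + variable[2:]},
    {"tag": "t5", "reproducibility": 1.0, "calls": [1] * 20},
    {"tag": "t2", "reproducibility": 1.0, "calls": variable},
    {"tag": "t2", "reproducibility": 1.0, "calls": variable[::-1]},
]
kept = filter_loci(toy)
print(sorted(locus["tag"] for locus in kept))
```

In the toy data, one locus fails each of the first three filters, and the two markers sharing tag `t2` are reduced to one, so two loci remain.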
Three approaches were used to analyse the genetic structure of the studied taxa: (1) Principal Coordinates Analysis (PCoA), (2) fastSTRUCTURE analysis, and (3) Unweighted Pair Group Method with Arithmetic Mean (UPGMA). The PCoA analyses, based on Euclidean distance matrices, were performed using the R-package dartR and visualised using ggplot2 to show the first two components and plotly to illustrate the first three components. Genetic structure was then investigated using the fastSTRUCTURE software, which implements the Bayesian clustering algorithm STRUCTURE, assuming Hardy-Weinberg equilibrium between alleles, in a fast and resource-efficient manner102. A number of clusters (K-values) ranging from 2 to 10 were tested using the default convergence criterion of 10^-6 and default priors. The most likely K-value was estimated with the best choice function implemented in fastSTRUCTURE. Where a range of K values was returned, the true K was taken as a value between the estimates predicted by fastSTRUCTURE, based on what made most biological sense. The output matrices for the best K-values were reordered and plotted using an in-house R script in RStudio (Version 1.1.463)103. The threshold of 0.10 [...]

[...] S. krylovii 0470573 | to the S of SW part of Lake Issyk-Kul, 5 km of Kongurlen settl. | [...] 6'36.66" E 76°47'18.16" | 2040 m | 10.07.2015 | M. Nobis, A. Nowak
S. bungeana 0487058 | Kyrgyzstan, SW part of Lake Issyk-Kul, Bokonbayevo | N 42°7'6.49" E 77°0'55.91" | 1799 m | 01.08.2016 | M. Nobis, A. Nobis
S. krylovii 0469167; S. krylovii 0469168; S. breviflora 0469180; S. breviflora 0469186 | Kyrgyzstan, SW part of Lake Issyk-Kul, 15 km W of Bokonbayevo | N 42°8'10.30" E 76°48'11.84" | 1900 m | 01.08.2016 | M. Nobis, A. Nobis
S. krylovii 0496246; S. breviflora 0496247; S. breviflora 0496248 | Kyrgyzstan, SW part of Lake Issyk-Kul, ca. 15.5 km W of Bokonbayevo | N 42°8'15.86" E 76°48'18.82" | 1894 m | 04.07.2018 | M. Nobis, E. Klichowska, A. Wróbel, A. Nowak
S. krylovii 0469188; S. krylovii 0469195; S. krylovii 0469202; S. krylovii 0469194; S. krylovii 0469181; S. krylovii 0469189 | Kyrgyzstan, SW part of Lake Issyk-Kul, 20 km W of Bokonbayevo | N 42°8'26.50" E 76°45'25.55" | 2030 m | 01.08.2016 | M. Nobis, A. Nobis
S. capillata 0475125 | Kyrgyzstan, S part of Lake Issyk-Kul, ca. 20 km W of Barskoon | N 42°10'43.78" E 77°18'24.06" | 1612 m | 01.08.2016 | M. Nobis, A. Nobis
S. bungeana 0459867 | Kyrgyzstan, S part of Lake Issyk-Kul | N 42°11'47" E 77°39'00" | 1600 m | 16.06.2013 | M. Nobis, A. Nowak
S. bungeana 0459866; S. breviflora 0494393 | Kyrgyzstan, 15 km W of Lake Issyk-Kul, 5 km NE of E part of Orto-Tokoy Reservoir | N 42°21'26" E 76°03'26" | 1680 m | 16.06.2013 | M. Nobis, A. Nowak

Supplementary Table S2. Contribution (%) by dimension of each character in FAMD.

Character | Dim. 1 | Dim. 2 | Dim. 3
AL | 8.138459 | 0.25128967 | 0.343414
Col1L | 7.892034 | 0.01385704 | 0.141086
CL | 7.184994 | 0.52827477 | 0.549433
LG | 6.912585 | 1.38953931 | 0.750192
CvH | 5.994953 | 0.0076572 | 0.247512
HTTA | 5.13036 | 9.05892151 | 1.898838
Col2L | 5.095023 | 2.29357402 | 1.600668
CdH | 5.059659 | 2.39238771 | 0.03297
AW | 4.951688 | 1.11235111 | 0.018848
CBL | 4.626069 | 0.80414294 | 2.087042
SL | 4.512454 | 8.00889469 | 0.005742
WCol1 | 4.290667 | 0.83702145 | 10.24068
LigC | 4.10087 | 5.79251874 | 0.33842
CBW | 3.464594 | 0.3523899 | 8.816285
LHTA | 3.265194 | 10.8351853 | 3.145308
LigIV | 2.585415 | 10.0621745 | 0.231904
AdSVL | 2.48356 | 2.8608142 | 11.89666
AbSVL | 2.450702 | 0.0135254 | 0.29022
HLCol2 | 2.358535 | 0.1142349 | 24.57377
HLCol1 | 2.323692 | 0.06843165 | 24.27274
DVL | 2.246823 | 6.478324 | 3.44793
DDL | 1.845006 | 11.965472 | 1.271305
LHD | 1.18748 | 9.47148499 | 0.003277
WVS | 1.141917 | 9.95174283 | 3.469934
LHV | 0.757266 | 5.33579015 | 0.325828

Supplementary Table S3. Species names and GenBank accession numbers for the sequences used in this study.
| Taxon | Genetic compartment | Sequence length (bp) | GenBank accession number |
|---|---|---|---|
| Stipa richteriana | Chloroplast | 137,831 | MG052612.1 |
| Stipa lipskyi | Chloroplast | 137,854 | KT692644.1 |
| Stipa purpurea | Chloroplast | 137,370 | NC_029390.1 |
| Stipa ovczinnikovii | Chloroplast | 137,874 | NC_037034.1 |
| Stipa jagnobica | Chloroplast | 137,827 | NC_037029.1 |
| Stipa hohenackeriana | Chloroplast | 137,753 | NC_037028.1 |
| Stipa narynica | Chloroplast | 137,854 | NC_037032.1 |
| Stipa magnifica | Chloroplast | 137,848 | NC_037031.1 |
| Stipa arabica | Chloroplast | 137,757 | NC_037024.1 |
| Stipa orientalis | Chloroplast | 137,822 | NC_037033.1 |
| Stipa lessingiana | Chloroplast | 137,829 | NC_037030.1 |
| Stipa caucasica | Chloroplast | 137,798 | NC_037027.1 |
| Stipa capillata | Chloroplast | 137,830 | NC_037026.1 |
| Stipa borysthenica | Chloroplast | 137,825 | NC_037025.1 |
| Stipa zalesskii | Chloroplast | 137,836 | NC_037037.1 |
| Tripsacum dactyloides | Mitochondrion | 704,100 | NC_008362.1 |
| Hordeum vulgare | Mitochondrion | 525,599 | AP017300.1 |
| Zea mays | Mitochondrion | 569,630 | NC_007982.1 |
| Triticum aestivum | Mitochondrion | 452,526 | MH051716.1 |
| Eleusine indica | Mitochondrion | 520,691 | NC_040989.1 |
| Sorghum bicolor | Mitochondrion | 468,628 | NC_008360.1 |
| Oryza sativa | Mitochondrion | 637,692 | JF281153.1 |
| Aegilops speltoides | Mitochondrion | 476,091 | NC_022666.1 |
| Alloteropsis semialata | Mitochondrion | 442,063 | MH644808.1 |
| Lolium perenne | Mitochondrion | 678,580 | JX999996.1 |
| Saccharum officinarum | Mitochondrion | 300,784 | NC_031164.1 |
| Saccharum officinarum | Mitochondrion | 144,698 | LC107875.1 |

Supplementary S4. Identification key to central Asian species of Stipa that have scabrous awns or awns that are covered throughout by 0.1-0.3 mm long hairs (shorter than the diameter of the awn).

1. Glumes 9-15 mm long, callus 0.5-1.3 mm long, anthecium 5-7 mm long ..... 2
- Glumes 15-35 mm long, callus 1.5-4 mm long, anthecium 7-14 mm long ..... 3
2. Callus 1-1.3 mm long, awn with hairs up to 0.1 mm long, ligules of vegetative shoots 0.2-0.5 mm long, lemma with 7 lines of hairs, of which the dorsal one terminates below the half of the lemma length ..... S. bungeana Trin.
- Callus (0.5-)0.6-0.8(-1) mm long, awn with hairs 0.2-0.3 mm long, ligules of vegetative shoots up to 0.2 mm long, densely hairy on margins, lemma with indistinct lines of hairs or all-around pilose, hairs terminate in the upper half of the lemma length ..... S. richteriana subsp. jagnobica (Ovcz. & Czuk.) Tzvelev
3. Anthecium with well developed, dense ring of hairs at the apex ..... 4
- Anthecium without or with poorly developed ring of hairs at the apex ..... 8
4. Abaxial (lower) surface of blades of vegetative leaves usually glabrous, rarely slightly scabrous, adaxial (upper) surface densely covered with hairs up to 0.1 mm long ..... 5
- Abaxial surface of blades of vegetative leaves usually scabrous or glabrous, adaxial surface densely covered with hairs 0.2-0.5 mm long, or a mixture of shorter and longer hairs ..... 7
5. Ligules of vegetative leaves 0.3-2 mm long, callus 1.5-2 mm long ..... S. margelanica P. A. Smirn.
- Ligules of vegetative leaves up to 0.2 mm long ..... 6
6. Callus (2.1-)2.3-3.8(-4.1) mm long, anthecium 9.0-11.5 mm long, glumes 18-28 mm long ..... S. krylovii Roshev.
- Callus 1.8-2.2 mm long, anthecium 7.3-8.5 mm, glumes 15-17 mm ..... S. × lazkovii M. Nobis & A. Nowak
7. Column 5-9(-10) cm long, anthecium 14-16.5(-17.5) mm and awn (18-)22-27(-30) cm, middle strip of hairs on the lemma extends up to its lower 1/4-1/2 length, abaxial surface of leaves glabrous and smooth ..... S. grandis P. A. Smirn.
- Column less than 3-4.5 cm long, anthecium 13-14 mm, awn 13-17 cm, middle strip of hairs on the lemma extends above its 1/2 length, abaxial surface of leaves scabrous or glabrous ..... S. baicalensis Roshev.
8. Lemma with poorly developed ring of hairs at the apex, abaxial surface of leaves of vegetative shoots scabrous due to prickles and spinules and adaxial covered with a mix of short and long hairs (long hairs present only on marginal ribs) ..... S. sareptana A. K. Becker
- Lemma glabrous at the apex or rarely with scattered hairs and/or prickles near the lemma margins, abaxial surface of leaves of vegetative shoots scabrous to almost glabrous and adaxial densely covered with 0.2-0.5 mm long hairs (rarely, long hairs are present only on marginal ribs) ..... 9
9. Anthecium 7-11 mm long, awn 7-11 cm long, internodes usually longer than culm sheaths ..... S. karakabinica Kotukhov
- Anthecium (10-)11-13(-14) mm long, awn (10-)12-22 cm long, internodes usually shorter than culm sheaths ..... S. capillata L.

Supplementary Figure S1. Notched boxplot demonstrating the mean (white circle), the median (dark black line), 95% confidence interval around the median (notch), inter-quartile ranges (25% to 75%), whiskers (5% and 95%), and minimum and maximum measurements (crosses) of quantitative characters (a-v) for the studied species. Statistical significance was tested by the Wilcoxon rank-sum test for post hoc group comparisons with Bonferroni correction; p < 0.001, p < 0.01, p < 0.05, p < 0.1 and p < 1 are noted as ***, **, *, . and no symbol, respectively. Each dot represents an observation.

Chapter 2: The first draft genome of feather grasses using SMRT sequencing and its implications in molecular studies of Stipa

The chapter includes the second published article, which presents the first draft genome of feather grasses. The novelty of the study consists of applying a third-generation sequencing technology (the PacBio platform) to assemble nuclear, chloroplast and mitochondrial genomes and to provide new genetic data for further works on phylogeny, hybridisation and population studies within Stipa and the grass family Poaceae. Importantly, the new data helped to investigate the evolutionary history of the genus. Previously, the earliest macrofossil assigned to feather grasses, or a closely related genus, was dated to about 34 Mya (MacGinitie, 1953; Manchester, 2001). Nonetheless, recent molecular studies have demonstrated that the origin of feather grasses could be relatively recent in evolutionary terms (Romaschenko et al., 2014; Schubert et al., 2019). For instance, it was shown that the Eurasian Stipeae lineage has an estimated age of 15.78 (6.30-26.60) Mya, while the American Stipeae lineage is around 5.62 (0-6.50) Mya (Romaschenko et al., 2014).
Article 2

Title: The first draft genome of feather grasses using SMRT sequencing and its implications in molecular studies of Stipa
Journal: Scientific Reports, volume 11, Article number: 15345 (2021)
2-year impact factor: 4.379
5-year impact factor: 5.133
The Ministry of Science and Higher Education of Poland: 140 points
DOI: https://doi.org/10.1038/s41598-021-94068-w

The first draft genome of feather grasses using SMRT sequencing and its implications in molecular studies of Stipa

Evgenii Baiakhmetov1,2, Cervin Guyomar3,4, Ekaterina Shelest3,5, Marcin Nobis1,2 & Polina D. Gudkova2,6

1 Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa 3, 30-387 Kraków, Poland.
2 Research Laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave., Tomsk 634050, Russia.
3 German Centre for Integrative Biodiversity Research (iDiv), Puschstrasse 4, 04103 Leipzig, Germany.
4 Institute for Genetics, Environment and Plant Protection (IGEPP), Agrocampus Ouest, INRAE, University of Rennes 1, 35650 Le Rheu, France.
5 Centre for Enzyme Innovation, University of Portsmouth, Portsmouth PO1 2UP, UK.
6 Department of Biology, Altai State University, Lenin 61 Ave., Barnaul, Russia 656049.
email: evgenii.baiakhmetov@doctoral.uj.edu.pl; m.nobis@uj.edu.pl

The Eurasian plant Stipa capillata is the most widespread species within feather grasses. Many taxa of the genus are dominants in steppe plant communities and can be used for their classification and in studies related to climate change. Moreover, some species are of economic importance, mainly as fodder plants, and can be used for soil remediation processes. Although large-scale molecular data has begun to appear, there is still no complete or draft genome for any Stipa species. Thus, here we present a single-molecule long-read sequencing dataset generated using the Pacific Biosciences Sequel System. A draft genome of about 1004 Mb was obtained with a contig N50 length of 351 kb. Importantly, here we report 81,224 annotated protein-coding genes, present 77,614 perfect and 58 unique imperfect SSRs, reveal the putative allopolyploid nature of S. capillata, investigate the evolutionary history of the genus, demonstrate structural heteroplasmy of the chloroplast genome and announce for the first time the mitochondrial genome in Stipa. The assembled nuclear, mitochondrial and chloroplast genomes provide a significant source of genetic data for further works on phylogeny, hybridisation and population studies within Stipa and the grass family Poaceae.

In the year 2000, the Arabidopsis thaliana L. genome became the first plant genome to be completely sequenced and assembled [1]. Since then, many genomes from the plant kingdom have been sequenced, e.g. green algae [2,3], bryophytes [4,5], ferns [6], gymnosperms [7,8] and angiosperms [9,10]. In the grass family (Poaceae) the reference assemblies were primarily obtained for crops [11-13] and model plants [14-16]. The advent of second-generation sequencing and the subsequent decrease in overall sequencing costs have enabled the determination of whole genome sequences in many non-model plant species [17-20]. Recently, the 1KP project, which aimed to sequence 1,000 green plant transcriptomes [21-23], has been followed by the 10KP project [24]. The latter initiative intends to sequence complete genomes from more than 10,000 plants and protists. The project is supposed to be completed in 2023 and is expected to provide family-level high-quality reference genomes, ideally with chromosome-scale assemblies. Nevertheless, the data at the level of genera may not be processed immediately [24]. In comparison to other kingdoms, plants have very large genomes [13,25,26], high ploidy levels [27] and an abundance of repetitive sequences [28-30]. Currently, to face these issues, third-generation sequencing has been applied. The so-called single-molecule real-time (SMRT) sequencing provided by Pacific Biosciences (PacBio) [31] and nanopore sequencing by Oxford Nanopore Technologies [32] afford a range of benefits, including exceptionally long read lengths (20 kb or more), resolution of extremely repetitive and GC-rich regions and direct variant phasing [32,33].

In the fossil record Stipa L., or a closely related genus, is known from about 34 Mya of the upper Eocene [34,35]. For many decades, Stipa has been described as a genus with over 300 species common in steppe zones of Eurasia, North Africa, Australia and the Americas [36,37]. According to recent studies based on both morphological and molecular data, the genus has been reduced and currently includes over 150 species geographically confined to Europe, Asia and North Africa [38-42]. Most species of Stipa are dominants and/or subdominants in steppe plant communities [43-45] and can be used for their classification [46]. Moreover, some species are of economic importance, mainly as pasture and fodder plants, especially in the early phases of vegetation [36,47]; they can be used for soil remediation processes [48,49], in studies related to climate change [50-52] and as ornamental plants (e.g. S. capillata L., S. pulcherrima K. Koch, S. pennata L.). In recent years, large-scale molecular data began to appear for Stipa: de novo transcriptome assemblies of S. purpurea Griseb. [50,53], S. grandis P. A. Smirn. [54] and S. lagascae Roem. & Schult. [52], whole chloroplast genomes for 19 taxa [57] and raw genomic data available via the NCBI Sequence Read Archive (SRA) for S. capillata [58] and S. breviflora Griseb. [59].

Figure 1. A representative individual of Stipa capillata.
In addition, nucleolar organising regions (NORs) were sequenced for six Stipa taxa [60]. Nevertheless, no complete or draft genome assembly currently exists for any Stipa species. In order to fill this gap, here we aim to: (1) present for the first time a single-molecule long-read dataset (nuclear, mitochondrial and chloroplast genomes) generated using SMRT sequencing on the PacBio Sequel platform; (2) demonstrate and discuss the potential usage of this data in further studies of Stipa. For the goals of the study we chose to sequence the entire genome of S. capillata (Fig. 1) as it is the most widespread taxon within the genus, growing on sandy to loamy, nutrient-poor soils in the dry grasslands of Eurasia [61]. Currently, this species is increasingly attracting the interest of conservation biologists due to its large distribution range, common occurrence in the Eurasian steppes and pseudosteppes, a limited number of refugia in Europe and both great morphological and genetic variability within its range [62-64].

Results

Assembled nuclear genome. The SMRT sequencing yielded 23.16-fold genome coverage consisting of 25.84 Gb of sequence data with an N50 read length of 17,096 bp (Supplementary Table S1). De novo assembly of PacBio reads using Flye v.2.4 [65,66] resulted in a genome size of 1,004 Mb [67] with a contig N50 of 351 kb and a GC level of 45.97%. On the other hand, another de novo assembly performed with FALCON v.0.2.5 [68] demonstrated a smaller genome size of 773 Mb with a GC level of 46.04%. However, the Flye assembly has a better N50 of 350,543 bp, which is almost three times larger than for FALCON. After applying Purge Haplotigs v1.1.1 [69], the final genome size was reduced by 177 Mb, with an N50 of 381,155 bp (Table 1) and a GC level of 45.82%.
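Since the assemblies above are compared mainly through their N50 values, it may help to recall how that statistic is obtained. The following is a minimal Python sketch under the usual definition of N50; the function name `n50` and the toy contig lengths are illustrative assumptions and do not reproduce the tool that generated Table 1.

```python
def n50(lengths):
    """Return (N50, L50) for a collection of contig lengths.

    Contigs are sorted from longest to shortest and summed until the
    running total reaches half of the assembly size. N50 is the length
    of the contig at which that happens; L50 is the number of contigs
    needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contig lengths supplied")


# Toy example with made-up contig lengths (not data from the study)
toy_contigs = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(toy_contigs))
```

For the toy input the total length is 32, so the two longest contigs already cover half of it. The row "Number of sequences with N50" in Table 1 appears to correspond to the second value returned here, although the text does not define it explicitly.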
| Metrics | Flye assembly | FALCON assembly | Flye assembly after Purge Haplotigs |
|---|---|---|---|
| Length of assembly, bases | 1,003,531,354 | 773,312,558 | 826,891,869 |
| Number of sequences | 5,931 | 885 | 3,683 |
| Largest length of a sequence, bases | 2,321,367 | 590,564 | 2,321,367 |
| Average length of sequences, bases | 169,201 | 88,015 | 224,516 |
| N50, bases | 350,543 | 119,836 | 381,155 |
| Number of sequences with N50 | 837 | 2,061 | 640 |
| N100, bases | 1,001 | 20,078 | 1,014 |

Table 1. Statistics of the nuclear genome assemblies.

| Metrics | Flye assembly | FALCON assembly | Flye assembly after Purge Haplotigs |
|---|---|---|---|
| Complete BUSCOs | 4,557 (93.10%) | 2,765 (56.50%) | 4,304 (87.90%) |
| Complete and single-copy BUSCOs | 2,383 (48.70%) | 2,408 (49.20%) | 2,916 (59.60%) |
| Complete and duplicated BUSCOs | 2,174 (44.40%) | 357 (7.30%) | 1,388 (28.30%) |
| Fragmented BUSCOs | 46 (0.90%) | 186 (3.80%) | 80 (1.60%) |
| Missing BUSCOs | 293 (6%) | 1,945 (39.70%) | 512 (10.50%) |
| Total BUSCO groups searched | 4,896 (100%) | 4,896 (100%) | 4,896 (100%) |

Table 2. BUSCO statistics.

| Species | Number of chromosomes (n) | Number and total length of contigs assigned to the reference | Number and total length of non-assigned contigs |
|---|---|---|---|
| B. distachyon [70] | 5 | 4,061 (950.13 Mb), 94.68% | 1,871 (53.40 Mb), 5.32% |
| H. vulgare [71] | 7 | 4,036 (945.36 Mb), 94.20% | 1,896 (58.17 Mb), 5.80% |
| A. tauschii [72] | 7 | 4,161 (954.95 Mb), 95.16% | 1,771 (48.59 Mb), 4.84% |
| O. sativa [73] | 12 | 3,477 (902.39 Mb), 89.92% | 2,455 (101.15 Mb), 10.08% |
| T. aestivum [74] | 21 | 2,434 (418.14 Mb), 41.67% | 3,498 (585.40 Mb), 58.33% |

Table 3. RaGOO statistics.

The subsequent analysis based on a benchmark of 4,896 conserved genes belonging to the Poales order (dataset poales_odb10) revealed that the Flye assembly has 4,557 (93.10%) complete BUSCO (Benchmarking Universal Single-Copy Orthologs) genes and only 293 (6%) missing BUSCOs versus 2,765 (56.50%) and 1,945 (39.70%) for the FALCON assembly.
The Flye assembly after Purge Haplotigs shows 4,304 (87.90%) complete BUSCOs and 512 (10.50%) missing BUSCOs (Table 2).

Scaffolding of contigs. Nearly all contigs of the S. capillata genome can be assigned to the reference chromosomes of Brachypodium distachyon L., Hordeum vulgare L. and Aegilops tauschii Coss., whereas the genomes of Oryza sativa L. and especially Triticum aestivum L. have much less homology to the feather grass assembly. In particular, contigs covering 95.16% of the S. capillata assembly length were assigned to the seven chromosomes of the A. tauschii genome, 94.68% to the five chromosomes of B. distachyon, 94.20% to the seven chromosomes of H. vulgare, 89.92% to the 12 chromosomes of O. sativa and only 41.67% to the 21 chromosomes of T. aestivum. The total length of non-assigned contigs was reasonably low for A. tauschii (48.59 Mb), B. distachyon (53.40 Mb) and H. vulgare (58.17 Mb), whereas for O. sativa and T. aestivum it was about 101.15 Mb and 585.40 Mb, respectively (Table 3). In addition, the RaGOO grouping confidence and orientation confidence scores per chromosome ranged from 57.81 to 76.11% and from 80.03 to 95.11%, respectively, indicating that the contigs could be placed on a chromosome with an acceptable level of confidence (Supplementary Table S2). The only exception is T. aestivum, for which scores ranged from 30.49 to 47.76% for the grouping confidence score and from 57.81 to 70.19% for the orientation confidence score. Nevertheless, based on the location confidence score, the exact position of the contigs on a chromosome could not be accurately estimated, reflecting a low level of synteny to the reference genomes.
In particular, the score was in a range of 31.30-43.66% for O. sativa, 26.06-39.13% for B. distachyon, 19.56-31.41% for H. vulgare, 17.47-24.15% for A. tauschii and 10.30-38.23% for T. aestivum.

| Type of repeats | Number of elements | Total (bp) | % of genome |
|---|---|---|---|
| Class I: Retrotransposon | 123,524 | 161,756,598 | 16.12 |
| SINEs | 6,211 | 2,422,254 | 0.24 |
| LINEs | 26,453 | 19,189,619 | 1.91 |
| LTR elements | 90,860 | 140,144,725 | 13.97 |
| Class II: DNA-transposon | 99,245 | 72,448,468 | 7.22 |
| Hobo-Activator | 6,824 | 3,826,368 | 0.38 |
| Tc1-IS630-Pogo | 619 | 500,988 | 0.05 |
| PiggyBac | 1 | 75 | 0.00 |
| Tourist/Harbinger | 11,326 | 3,980,231 | 0.40 |
| Other | 2 | 113 | 0.00 |
| Unclassified | 758,908 | 344,622,074 | 34.34 |
| Total repeats | 981,677 | 578,827,140 | 57.68 |
| Rolling-circles | 3,306 | 2,797,158 | 0.28 |
| Low complexity | 18,762 | 1,145,428 | 0.11 |
| Simple repeats | 114,826 | 5,716,291 | 0.57 |

Table 4. Statistics of repetitive elements.

Transposable elements and nuclear genome annotation. Identification of transposable elements (TEs) revealed that more than half of the S. capillata genome (57.68%) is occupied by repetitive sequences. In particular, retrotransposons represent at least 16.12% and DNA transposons at least 7.22% of the genome. Nonetheless, repeats covering 34.34% of the genome are currently unclassified. Among classified repeats, long terminal repeats (LTRs) were the most abundant elements within retrotransposons, whereas Tourist/Harbinger elements were the most common among DNA transposons. In total, 114,826 sequences were identified as simple repeats and occupy 0.57% of the genome. In addition, rolling-circles (0.28% of the genome) and low complexity sequences (0.11% of the genome) were found (Table 4). The subsequent structural annotation of the masked genome revealed 53,535 nuclear genes (Supplementary File 1).
On the other hand, the unmasked genome has 154,755 structurally annotated genes, and 94,237 of them have BLAST hits in the NCBI non-redundant database. Nonetheless, among the 94,237 genes of the unmasked genome, 12,094 sequences are related to transposable elements. In particular, 2,925 genes were associated with transposons and 9,859 were assigned to retrotransposons. In addition, 229 genes encode transposase-related proteins. Thus, excluding transposable elements, the unmasked genome has 81,224 genes that can be associated with already known proteins (Supplementary File 2).

SSR markers. In total, 77,614 perfect repeat motifs were identified for the nuclear genome assembly using Krait [75] (Supplementary File 3). Within those, di- and tri-nucleotides were the most common types, accounting for 28,365 (36.55%) and 25,794 (33.23%) repeats, respectively. Tetra-nucleotide motifs were the third most abundant repeats with 9,777 SSRs (12.60%), followed by mono-nucleotides with 6,572 SSRs (8.47%) and penta-nucleotides with 4,629 SSRs (5.96%). Hexa-nucleotides were the rarest motifs with 2,477 SSRs (3.19%). Only four mono-nucleotide, four di-nucleotide and three tetra-nucleotide motifs were found in the mitochondrial and chloroplast genomes. However, the total length of those SSRs was in a range of 12-16 bp. In addition, a total of 58 unique repeats present only in a single copy, in a range of 101-325 bp, were retrieved from the analysis of TEs. Within those were four hexa-, 35 hepta-, nine octa-, five nona- and five deca-nucleotide motifs (Supplementary Table S3).

Divergence time of Stipa. The Bayesian phylogenetic reconstruction based on the five loci within NORs revealed the divergence time of Stipa from Brachypodium around 30.00-35.52 Mya and the putative origin of feather grasses about 2.90-6.02 Mya (Fig. 2).
Although not all branches were well supported within the genus, the current analysis confirmed the monophyly of Stipa and the general grouping of the analysed species according to their taxonomic positions. In particular, S. capillata and S. grandis represent the section Leiostipa Dumort; S. magnifica Junge, S. narynica Nobis, S. lipskyi Roshev. and S. caucasica Schmalh. belong to the section Smirnovia Tzvelev. The remaining three groups, comprising (1) S. orientalis Trin. and S. pennata L., (2) S. richteriana Kar. & Kir., S. lessingiana Trin. & Rupr., S. heptapotamica Golosk. and S. korshinskyi Roshev. and (3) S. lagascae and S. breviflora, currently show a discrepancy between morphological and molecular data. In addition, the divergence time estimation indicates that the potential origin of the clade comprising S. capillata and S. grandis is in a range of 0.67-2.93 Mya, while the sister clade has a 95% credibility interval for that parameter of 2.38-4.78 Mya. Furthermore, the lowest genetic divergence time was registered for S. lessingiana and S. richteriana (0.00-0.48 Mya) as well as for the split between S. heptapotamica and the two above-mentioned species (0.01-0.78 Mya). The divergence times for the rest of the taxa are presented in Table 5.

Figure 2. Phylogeny and divergence time estimation by molecular clock analysis. Letters at each node refer to Table 5. Numbers in brackets represent the Bayesian posterior probabilities (BPP > 0.50 only). The blue rectangles on the nodes indicate the 95% credibility intervals (CI) of the estimated posterior distributions of the divergence times. The red circles indicate the presumed divergence time splits set as a reference. The scale on the bottom shows divergence time in Mya. The figure was created using FigTree v1.4.4, https://tree.bio.ed.ac.uk/software/figtree/.
| Node | Node age (Mya) | BPP | 95% CI |
|---|---|---|---|
| A | 48.59 | 1.00 | 44.53-52.78 |
| B | 32.77 | 1.00 | 30.00-35.52 |
| C | 4.39 | 1.00 | 2.90-6.02 |
| D | 3.55 | 0.40 | 2.38-4.78 |
| E | 3.02 | 0.28 | 1.95-4.14 |
| F | 2.26 | 0.40 | 1.21-3.40 |
| G | 2.15 | 0.63 | 1.05-3.32 |
| H | 2.04 | 1.00 | 1.15-3.02 |
| I | 1.77 | 0.85 | 0.76-2.87 |
| J | 1.73 | 1.00 | 0.67-2.93 |
| K | 1.56 | 0.96 | 0.81-2.38 |
| L | 0.91 | 0.67 | 0.28-1.60 |
| M | 0.71 | 1.00 | 0.11-1.46 |
| N | 0.33 | 0.39 | 0.01-0.78 |
| O | 0.16 | 0.28 | 0.00-0.48 |

Table 5. Node ages, BPP and CI related to Fig. 2.

Figure 3. Visualisation of the de novo mitochondrial and chloroplast genome assemblies using Bandage v.0.8.1 [85]. (a) Contigs representing the mitochondrion. (b) Contig representing the chloroplast. Different colours represent different contigs; length (in bp) and coverage (x) of edges within contigs are shown. The figure was created using Bandage v.0.8.1, https://rrwick.github.io/Bandage/.

Assembled mitochondrial and chloroplast genomes. The resulting Flye assembly contained four mitochondrial contigs with a total length of 438,037 bp [76-79] represented by six edges, and an entire 137,832 bp-long circular chloroplast genome combining a long single copy region (LSC) of 81,710 bp, a short single copy region (SSC) of 12,836 bp and two inverted repeats (IR) of 21,643 bp each (Fig. 3). However, after a manual check in IGV v.2.8.6 [80], the final size of the chloroplast genome was slightly reduced to 137,823 bp. In addition, an analysis using Cp-hap [81] detected two structural haplotypes of the chloroplast genome: haplotype A [82] (LSC-IRrc-SSCrc-IR, where rc denotes reverse-complement) and haplotype B [83] (LSC-IRrc-SSC-IR). We also obtained one assembly using Unicycler v.0.4.8 [84], which resulted in 76 linear contigs, of which 29 can be assigned to mitochondrial sequences with a total length of 1,668,569 bp.
Because the Unicycler assembly was more complex and none of the obtained contigs was likely to be circular, we used the Flye assembly for the downstream genome annotation. In total, 112 and 133 genes were functionally annotated for the mitochondrial and chloroplast genomes, respectively. The mitochondrial annotation resulted in 78 protein-coding genes, 4 ribosomal RNA genes and 30 tRNA genes. The chloroplast annotation contained 85 protein-coding genes, 8 ribosomal RNA genes and 40 tRNA genes. The chloroplast genome size of 137,823 bp generated with Flye and the number of annotated genes in the current study were similar to the known assemblies for S. capillata obtained by Illumina sequencing. However, the previous genome assemblies were slightly longer, specifically 137,830 bp [86] and 137,835 bp [87].

DArTseq markers. The DArT pipeline analysis resulted in 61,328 Silico markers and 52,970 sequences with SNPs. The BLAST process revealed 58,701 Silico markers and 52,252 sequences with SNPs that were successfully mapped to 4,361 and 3,935 genome contigs, respectively. Thus, the current genome assembly has 95.72% of Silico markers and 98.64% of sequences with SNPs that are represented in 73.52% (total length of 969.30 Mb) and 66.34% (940.37 Mb) of the contigs, respectively. In addition, we established that 50,953 Silico markers and 47,181 sequences with SNPs were present only in a single copy in the genome. Finally, we identified 30 Silico markers and 10 sequences with SNPs aligned to the mitochondrial genome and only 2 Silico markers and 4 sequences with SNPs that were found in the chloroplast genome.

Discussion

The number of sequenced plant genomes is rapidly increasing year by year, serving as a fundamental resource for various genomic studies. In the current work, we present a 1004 Mb genome with 23x coverage of the most widespread feather grass species, S. capillata, using SMRT PacBio sequencing.
The current assembly comprises 5,931 sequences with a contig N50 length of 351 kb (Table 1). The BUSCO completeness score of 93.10% (Table 2), the observation of a large portion of TEs (57.68%, Table 4) and the presence of Silico (95.72%) and SNP (98.64%) markers derived from the DArT platform indicate that the assembly is of high quality. Moreover, the proportion of TEs is reported here for the first time in the genus, because the previous de novo assemblies were based exclusively on transcriptomic data [50,52-54]. In addition, here we also attempted to perform a reference-guided scaffolding of the assembled contigs. Nevertheless, although nearly all contigs of the S. capillata genome were assigned to the chromosomes of B. distachyon, H. vulgare and A. tauschii, it was not possible to estimate their proper position on the reference with an acceptable level of confidence (Table 3 and Supplementary Table S2). In general, in the absence of a high-density genetic linkage map, the task of reconstructing pseudomolecules of chromosomes seems to be challenging. On the other hand, we believe that, in order to improve the contiguity of the long-read assembly, the high-throughput chromosome conformation capture (Hi-C) [88] technique should be applied. Currently, many studies on non-model species have successfully utilised a combination of long-read techniques and Hi-C data to perform assemblies at chromosome scale [89]. Moreover, an additional key to improving this genome assembly in the future is simply to obtain more sequencing reads. Recently, it was shown that contig length metrics are positively correlated with both read length and sequence coverage.
Specifically, long-read assemblies in maize demonstrated that the highest contig N50 of 24.54 Mb was reached with a subread N50 of 21,166 bp and a 75-fold depth of coverage, while the longest contig of 79.68 Mb was observed with the same subread N50 but with a 60-fold depth [92]. The newly generated genome has a GC content of 45.97%, which is similar to the known estimates for species in Stipa, varying in a range of 46.61-49.05% [93], and more broadly to grasses, ranging from 43.57% in O. sativa to 46.90% in Z. mays [94]. Recently, it was shown that a higher GC content in monocots is associated with adaptation to extremely cold and/or dry climates [95]. The genus Stipa strongly supports this hypothesis, because all feather grasses are adapted to temperate, dry climates. In addition, a positive correlation between the GC content and genome size was established [96], suggesting insertion of LTR retrotransposons as a potential driving force of genome enlargement [97]. Similarly, here we showed that the expansion of the S. capillata genome also resulted from insertions of repetitive sequences that occupy 57.68% of the genome, including LTR retrotransposons (13.97%). However, unclassified repeats currently account for around 34.34% of the genome (Table 4). Nonetheless, the total proportion of TEs in S. capillata in comparison to other species within the Poaceae family is close to that of Oryza minuta J. Presl (58.35%) and O. alta Swallen [98] (57.54%), larger than in B. distachyon [99] (28.10%) and O. sativa [100] (45.52%), and smaller than in O. granulata Nees & Arn. [101] (67.96%), Avena sativa L. [102] (69.47%) and T. aestivum [103] (84.67%). Importantly, the presented genome size is roughly half the expected size of 2,355 Mb and roughly twice the expected monoploid size of 589 Mb estimated using flow cytometry [93].
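The size comparison in the preceding sentence can be checked with simple arithmetic. The short Python sketch below uses only figures quoted in the text and in Table 1; the variable names are illustrative assumptions, and the ratios are rounded to two decimals for display.

```python
# Sizes in Mb, taken from the text and Table 1
ASSEMBLY_MB = 1003.53    # length of the Flye assembly
EXPECTED_MB = 2355.0     # expected genome size from flow cytometry
MONOPLOID_MB = 589.0     # expected monoploid genome size

# How many times the expected genome exceeds the assembly
expected_vs_assembly = EXPECTED_MB / ASSEMBLY_MB
# How many monoploid genome equivalents the assembly represents
assembly_vs_monoploid = ASSEMBLY_MB / MONOPLOID_MB
# How many monoploid genome equivalents the expected genome represents
expected_vs_monoploid = EXPECTED_MB / MONOPLOID_MB

print(round(expected_vs_assembly, 2))
print(round(assembly_vs_monoploid, 2))
print(round(expected_vs_monoploid, 2))
```

Under these figures the assembly corresponds to somewhat less than two monoploid genome equivalents, while the expected genome size corresponds to about four, which is the arithmetic background for the "roughly half" and "roughly twice" statements.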
Considering that we were unable to remove redundant sequences due to possible heterozygosity, and given the number of duplicated BUSCOs (Tables 1 and 2), it may be presumed that the current genome assembly combines two very distinct genomes. To our current knowledge, the vast majority of Stipa species have 44 chromosomes (2n = 4x = 44) and are presumed to be tetraploids⁴¹,¹⁰⁴. In addition, it was recently shown that a single-copy region ACC1 and a low-copy nuclear gene At103 have two different copies in Stipa¹⁰⁴,¹⁰⁵. This suggests that S. capillata, and the genus Stipa in general, may have arisen through hybridisation between genetically distant diploid species (2n = 22) and subsequent allopolyploidisation via whole genome duplication (WGD), rather than via a single WGD event in an ancestral species. Well-documented examples of natural allopolyploid taxa in the Pooideae subfamily are Triticum turgidum L. (2n = 4x = 28, genome constitution AABB) and T. aestivum (2n = 6x = 42, AABBDD), formed through hybridisation and successive chromosome doubling of the ancestral diploid species T. urartu (2n = 2x = 14, AA), Aegilops speltoides Tausch (2n = 2x = 14, BB) and A. tauschii (2n = 2x = 14, DD)¹⁰⁶. Moreover, in the tribe Stipeae, allopolyploidy was reported for the genus Patis Ohwi (2n = 46, 48) based on the At103 gene¹⁰⁵. To date, at least three hypotheses have been considered regarding the base chromosome number in Stipeae: x = 7¹⁰⁷, x = 11¹⁰⁸,¹⁰⁹ and x = 12¹¹⁰. Recently, it was suggested that the latter two are more plausible⁴¹,¹⁰⁴. Thus, in order to better assemble the S. capillata genome and to verify whether Stipa is an allopolyploid genus, we suggest sequencing at chromosome level closely related diploid species (2n = 22) from genera such as Ptilagrostis Griseb., Achnatherum P. Beauv. (e.g. A. calamagrostis L., 2n = 22 + 0–2B) or Piptatheropsis Romasch., P. M. Peterson & Soreng (2n = 20, 22, 24)⁴¹,¹⁰⁴.
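The assembly metrics and the size argument used in this part of the discussion can be illustrated with a short sketch. It is not the pipeline used in the study (which relied on assembly-stats and flow-cytometry estimates); the function names are hypothetical, and treating the 2,355 Mb figure as the full tetraploid estimate is an assumption made only for this illustration.

```python
from typing import Iterable


def contig_n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


def gc_percent(sequences: Iterable[str]) -> float:
    """Return the GC percentage over unambiguous bases (A, C, G, T);
    N and other symbols are ignored, soft-masked bases are counted."""
    gc = at = 0
    for seq in sequences:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        at += upper.count("A") + upper.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases supplied")
    return 100.0 * gc / (gc + at)


def implied_ploidy(total_mb: float, monoploid_mb: float) -> float:
    """Ratio of a total genome size estimate to the monoploid size.
    Assumption: total_mb covers all chromosome sets of the plant."""
    return total_mb / monoploid_mb


if __name__ == "__main__":
    toy_contigs = ["GGCCGG", "ATGC", "AT"]  # toy data, not Stipa sequence
    print("N50:", contig_n50(len(c) for c in toy_contigs))
    print("GC %:", round(gc_percent(toy_contigs), 2))
    # Flow-cytometry values quoted in the text (Mb).
    print("implied ploidy:", round(implied_ploidy(2355, 589), 2))
    print("two monoploid genomes (Mb):", 2 * 589)
```

Under these assumptions the two flow-cytometry values differ by a factor of about four, in line with the tetraploid chromosome count, and an assembly of about twice the monoploid size is what one would expect if two divergent subgenomes were assembled separately while their homologous copies were collapsed.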
In general, the number of genes in Poaceae varies from 28,835 in the smallest known genome, Oropetium thomaeum Trin. (2n = 20; genome size of 245 Mb)¹¹¹, to 107,891 in T. aestivum (2n = 42; 14,547 Mb)¹¹². Here, we reported 53,535 nuclear genes that were structurally annotated for the masked genome assembly. This number is roughly 1.8 and 1.6 times lower than previously determined for S. grandis (94,674 genes)⁵⁴ and S. purpurea (84,298 genes)⁵⁰, respectively. On the other hand, annotation of the unmasked genome resulted in 81,224 genes associated with already known proteins. In comparison, only 65,047 functionally annotated genes were reported for S. grandis, while S. purpurea had 58,966. Nonetheless, as RNA-seq data are currently unavailable for S. capillata, we believe that the current version of the genome annotation requires further investigation to characterise the gene sets properly once the appropriate information becomes available.

SSR markers are widely distributed across the genome and are commonly applied in establishing genetic structure in Stipa. Previously, polymorphic microsatellite primers were reported in populations of S. purpurea (11¹¹³, 15¹¹⁴ and 29¹¹⁵ loci), S. pennata (7 loci¹¹⁶), S. breviflora (21 loci¹¹⁷) and S. glareosa (9 loci¹¹⁸). In the present study, we identified 77,614 perfect SSR markers (Supplementary File 3) and 58 imperfect repeat motifs present only in a single copy (Supplementary Table S3). Although we did not test them at the population level, we are confident that such a number of new loci will be a valuable resource for the further development of SSR markers in S. capillata, and more generally in the genus Stipa. Additionally, the revealed loci could be used for designing dominant inter simple sequence repeat (ISSR) markers¹¹⁹. Recently, the usefulness of ISSRs was shown in studies of S. bungeana¹²⁰, S. ucrainica and S. zalesskii¹²¹, S. tenacissima¹²² and the hybrid complex S.
heptapotamica¹²³.

According to previous studies based on three chloroplast loci¹²⁴ and on four chloroplast loci and one nuclear region¹⁰⁵, the origin of Stipeae can be estimated at 30.60–47.30 Mya and 21.20–39 Mya, respectively. Here, based on the five loci within NORs, we demonstrated that the potential split between Stipa, representing the tribe Stipeae, and Brachypodium (the tribe Brachypodieae) took place approximately 30–35.52 Mya, which supports the previous findings¹⁰⁵,¹²⁴,¹²⁵. The present results also suggest that the genus Stipa likely originated ca. 4.39 (2.90–6.02) Mya. On the other hand, one previous study indicated the origin of feather grasses at about 12.90 Mya¹²⁴, while another showed different estimates based on chloroplast loci (21.20 Mya, 13–22) and the At103 region¹⁰⁵. Specifically, two copies of At103 had the following suggested ages: 15.78 (6.30–26.60) Mya for the Eurasian Stipeae lineage and 5.62 (0–6.50) Mya for the American Stipeae lineage¹⁰⁵. Thus, the latter estimate is close to the origin age calculated in the current study. In addition, our data on the divergence time among S. richteriana, S. lessingiana and S. heptapotamica (Fig. 2 and Table 5) conform to the previous findings on the ongoing hybridisation among these taxa¹²³, suggesting that NORs are a useful tool for revealing species of putative hybrid origin. Nonetheless, we believe that the current and previous estimates regarding the origin of Stipa should be treated with caution. Firstly, to our knowledge, there are still no fossil data for any Stipa species from the Old World that could be used to calibrate the historical diversification of the genus properly. Currently, the earliest definite Stipa caryopses were found in central Poland and are dated to ca. 4,000 BC¹²⁶.
Secondly, available data demonstrate incongruence between chloroplast and nuclear loci analyses. In further studies we suggest utilising single-copy nuclear genes derived from whole genome sequencing projects. Thirdly, different sets of species and parameters used for inferring diversification dates may result in different estimates¹²⁷.

Finally, we report a 137,823 bp chloroplast genome that is similar to the known assemblies in Stipa and specifically in S. capillata. Here we highlight the applicability of a long-read sequencing technology such as PacBio for the straightforward assembly of plastomes using Flye⁶⁵,⁶⁶. In addition, owing to the long reads we were able to identify two haplotypes present in S. capillata. This result supports the previous findings in Poaceae⁸¹, suggesting that plastome structural heteroplasmy can be a common state in feather grasses. Moreover, for the first time in the genus Stipa, we present a 438,037 bp mitochondrial genome. The current size of this genome is close to those of Alloteropsis semialata (R.Br.) Hitchc. (442,063 bp)¹²⁸, T. aestivum (452,526 bp)¹²⁹, Sorghum bicolor L. (468,628 bp)¹³⁰ and A. speltoides (476,091 bp)¹³¹. Nevertheless, the present version of the genome consists of four contigs rather than one circular sequence. Although it is generally accepted among mitochondrial biologists that plant mitochondrial genomes have a variety of configurations¹³²⁻¹³⁴, we suggest reusing our data for a more comprehensive analysis of the mitochondrial structures within Stipa in order to verify whether a more accurate assembly can be achieved.

Materials and methods

Plant material and DNA extraction. Our research complies with relevant institutional, national, and international guidelines and legislation. A S. capillata sample from the Kochkor River Valley, central Kyrgyzstan (Supplementary Table S4), was selected for genome sequencing. The sample was stored in silica gel at ambient temperature until DNA extraction was performed.
Total genomic DNA was isolated from dried leaves after a six-month storage period using a CTAB large-scale DNA extraction protocol (Supplementary information S1, described in Supplementary File 6). DNA extraction was performed by SNPsaurus (USA). In addition, we isolated DNA from dried leaves using a Genomic Mini AX Plant Kit (A&A Biotechnology, Poland). Subsequently, quality check, quantification and concentration adjustment were carried out using a NanoDrop One (Thermo Scientific, USA) and agarose gel electrophoresis. The concentration of the sample was adjusted to 50 ng/µL. The purified DNA sample (1 µg) was sent to Diversity Arrays Technology Pty Ltd (Canberra, Australia) for sequencing and DArT marker identification. Moreover, to test the phylogenetic power of NORs in Stipa, we supplemented the study with five specimens of S. richteriana Kar. & Kir., three of S. lessingiana Trin. & Rupr., four of S. heptapotamica Golosk. and four of S. korshinskyi Roshev. (Supplementary Table S4). Genomic DNA was isolated from dried leaf tissues using a modified CTAB method¹³⁵.

Library construction and sequencing. In total, 5 µg of S. capillata genomic DNA was used to construct a PacBio library according to the 20 kb PacBio template preparation protocol, omitting the shearing step. The size selection cut-off was set at 15 kb. Library preparation followed by sequencing on three PacBio Sequel SMRT cells (Pacific Biosciences, Menlo Park, CA, USA) was carried out by SNPsaurus, LLC. Prior to the assembly, reads from each SMRT cell were inspected and quality metrics were calculated using SequelQC v.1.1.0¹³⁶. A high-density assay using the DArT complexity reduction method for S. capillata was performed according to a previously reported procedure¹³⁷.
For the rest of the specimens used in the current study, quality control using a fluorometer (PerkinElmer Victor3, USA) and gel electrophoresis, library construction using a TruSeq Nano DNA Library kit (350 bp insert size; Illumina, USA) and sequencing using 100 bp paired-end reads on an Illumina HiSeq 2500 platform (Illumina, USA) were performed by Macrogen Inc. (South Korea).

Nuclear genome assembly and validation. This work involved many software tools, whose versions, settings and parameters are described in Supplementary information S2 (available in Supplementary File 6). The de novo assembly of the PacBio data was performed using Flye v.2.4⁶⁵,⁶⁶. The draft assembly was cleaned by running BLASTn v.2.10.0¹³⁸ against the NCBI nucleotide database v.5 and subsequently sending each BLAST hit to the JGI taxonomy server (https://taxonomy.jgi-psf.org/), with a downstream step of keeping only plant contigs. Thereafter, Qualimap v.2.2.2¹³⁹ was used to determine the mean coverage of each contig. In the final assembly we kept only contigs with an average coverage of more than 10x. In addition, overrepresented contigs (>60x) were BLASTed against the NCBI nucleotide database v.5, and sequences assigned to chloroplasts and mitochondria were removed. Because the final assembly produced with Flye v.2.4 was roughly twice the expected monoploid genome size of 589 Mb⁹³, we performed an additional assembly with FALCON v.0.2.5⁶⁸ and applied Purge Haplotigs v.1.1.1⁶⁹ to filter sequences that were redundant due to possible heterozygosity. The assembly statistics were analysed using assembly-stats v.1.0.1¹⁴⁰. In addition, to assess the completeness of the genome assemblies, we investigated the presence of highly conserved orthologous genes using BUSCO v.4.0.6¹⁴¹.

Scaffolding of contigs.
Since no reference genome is available for any Stipa species, we applied RaGOO v.1.1¹⁴² to verify whether reference-guided scaffolding could be performed for the draft genome contigs based on four genomes from the Pooideae subfamily (B. distachyon⁷⁰, H. vulgare⁷¹, A. tauschii⁷², T. aestivum⁷⁴) and one genome from the Oryzoideae subfamily (O. sativa⁷³). The subsequent assessment of the scaffolding accuracy was based on three parameters: (1) location confidence score, (2) orientation confidence score and (3) grouping confidence score¹⁴².

Repeat prediction and nuclear genome annotation. The repeat prediction for S. capillata was performed using the de novo transposable element (TE) family identification and modeling package RepeatModeler v.2.0.1¹⁴³, which includes three repeat-finding programs: RECON¹⁴⁴, RepeatScout¹⁴⁵ and TRF¹⁴⁶. The resulting TE library was supplemented by the transposable elements database (Release 19, http://botserv2.uzh.ch/kelldata/trep-db/)¹⁴⁷. Subsequently, TE regions of the genome assembly were masked by RepeatMasker v.4.1.0¹⁴⁸ (http://repeatmasker.org) with the search engine RMBlast v.2.9.0+¹⁴⁹ and the custom library created in the previous step. Next, gene and protein sequences were predicted using Augustus v.3.2.3 with the unmasked and v.3.3.3¹⁵⁰ with the masked genome assemblies. The predicted protein sequences of the unmasked assembly were then BLASTed against the NCBI protein database v.5, and the resulting BLAST hit descriptions were added to GFF (General Feature Format) files.

Genome-wide identification of microsatellite markers. The unmasked nuclear genome, chloroplast and mitochondrial genome assemblies were screened for perfect mono-, di-, tri-, tetra-, penta- and hexa-nucleotide repeat motifs using Krait v.1.3.3⁷⁵.
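A screen for perfect microsatellites of this kind can be sketched in a few lines. The example below is only an illustration, not Krait's implementation: the function and class names are hypothetical, and the minimum repeat counts (12, 7, 5, 4, 4 and 4 for motif lengths of one to six bases) are the thresholds stated for this screen.

```python
from typing import List, NamedTuple

# Minimum number of tandem repeats required per motif length (bp).
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}
VALID_BASES = set("ACGT")


class SSR(NamedTuple):
    start: int    # 0-based start position in the sequence
    motif: str    # repeated unit
    repeats: int  # number of complete tandem copies


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit
    (so that ATAT is reported as AT, not as a tetranucleotide)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_perfect_ssrs(seq: str) -> List[SSR]:
    """Scan a sequence for perfect SSRs that meet MIN_REPEATS."""
    seq = seq.upper()
    n = len(seq)
    hits: List[SSR] = []
    for k, min_rep in MIN_REPEATS.items():
        i = 0
        while i + k <= n:
            motif = seq[i:i + k]
            if not set(motif) <= VALID_BASES or not is_primitive(motif):
                i += 1
                continue
            # Extend the run while the sequence stays periodic with period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            repeats = (j - i) // k
            if repeats >= min_rep:
                hits.append(SSR(i, motif, repeats))
                i += repeats * k  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits)
```

For instance, calling `find_perfect_ssrs` on a sequence containing twelve consecutive A bases reports a single mononucleotide SSR, whereas eleven consecutive A bases fall below the threshold and are not reported. The scan is quadratic in the worst case, so it is suited to illustration rather than to a whole-genome screen.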
We applied the following criteria: mono-nucleotide repeat motifs contain at least 12 repeats, di-nucleotide repeat motifs at least seven repeats, tri-nucleotide repeat motifs at least five repeats, and tetra-, penta- and hexa-nucleotide repeat motifs at least four repeats.

Divergence time of Stipa. In order to estimate the divergence between S. capillata and other Stipa species, we used the nucleolar organising regions (NORs). Firstly, we prepared a set of reference sequences including S. lipskyi Roshev.¹⁵¹, S. magnifica Junge¹⁵², S. narynica Nobis¹⁵³, S. caucasica Schmalh.¹⁵⁴, S. orientalis Trin.¹⁵⁵ and S. pennata L.¹⁵⁶. Secondly, we mapped raw reads of S. capillata, S. richteriana, S. lessingiana, S. heptapotamica and S. korshinskyi (Supplementary Table S2) as well as S. grandis⁵⁸, S. breviflora⁵⁹ and S. lagascae¹⁵⁷ to the reference set using Minimap2 v.2.17-r941¹⁵⁸, keeping only uniquely mapped reads with Samtools v.1.9¹⁵⁹. Thirdly, the de novo assembly of the NORs was performed using Canu v.2.0¹⁶⁰ for S. capillata and SPAdes v.3.14.1¹⁶¹ for the rest of the Stipa species. Additionally, we added to the analysis B. distachyon¹⁶² as an ingroup member of the Pooideae subfamily and O. sativa¹⁶³ as an outgroup representing the Oryzoideae subfamily within the Poaceae family. Next, all sequences were aligned using MAFFT v.7.471¹⁶⁴. Subsequently, the aligned sequences were visualised in AliView v.1.26¹⁶⁵ and divided into five loci: (1) 18S ribosomal RNA, (2) Internal Transcribed Spacer 1 (ITS1), (3) 5.8S ribosomal RNA, (4) Internal Transcribed Spacer 2 (ITS2) and (5) 26S ribosomal RNA (Supplementary File 4). Estimation of divergence times was performed in BEAST2 v.2.6.3¹⁶⁶ using the 121321 substitution model determined by bModelTest¹⁶⁷. We used the following constraints for time calibrations: 38–48 million years ago (Mya) for the Brachypodium-Oryza split¹⁰¹ and 33–39 Mya for the potential origin and divergence of Stipa³⁴,³⁵.
Then, the divergence time was estimated using the strict clock model and the Yule prior. In total, we ran the analysis three times independently, with 50 million Markov chain Monte Carlo (MCMC) generations for each run. The log and tree files were combined using LogCombiner v.2.6.3 (a part of the BEAST package), with the first five million generations of each run discarded as burn-in. Next, Tracer v.1.7.1¹⁶⁸ was used to check the Effective Sample Size (ESS) values in the log files. As all ESSs exceeded 200, we summarised the final maximum clade credibility tree (Supplementary File 5) in TreeAnnotator v.2.6.3 (a part of the BEAST package). The final tree was visualised and edited using FigTree v.1.4.4¹⁶⁹.

Mitochondrial and chloroplast genomes assembly, annotation and validation. Prior to assembly, we mapped raw reads to 11 reference mitochondrial genomes of species belonging to the Poaceae family (Supplementary Table S5) using Minimap2 v.2.17-r941¹⁵⁸. Only uniquely mapped reads were kept by Samtools v.1.9¹⁵⁹ for the next step. De novo mitochondrial assembly of the 4.08 Mb data was performed using Flye v.2.7.1-b1590. In the next step, we BLASTed the resulting contigs against the NCBI nucleotide database v.5, and sequences assigned to mitochondria were kept. Then, the PacBio subreads were mapped onto the retained contigs using Minimap2, and only uniquely mapped reads were retained by Samtools. A new de novo assembly of the 15.51 Mb data was performed using Flye. To check whether the mitochondrial contigs obtained by Flye could be merged into larger scaffolds, we applied Circlator v.1.5.5¹⁷⁰. However, the resulting sequences were identical to the Flye contigs. In addition, we used Unicycler v.0.4.8⁸⁴ with reads that were mapped onto the Flye contigs as a reference. Further, to detect all possible structural haplotypes of the chloroplast genome we applied Cp-hap⁸¹.
Next, we mapped raw reads onto the resulting mitochondrial contigs and the chloroplast genomes to check manually in IGV v.2.8.6⁸⁰ whether any potential SNPs or indels were present. Finally, the mitochondrial contigs (438,037 bp) and the chloroplast genomes (137,823 bp) were annotated using Geneious Prime v.2021.1.1 (https://www.geneious.com) based on 85% and 95% similarities to the reference genomes of mitochondria and chloroplasts, respectively (Supplementary Table S5).

In silico mapping of DArT marker sequences. Since the DArT markers are designed to target active regions of the genome¹⁷¹, here we use them to validate the completeness of the nuclear genome assembly and improve the accuracy of data filtering in further genomic studies on Stipa. Two data types, Silico and SNP markers, were mapped to the nuclear genome using BLASTn v.2.10.0. As a query we used trimmed DArT sequences in a range of 29–69 bp, requiring a percent identity to the reference genome of 95% or greater and removing alignments covering less than 95% of a query.

Data availability
The raw PacBio reads are available at NCBI Sequence Read Archive¹⁷². The final genome assemblies are deposited in the NCBI Assembly database under the following accession numbers: nuclear assembly (JAGXJF000000000)⁶⁷; mitochondrion assembly, contig 1 (MZ161090)⁷⁶, contig 2 (MZ161091)⁷⁷, contig 3 (MZ161093)⁷⁸ and contig 4 (MZ161092)⁷⁹; chloroplast assemblies, haplotype A (MZ146999)⁸² and haplotype B (MZ145043)⁸³. The masked and the unmasked versions of the nuclear genome annotation are presented in Supplementary File 1 and Supplementary File 2, respectively.

Received: 17 November 2020; Accepted: 24 June 2021; Published online: 28 July 2021

References
1. Initiative, T. A. G. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796-815.
https://doi.org/10.1038/35048692 (2000).
2. Moreau, H. et al. Gene functionalities and genome structure in Bathycoccus prasinos reflect cellular specializations at the base of the green lineage. Genome Biol. 13, R74. https://doi.org/10.1186/gb-2012-13-8-r74 (2012).
3. Hamaji, T. et al. Anisogamy evolved with a reduced sex-determining region in volvocine green algae. Commun. Biol. 1, 17. https://doi.org/10.1038/s42003-018-0019-5 (2018).
4. Rensing, S. A. et al. The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science 319, 64-69. https://doi.org/10.1126/science.1150646 (2008).
5. Bowman, J. L. et al. Insights into Land Plant Evolution Garnered from the Marchantia polymorpha Genome. Cell 171, 287-304. https://doi.org/10.1016/j.cell.2017.09.030 (2017).
6. Li, F. W. et al. Fern genomes elucidate land plant evolution and cyanobacterial symbioses. Nat. Plants 4, 460-472. https://doi.org/10.1038/s41477-018-0188-8 (2018).
7. Nystedt, B. et al. The Norway spruce genome sequence and conifer genome evolution. Nature 497, 579-584. https://doi.org/10.1038/nature12211 (2013).
8. Mosca, E. et al. A reference genome sequence for the European silver fir (Abies alba Mill): A community-generated genomic resource. G3: Genes, Genomes, Genetics 9, 2039-2049. https://doi.org/10.1534/g3.119.400083 (2019).
9. Amborella Genome Project. The Amborella genome and the evolution of flowering plants. Science 342, 1241089. https://doi.org/10.1126/science.1241089 (2013).
10. Strijk, J. S., Hinsinger, D. D., Zhang, F. & Cao, K. Trochodendron aralioides, the first chromosome-level draft genome in Trochodendrales and a valuable resource for basal eudicot research. GigaScience 8, 11. https://doi.org/10.1093/gigascience/giz136 (2019).
11. Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79-92. https://doi.org/10.1126/science.1068037 (2002).
12. Paterson, A. H. et al.
The Sorghum bicolor genome and the diversification of grasses. Nature 457, 551-556. https://doi.org/10.1038/nature07723 (2009).
13. International Wheat Genome Sequencing Consortium. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361, 705. https://doi.org/10.1126/science.aar7191 (2018).
14. Bennetzen, J. L. et al. Reference genome sequence of the model plant Setaria. Nat. Biotechnol. 30, 555-561. https://doi.org/10.1038/nbt.2196 (2012).
15. Studer, A. et al. The draft genome of the C3 panicoid grass species Dichanthelium oligosanthes. Genome Biol. 17, 223. https://doi.org/10.1186/s13059-016-1080-3 (2016).
16. Gordon, S. P. et al. Gradual polyploid genome evolution revealed by pan-genomic analysis of Brachypodium hybridum and its diploid progenitors. Nat. Commun. 11, 3670. https://doi.org/10.1038/s41467-020-17302-5 (2020).
17. Yagi, M. et al. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.). DNA Res. 21, 231-241. https://doi.org/10.1093/dnares/dst053 (2014).
18. Cai, J. et al. The genome sequence of the orchid Phalaenopsis equestris. Nat. Genet. 47, 65-72. https://doi.org/10.1038/ng.3149 (2015).
19. Kim, Y. M. et al. Genome analysis of Hibiscus syriacus provides insights of polyploidization and indeterminate flowering in woody plants. DNA Res. 24, 71-80. https://doi.org/10.1093/dnares/dsw049 (2017).
20. Li, L. et al. Genome sequencing and population genomics modeling provide insights into the local adaptation of weeping forsythia. Horticulture Res. 7, 130. https://doi.org/10.1038/s41438-020-00352-7 (2020).
21. Matasci, N. et al. Data access for the 1,000 Plants (1KP) project. Gigascience 3, 17. https://doi.org/10.1186/2047-217X-3-17 (2014).
22. Wickett, N. J. et al. Phylotranscriptomic analysis of the origin and early diversification of land plants. PNAS 111, 4859-4868. https://doi.org/10.1073/pnas.1323926111 (2014).
23. Leebens-Mack, J. H. et al.
One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574, 679-685. https://doi.org/10.1038/s41586-019-1693-2 (2019).
24. Cheng, S. et al. 10KP: A phylodiverse genome sequencing plan. GigaScience 7, giy013. https://doi.org/10.1093/gigascience/giy013 (2018).
25. Pellicer, J., Fay, M. F. & Leitch, I. J. The largest eukaryotic genome of them all?. Bot. J. Linn. Soc. 164, 10-15. https://doi.org/10.1111/j.1095-8339.2010.01072.x (2010).
26. Stevens, K. A. et al. Sequence of the sugar pine megagenome. Genetics 204, 1613-1626. https://doi.org/10.1534/genetics.116.193227 (2016).
27. Meyers, L. A. & Levin, D. A. On the abundance of polyploids in flowering plants. Evolution 60, 1198-1206. https://doi.org/10.1111/j.0014-3820.2006.tb01198.x (2006).
28. Flavell, R. B., Bennett, M. D., Smith, J. B. & Smith, D. B. Genome size and proportion of repeated nucleotide-sequence DNA in plants. Biochem. Genet. 12, 257-269 (1974).
29. Schnable, P. S. et al. The B73 maize genome: Complexity, diversity, and dynamics. Science 326, 1112-1115. https://doi.org/10.1126/science.1178534 (2009).
30. Daron, J. et al. Organization and evolution of transposable elements along the bread wheat chromosome 3B. Genome Biol. 15, 546. https://doi.org/10.1186/s13059-014-0546-4 (2014).
31. Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133-138. https://doi.org/10.1126/science.1162986 (2009).
32. Clarke, J. et al. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol. 4, 265-270. https://doi.org/10.1038/nnano.2009.12 (2009).
33. Jain, M., Olsen, H. E., Paten, B. & Akeson, M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 239. https://doi.org/10.1186/s13059-016-1103-0 (2016).
34. MacGinitie, H. D.
Fossil Plants Of The Florissant Beds, Colorado (Carnegie Institute of Washington Publication, 1953).
35. Manchester, S. R. Update on the megafossil flora of Florissant Colorado. Denver Museum Nat. Sci. 4, 137-161 (2001).
36. Freitag, H. The genus Stipa (Gramineae) in southwest and south Asia. Notes From Royal Botanic Garden 42, 355-489 (1985).
37. Barkworth, M. E. & Everett, J. Evolution in the Stipeae: identification and relationships of its monophyletic taxa. Grass systematics and evolution (eds. Söderström, T. R., Hilu, K. W., Campbell, C. S. & Barkworth, M. E.) 251-264 (Smithsonian Institution Press, 1987).
38. Hamasha, H. R., von Hagen, K. B. & Röser, M. Stipa (Poaceae) and allies in the Old World: molecular phylogenetics realigns genus circumscription and gives evidence on the origin of American and Australian lineages. Plant Syst. Evol. 298, 351-367. https://doi.org/10.1007/s00606-011-0549-5 (2012).
39. Kellogg, E. A. Subfamily Pooideae in The families and genera of vascular plants (ed. Kubitzki, K.) 199-229 (Springer International Publishing, 2015).
40. Nobis, M. Taxonomic revision of the Central Asiatic Stipa tianschanica complex (Poaceae) with particular reference to the epidermal micromorphology of the lemma. Folia Geobot. 49, 283-308. https://doi.org/10.1007/s12224-013-9164-2 (2014).
41. Romaschenko, K. et al. Systematics and evolution of the needle grasses (Poaceae: Pooideae: Stipeae) based on analysis of multiple chloroplast loci, ITS, and lemma micromorphology. Taxon 61, 18-44. https://doi.org/10.1002/tax.611002 (2012).
42. Nobis, M., Gudkova, P. D., Nowak, A., Sawicki, J. & Nobis, A. A synopsis of the genus Stipa (Poaceae) in Middle Asia, including a key to species identification, an annotated checklist, and phytogeographic analyses. Ann. Mo. Bot. Gard. 105, 1-63. https://doi.org/10.3417/2019378 (2020).
43. Yunatov, A. A. Main patterns of the vegetation cover of the Mongolian peoples republic. Proc. Mongolian Commission 39, 233 (1950).
44.
Lavrenko, E. M., Karamasheva, Z. V. & Nikulina, R. I. Eurasian steppe. 143 (Nauka, 1991).
45. Nowak, A., Nowak, S., Nobis, A. & Nobis, M. Vegetation of feather grass steppes in the western Pamir Alai Mountains (Tajikistan, Middle Asia). Phytocoenologia 46, 295-315. https://doi.org/10.1127/phyto/2016/0145 (2016).
46. Danzhalova, E. V. et al. Indicators of pasture digression in steppe ecosystems of Mongolia. Exploration Biol. Resour. Mongolia 12, 297-306 (2012).
47. Maevsky, V. V. & Amerkhanov, H. H. The note of Poaceae species from former USSR flora, recommended as fodder for agricultural production. Bull. Botanical Garden Saratov State Univ. 6, 80-83 (2007).
48. Brunetti, G., Soler-Rovira, P., Farrag, K. & Senesi, N. Tolerance and accumulation of heavy metals by wild plant species grown in contaminated soils in Apulia region Southern Italy. Plant Soil 318, 285-298. https://doi.org/10.1007/s11104-008-9838-3 (2009).
49. Moameri, M. et al. Investigating lead and zinc uptake and accumulation by Stipa hohenackeriana Trin and Rupr in field and pot experiments. Biosci. J. 34, 138-150. https://doi.org/10.14393/BJ-v34n1a2018-37238 (2018).
50. Yang, Y. Q. et al. Transcriptome analysis reveals diversified adaptation of Stipa purpurea along a drought gradient on the Tibetan Plateau. Funct. Integr. Genomics 15, 295-307. https://doi.org/10.1007/s10142-014-0419-7 (2015).
51. Lv, X., He, Q. & Zhou, G. Contrasting responses of steppe Stipa ssp to warming and precipitation variability. Ecol. Evolut. 9, 9061-9075. https://doi.org/10.1002/ece3.5452 (2019).
52. Schubert, M., Gronvold, L., Sandve, S. R., Hvidsten, T. R. & Fjellheim, S. Evolution of cold acclimation and its role in niche transition in the temperate grass subfamily Pooideae. Plant Physiol. 180, 404-419. https://doi.org/10.1104/pp.18.01448 (2019).
53. NCBI BioSample, https://www.ncbi.nlm.nih.gov/biosample/?term=SAMN03178190 (2014).
54. Wan, D. et al.
De novo assembly and transcriptomic profiling of the grazing response in Stipa grandis. PLoS ONE 10, e0122641. https://doi.org/10.1371/journal.pone.0122641 (2015).
55. NCBI Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra/?term=SRP051667 (2020).
56. ArrayExpress, https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-5300 (2020).
57. Krawczyk, K., Nobis, M., Myszczyński, K., Klichowska, E. & Sawicki, J. Plastid superbarcodes as a tool for species discrimination in feather grasses (Poaceae: Stipa). Sci. Rep. 8, 1924. https://doi.org/10.1038/s41598-018-20399-w (2018).
58. NCBI Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra/SRR8208353 (2020).
59. NCBI Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra/SRS3290204 (2020).
60. Krawczyk, K., Nobis, M., Nowak, A., Szczecińska, M. & Sawicki, J. Phylogenetic implications of nuclear rRNA IGS variation in Stipa L (Poaceae). Sci. Rep. 7, 11506. https://doi.org/10.1038/s41598-017-11804-x (2017).
61. Wagner, V. et al. Similar performance in central and range-edge populations of a Eurasian steppe grass under different climate and soil pH regimes. Ecography 34, 498-506. https://doi.org/10.1111/j.1600-0587.2010.06658.x (2011).
62. Wagner, V., Durka, W. & Hensen, I. Increased genetic differentiation but no reduced genetic diversity in peripheral vs. central populations of a steppe grass. Am. J. Botany 98, 1173-1179. https://doi.org/10.3732/ajb.1000385 (2011).
63. Durka, W. et al. Extreme genetic depauperation and differentiation of both populations and species in Eurasian feather grasses (Stipa). Plant Syst. Evol. 299, 259-269. https://doi.org/10.1007/s00606-012-0719-0 (2013).
64. Kirschner, P. et al. Long-term isolation of European steppe outposts boosts the biome's conservation value. Nat. Commun. 11, 1968. https://doi.org/10.1038/s41467-020-15620-2 (2020).
65. Lin, Y. et al. Assembly of long error-prone reads using de Bruijn graphs. Proc. Natl. Acad. Sci. 113, 8396-8405. https://doi.org/10.1073/pnas.
1604560113 (2016).
66. Kolmogorov, M., Yuan, J., Lin, Y. & Pevzner, P. Assembly of long, error-prone reads using repeat graphs. Nat. Biotechnol. 37, 540-546. https://doi.org/10.1038/s41587-019-0072-8 (2019).
67. NCBI Assembly, https://www.ncbi.nlm.nih.gov/assembly/JAGXJF000000000 (2021).
68. Chin, C.-H. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050-1054. https://doi.org/10.1038/nmeth.4035 (2016).
69. Roach, M. J., Schmidt, S. A. & Borneman, A. R. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC Bioinf. 19, 460. https://doi.org/10.1186/s12859-018-2485-7 (2018).
70. NCBI Assembly, https://www.ncbi.nlm.nih.gov/assembly/GCF_000005505.3 (2021).
71. NCBI Assembly, https://www.ncbi.nlm.nih.gov/assembly/GCA_903813605.1 (2021).
72. NCBI Assembly, https://www.ncbi.nlm.nih.gov/assembly/GCA_002575655.1 (2021).
73. NCBI Assembly, https://www.ncbi.nlm.nih.gov/assembly/GCF_001433935.1 (2021).
74. NCBI Assembly, https://www.ncbi.nlm.nih.gov/assembly/GCA_002220415.3 (2021).
75. Du, L., Zhang, C., Liu, Q., Zhang, X. & Yue, B. Krait: an ultrafast tool for genome-wide survey of microsatellites and primer design. Bioinformatics 34, 681-683. https://doi.org/10.1093/bioinformatics/btx665 (2018).
76. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MZ161090 (2021).
77. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MZ161091 (2021).
78. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MZ161093 (2021).
79. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MZ161092 (2021).
80. Robinson, J. T., Thorvaldsdottir, H., Wenger, A. M., Zehir, A. & Mesirov, J. P. Variant review with the integrative genomics viewer. Can. Res. 77, 31-34. https://doi.org/10.1158/0008-5472.CAN-17-0337 (2017).
81. Wang, W. & Lanfear, R.
Long-reads reveal that the chloroplast genome exists in two distinct versions in most plants. Genome Biol. Evol. 11. 3372-3381. https://doi.org/10.1093/gbe/evz256 (2019). 82. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MZ146999 (2021). 83. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MZ145043 (2021). 84. Wick, R. R., Judd, L. M., Gorrie, C. L. 8c Holt, K. E. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput. Biol. 13, 1-22. https://doi.org/10.1371/journal.pcbi.1005595 (2017). 85. Wick, R. R., Schultz, M. B., Zobel, J. 8c Holt, K. E. Bandage: interactive visualisation of de novo genome assemblies. Bioinformatics 31, 3350-3352. https://doi.org/10.1093/bioinformatics/btv383 (2015). 86. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/NC_037026.! (2020). 87. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/MG052599.1 (2020). 88. Ghurye, j., Pop, M., Koren, S., Bickhart, D. 8c Chen-Shan, C. Scaffolding of long read assemblies using long range contact information. BMC Genom. 18. 527. https://doi.0rg/lO.l 186/s 12864-017-3879-z (2017). 89. Carballo, J. et al. A high-quality genome of Eragrostis curvula grass provides insights into Poaceae evolution and supports new strategics to enhance forage quality. Sci. Rep. 9, 10250. https://doi.org/10.1038/s41598-019-46610-0 (2019). 90. Chen, B. et al. The sequencing and de novo assembly of the Larimichthys crocea genome using PacBio and Hi-C technologies. Scientific Data 6, 188. https://doi.org/l0.1038/s41597-019-0194-3 (2019). 91. Shan, T. et al. First genome of the brown alga Undaria pinnatifida: Chromosome-level assembly using PacBio and Hi-C technologies. Front. Genet. 11, 140. https://doi.org/10.3389/fgene.2020.00140 (2020). 92. Ou, S. et al. Effect of sequence depth and length in long-read assembly of the maize inbred NC358. Nat. Commutt. 11, 2288. https://doi.org/10.1038/s41467-020-16037-7 (2020). 93. Smarda, P. et al. 
Genome sizes and genomic guanine + cytosine (GC) contents of the Czech vascular flora with new estimates for 1700 species. Preslia 91. 117-142. https://doi.Org/10.23855/preslia.2019.l 17 (2019). 94. Singh. R., Ming, R. & Yu, Q. Comparative analysis of GC content variations in plant genomes. Tropical Plant Biol. 9, 136-149. https://doi.org/10.1007/sl2042-016-9165-4 (2016). 95. Smarda, P. et al. Ecological and evolutionary significance of genomic GC content diversity in monocots. Proc. Natl. Acad. Sci. 111,4096-4102. https://doi.org/10.1073/pnas. 132II52111 (2014). 96. BureS, P. et al. Correlation between GC content and genome size in plants. Cytometry A 71,764 (2007). 97. Grover, C. E. 8c Wendel, J. F. Recent insights into mechanisms of genome size change in plants. /. Bot. 2010.382732. https://doi. org/10.1155/2010/382732 (2010). 98. Zuccolo, A. et al. Transposable element distribution, abundance and role in genome size variation in the genus Oryza. BMC Evol. Biol. 7. 152. https://doi.0rg/lO.l 186/1471-2148-7-152 (2007). 99. Vogel, J. et al. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463, 763-768. https://doi. org/10.1038/nature08747 (2010). 100. Sasaki, T. The map-based sequence of the rice genome. Nature 436, 793-800. https://doi.org/10.1038/nature03895 (2005). 101. Wu, Z. et al. De novo genome assembly of Oryza granulata reveals rapid genome expansion and adaptive evolution. Commun. Biol. 1,84. https://doi.org/10.1038/s42003-018-0089-4 (2018). 102. Liu, Q. et al. The repetitive DNA landscape in Avena (Poaceae): Chromosome and genome evolution defined by major repeat classes in whole-genome sequence reads. BMC Plant Biol. 19, 226. https://doi.0rg/lO.l 186/s 12870-019-1769 z (2019). 103. Wicker, T. et al. Impact of transposable elements on genome structure and evolution in bread wheat. Genome Biol. 19, 103. https://doi.org/10.1186/s 13059-018-1479-0 (2018). 104. Tkach, N. et al. 
Molecular phylogenetics and micromorphology of Australasian Stipeae (Poaceae), and the interrelation of whole-genome duplication and evolutionary radiations in this grass tribe. Front. Plant Sci. 11, 630788, https://doi.org/10.3389/ fpls.2020.630788 (2021). 105. Romaschenko, K. et al. Miocene-Pliocene speciation, introgression, and migration of Palis and Ptilagrostis (Poaceae: Stipeae). Mol. Phylogenet. Evol. 70, 244-259. https://doi.Org/10.1016/j.ympev.2013.09.018 (2014). 106. Matsuoka. Y., Takumi, S. 8c Nasuda. S. Genetic mechanisms of allopolyploid speciation through hybrid genome doubling: novel insights from wheat (Triticum and Aegilops) studies. Int. Rev. Cell Mol. Biol. 309, 199-258. https://doi.org/10.I016/b978-0-12-800255-1.00004-1 (2014). 107. Tzvelev, N. N. On the origin and evolution of the feathergrasses (Stipa L.). Problems of ecology, geobotany, botanical geography and floristics (eds. Lebedev, D. V. 8c Karamysheva, Z. V.) 139-150 (Academiya Nauk SSSR, 1977). 108. Clayton, W. D. 8c Renvoize, S. A. Genera Graminum. Kew Bull. Additional Ser. 13, 1 -389 (1986). 109. Hilu, K. W. Phylogenetics and chromosomal evolution in the Poaceae (grasses). Aust. J. Bot. 52, 13-22. https://doi.org/10.1071/ BT03103 (2004). 110. Avdulov, N. P. Karyo-systcmatische Untcrsuchung der Familie Gramineen. Bull. Appl. Bot. Genet. Plant Breed. 43,1-352 (1931). 111. VanBuren, R., Wai, C. M., Keilwagen, J. & Pardo, J. A chromosome-scale assembly of the model desiccation tolerant grass Oropetium thomaeum. Plant Direct 2, e00096. https://doi.org/l0.1002/pld3.96 (2018). 112. Appels, R. et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361, eaar7191. https://doi.org/10.1126/science.aar7191 (2018). 113. Liu, W. et al. Morphological and genetic variation along a North-to-South transect in Stipa purpurea, a dominant grass on the qinghai-tibetan plateau: implications for response to climate change. PLoS ONE 11, e0161972. 
https://doi.org/10.1371/journal. pone.0161972 (2016). 114. Liu, W., Liao, H„ Zhou, Y., Zhao, Y. 8c Song, Z. Microsatellite primers in Stipa purpurea (Poaceae), a dominant species of the steppe on the Qinghai-Tibetan Plateau. Am. J. Bot. 98, el50-el51. https://doi.org/10.3732/ajb.1000444 (2011). 115. Yin, X., Yang, Y. 8c Yang, Y. Development and characterization of 29 polymorphic EST-SSR markers for Stipa purpurea (Poaceae). Appl. Plant Sci. 4, 1600027. https://doi.org/10.3732/apps. 1600027 (2016). 116. Klichowska, E., Slipiko, M., Nobis, M. 8c Szczecinska, M. Development and characterization of microsatellite markers for endangered species Stipa pennata (Poaceae) and their usefulness in intraspecific delimitation. Mol. Biol. Rep. 45,639-643. https://doi. org/10.1007/s 11033-018-4192-x (2018). 117. Ren, J. et al. Development and characterization of EST-SSR markers in Stipa brevijlora (Poaceae). Applications in Plant Sciences 5. 1600157. https://doi.org/10.3732/apps.1600157 (2017). 118. Oyundelger, K. et al. Climate and land use affect genetic structure of Stipa glareosa P. A. Smirn. in Mongolia. Flora 266, 151572. https://doi.Org/10.1016/j.flora.2020.151572 (2020). 119. Zietkiewicz, E., Rafalski, A. 8c Labuda, D. Genome fingerprinting by simple sequence repeat (SSR)-anchored polymerase chain reaction amplification. Genomics 20, 176-183. https://doi.org/10.1006/geno.1994.U51 (1994). (2021)11:15345 I https://doi.org/10.1038/s41598-021-%068-w nature portfolio 52 www.nature.com/scientificreports/ Scientific Reports | 120. Yu, J., |ing, Z. B. 8c Cheng, J. M. Genetic diversity and population structure of Stipa bungeana, an endemic species in Loess Plateau of China, revealed using combined ISSR and SRAP markers. Genet. Mol. Res. 13, 1097-1108. https://doi.org/10.4238/ 2014.February.20.11 (2014). 121. Kopylov-Guskov, Y. O. 8c Kramina, T. E. Investigating of Stipa ucrainica m Stipa zalesskii (Poaceae) from Rostov Oblast using morphological and ISSR analyses. Bull. 
Moscow Soc. Nat. Biol. Ser. 119, 46-53 (2014). 122. Boussaid, M„ Benito, C., Harche, M., Naranjo, T. 8c Zedek, M. Genetic variation in natural populations of Stipa tenacissima from Algeria. Biochem. Genet. 48, 857-872. https://doi.org/10.1007/sl0528-010-9367-7 (2010). 123. Nobis, M. et al. Hybridisation, introgression events and cryptic speciation in Stipa (Poaceae): a case study of the Stipa heptapotamica hybrid-complex. Pcrspect. Plant Ecol. Evolut. Syst. 39, 125457. https://doi.org/10.1016/j.ppees.2019.05.001 (2019). 124. Schubert, M., Marcussen, T., Meseguer, A. S. & Fjellheim, S. The grass subfamily Pooideae: Cretaceous-Palaeocene origin and climate-driven Cenozoic diversification. Glob. Ecol. Biogeogr. 28, 1168-1182. https://doi.Org/10.l 111/geb. 12923 (2019). 125. Hodkinson, T. R. Evolution and taxonomy of the grasses (Poaceae): a model family for the study of species-rich groups. Annual Plant Rev. Online I, 39. https://doi.org/10.1002/9781119312994.apr0622 (2018). 126. Mueller-Bieniek, A., Kittel, P., Muzolf, B., Cywa. K. 8c Muzolf, P. Plant macroremains from an early Neolithic site in eastern Kuyavia, central Poland. Acta Palaeobotanica 56, 79-89. https://doi.org/10.1515/acpa-2016-0006 (2016). 127. Brown, R. P. & Yang, Z. Rate variation and estimation of divergence times using strict and relaxed clocks. BMC Evol. Biol. 11, 271. https://doi.org/10.1186/1471-2148-ll-271 (2011). 128. NCBI Nucleotide, https://www.ncbi.nlm.nih.gOv/nuccore/MH644808.l (2020). 129. NCBI Nucleotide, https://www.ncbi.nlm.nih.goV/nuccore/MH051716.l (2020). 130. NCBI Nucleotide, https://www.ncbi.nlm.nih.goV/nuccore/NC_008360.l (2020). 131. NCBI Nucleotide, https://www.ncbi.nlm.nih.goV/nuccore/NC_022666.l (2020). 132. Bendich, A. J. Structural analysis of mitochondrial DNA molecules from fungi and plants using moving pictures and pulsed-field gel electrophoresis. J. Mol. Biol. 255, 564-588. https://doi.org/10.1006/jmbi.1996.0048 (1996). 133. Cheng. N. et al. 
Correlation between mtDNA complexity and mtDNA replication mode in developing cotyledon mitochondria during mung bean seed germination. New Phytol. 213, 751-763. https://doi.0rg/lO.l 111/nph. 14158 (2017). 134. Kozik, A. et al. Ihe alternative reality of plant mitochondrial DNA: One ring does not rule them all. PLoS Genet. 15, el 008373. https://doi.org/10.1371 /journal.pgen. 1008373 (2019). 135. Doyle, I. J. 8c Doyle, J. L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 19, 11-15 (1987). 136. Hufnagel, D. E., Hufford, M. B. 8c Seetharam, A. S. SequelTools: a suite of tools for working with PacBio Sequel raw sequence data. BMC Bioinf. 21, 429. https://doi.0rg/lO.l 186/s 12859-020-03751-8 (2020). 137. Baiakhmetov, E., Nowak, A., Gudkova, P. D. 8c Nobis, M. Morphological and genome-wide evidence for natural hybridisation within the genus Stipa (Poaceae). Sci. Rep. 10, 13803. https://doi.org/10.1038/s41598-020-70582 1 (2020). 138. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinf. 10, 421. https://doi.Org/10.l 186/1471 -2105-10-421 (2009). 139. Okonechnikov, K., Concsa, A. 8c Garcia-Alcalde, F. Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics 32, 292-294. https://doi.org/10.1093/bioinformatics/btv566 (2016). 140. Hunt, M.: Assembly statistics from FASTA and FASTQ files (Version 1.0.1). Github https://github.com/sanger-pathogens/asscm blv-st.itv (2014). 141. Scppey, M., Manni, M. 8c Zdobnov, E. M. BUSCO: Assessing genome assembly and annotation completeness. Methods Mol. Biol. 227-245,2019. https://doi.org/10.1007/978-l-4939-9173-0_14 (1962). 142. Alongé, M. et al. RaGOO: fast and accurate reference-guided scaffolding of draft genomes. Genome Biol. 20, 224. https://doi. org/10.1186/sl3059-019-1829-6 (2019). 143. Flynn, I. M. et al. RcpeatModeler2: Automated genomic discovery of transposable element families. PNAS 117, 9451-9457. 
https://doi.org/10.1073/pnas.1921046117 (2020). 144. Bao, Z. 8c Eddy, S. R. Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res. 12, 1269-1276. https://doi.0rg/lO.l 101/gr.88502 (2002). 145. Price, A. L, Jones, N. C. 8c De Pevzner, P. A. novo identification of repeat families in large genomes. Bioinformatics 21, 351-358. https://doi.org/10.1093/bioinformatics/bti 1018 (2005). 146. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573-580. https://doi.org/10. 1093/nar/27.2.573 (1999). 147. Wicker, T., Matthews, D. E. 8c Keller, B. TREP: a database for Triticeac repetitive elements. Trends Plant Sci. 7, 561-562. https:// doi.org/10.1016/S 1360-1385(02)02372-5 (2002). 148. Smit, A. F. A, Hubley, R. 8c Green, P. RepeatMasker Opcn-4.0, http://www.repeatmaskcr.org, (2020). 149. Boratyn, G. M. et al. Domain enhanced lookup time accelerated BLAST. Biol. Direct 7,12. https://doi.org/10.1186/1745-6150-7-12(2012). 150. Stanke, M., Diekhans, M., Baertsch, R. 8c Haussler, D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics 24.637-644. https://doi.org/10.1093/bioinformatics/btn013 (2008). 151. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/KY826233 (2020). 152. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/KY826234 (2020). 153. NCBI Nucleotide, https://www.ncbi.nIm.nih.gov/nuccore/KY826235 (2020). 154. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/KY826229 (2020). 155. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/KY826231 (2020). 156. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccorc/KY826232 (2020). 157. NCBI Sequence Read Archive, https://www.ncbi.nlm.nih.gov/sra/ERR 1744610 (2020). 158. Li. H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34. 3094-3100. https://doi.org/10.I093/bioin formatics/bty!91 (2018). 159. Li, H. et al. The sequence alignment/map format and SAMtools. 
Bioinformatics 25, 2078-2079. https://doi.org/10.1093/bioin formatics/btp352 (2009). 160. Koren, S., Walenz, B. P., Berlin. K., Miller, J. R. 8c Phillippy, A. M. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27,722-736. https://doi.Org/10.1101/gr.215087.l 16 (2017). 161. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19. 455-477. https://doi.org/10.1089/cmb.2012.0021 (2012). 162. NCBI Nucleotide, https://www.ncbi.nlm.nih.gov/nuccorc/NC_016135.3?report=fasta8cfrom=1640208cto= 167409 (2020). 163. NCBI Nucleotide, https://www.ncbi.nlm.nih.goV/nuccore/KM036284.l (2020). 164. Katoh, K. 8c Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30. 772-780. https://doi.org/10.1093/molbev/mst010 (2013). 165. Larsson. A. AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics 30, 3276-3278. https:// doi.org/ 10.1093/bioinformatics/btu531 (2014). 166. Bouckaert, R. et al. BEAST 25: An advanced software platform for Bayesian evolutionary analysis. PLOS Comput. Biol. 15. 1006650. https://doi.org/10.1371/journal.pcbi.1006650 (2019). (2021)11:15345 I https://doi.org/10.1038/s41598-021-%068-w nature portfolio 53 www.nature.com/scientificreports/ 167. ßouckaert, R. & Drummond, A. bModelTest: Bayesian phylogenetic site model averaging and model comparison. BMC Evol. Biol. 17, 42. https://doi.org/10.1186/sl2862-017-0890-6 (2017). 168. Rambaut, A., Drummond, A. J., Xie, D., Baele, G. & Suchard, M. A. Posterior summarization in Bayesian phylogenetics using tracer 17. System. Biol. 67,901-904. https://doi.org/10.1093/sysbio/syy032 (2018). 169. Rambaut, A. Figtree vl.4.4 https://tree.bio.ed.ac.uk/software/figtree (2018). 170. Hunt, M. et al. Circlator: automated circularization of genome assemblies using long sequencing reads. 
Acknowledgements
We would like to express our gratitude to Eric Johnson from SNPsaurus, Artem Kasianov from the Institute for Information Transmission Problems of the Russian Academy of Sciences (Moscow, Russia) and Igor A. Shmakov from Altai State University (Barnaul, Russia) for their valuable assistance with the genome assembly. We also thank the iDiv High-Performance Computing cluster for providing computing resources for this paper. Finally, we thank two anonymous reviewers for providing valuable comments on the manuscript. The study was supported by the Russian Science Foundation (grant no. 19-74-10067). E.B. was supported via the RSF (grant no. 19-74-10067) and a DS grant of the Jagiellonian University (DS/D/WB/IB/2/2019). M.N. was supported by the National Science Centre, Poland (grant no. 2018/29/B/NZ9/00313). P.D.G. was supported by the RSF (grant no. 19-74-10067). The open-access publication of this article was funded by the BioS Priority Research Area under the program "Excellence Initiative - Research University" at the Jagiellonian University in Krakow.

Author contributions
E.B., P.D.G. and M.N. planned the study. E.B. supervised the research. M.N. and P.D.G. identified and collected biological samples. E.B., C.G. and E.S. performed the nuclear genome assembly. E.B. performed the remaining bioinformatic analyses and wrote the manuscript. All authors revised the draft, provided comments and approved the final manuscript.

Competing interests
The authors declare no competing interests.
Additional information
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1038/s41598-021-94068-w.
Correspondence and requests for materials should be addressed to E.B. or M.N.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2021
Scientific Reports | (2021) 11:15345 | https://doi.org/10.1038/s41598-021-94068-w

Supplementary material

The first draft genome of feather grasses using SMRT sequencing and its implications in molecular studies of Stipa

Evgenii Baiakhmetov1,2,*, Cervin Guyomar3,4, Ekaterina Shelest3,5, Marcin Nobis1,2,*, Polina D. Gudkova2,6

1 Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa 3, 30-387 Kraków, Poland
2 Research laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave., 634050 Tomsk, Russia
3 German Centre for Integrative Biodiversity Research (iDiv), Puschstrasse 4, 04103 Leipzig, Germany
4 IGEPP, Agrocampus Ouest, INRAE, University of Rennes 1, 35650 Le Rheu, France
5 Centre for Enzyme Innovation, University of Portsmouth, PO1 2UP Portsmouth, UK
6 Department of Biology, Altai State University, Lenin 61 Ave., 656049 Barnaul, Russia

*Corresponding authors:
Evgenii Baiakhmetov1,2, Gronostajowa 3, Kraków, 30-387 Kraków, Poland. Email address: evgenii.baiakhmetov@doctoral.uj.edu.pl
Marcin Nobis1,2, Gronostajowa 3, Kraków, 30-387 Kraków, Poland. Email address: m.nobis@uj.edu.pl

Supplementary Table S1. Statistics for the PacBio long-reads dataset.
Read type | Read bases | Read number | Read length (max) | Read length (mean) | Read length (N50)
Subreads, cell 1 | 9,748,472,200 | 907,975 | 170,673 | 10,736 | 17,383
Subreads, cell 2 | 7,384,978,143 | 813,520 | 102,206 | 9,078 | 15,472
Subreads, cell 3 | 8,706,147,385 | 785,976 | 205,974 | 11,077 | 17,542

Supplementary Table S2. Average RaGOO confidence scores for the draft genome of S. capillata assigned to chromosomes of the five reference species.
Chromosome | Reference species | No. of assigned contigs | Location confidence score | Orientation confidence score | Grouping confidence score
1 | Brachypodium distachyon (BD) | 1,312 | 26.06% | 84.59% | 73.23%
1 | Hordeum vulgare (HV) | 519 | 22.53% | 82.46% | 59.78%
1 | Aegilops tauschii (AT) | 528 | 24.15% | 84.81% | 57.81%
1 | Oryza sativa (OS) | 467 | 34.22% | 92.80% | 73.72%
1 | Triticum aestivum (TA) | 108 | 36.86% | 90.34% | 42.60%
2 | BD | 794 | 28.21% | 89.27% | 74.85%
2 | HV | 701 | 19.56% | 82.32% | 59.04%
2 | AT | 728 | 17.47% | 80.03% | 57.87%
2 | OS | 360 | 36.79% | 91.82% | 71.46%
2 | TA | 76 | 34.91% | 91.03% | 43.09%
3 | BD | 813 | 31.57% | 89.29% | 74.38%
3 | HV | 558 | 21.66% | 85.59% | 63.04%
3 | AT | 677 | 20.62% | 84.93% | 59.70%
3 | OS | 344 | 33.68% | 95.11% | 76.11%
3 | TA | 90 | 22.81% | 88.91% | 34.18%
4 | BD | 825 | 30.23% | 85.85% | 67.93%
4 | HV | 649 | 31.41% | 85.27% | 59.75%
4 | AT | 571 | 20.54% | 84.30% | 59.24%
4 | OS | 299 | 37.13% | 91.91% | 69.08%
4 | TA | 106 | 30.50% | 87.09% | 37.79%
5 | BD | 317 | 39.13% | 90.50% | 72.58%
5 | HV | 614 | 26.34% | 86.36% | 61.56%
5 | AT | 664 | 19.44% | 83.30% | 60.63%
5 | OS | 295 | 41.53% | 92.83% | 70.77%
5 | TA | 112 | 38.23% | 89.43% | 43.97%
6 | HV | 370 | 24.91% | 83.60% | 59.72%
6 | AT | 398 | 23.54% | 82.26% | 58.25%
6 | OS | 307 | 38.53% | 91.13% | 70.58%
6 | TA | 129 | 12.97% | 83.24% | 36.94%
7 | HV | 625 | 23.24% | 82.34% | 60.17%
7 | AT | 595 | 23.94% | 82.34% | 62.74%
7 | OS | 235 | 37.73% | 92.68% | 72.96%
7 | TA | 96 | 25.65% | 90.27% | 40.23%
8 | OS | 311 | 37.91% | 91.85% | 66.79%
8 | TA | 92 | 31.07% | 89.58% | 40.32%
9 | OS | 210 | 39.61% | 92.80% | 70.22%
9 | TA | 147 | 30.99% | 87.64% | 37.79%
10 | OS | 188 | 31.30% | 92.97% | 64.34%
10 | TA | 90 | 24.12% | 86.06% | 37.89%
11 | OS | 287 | 35.98% | 87.07% | 61.05%
11 | TA | 74 | 27.42% | 90.49% | 39.98%
12 | OS | 174 | 43.66% | 91.85% | 63.87%
12 | TA | 102 | 23.66% | 86.64% | 37.53%
13 | TA | 104 | 25.17% | 85.16% | 44.78%
14 | TA | 108 | 32.12% | 92.44% | 47.76%
15 | TA | 113 | 18.34% | 88.77% | 36.13%
16 | TA | 69 | 31.05% | 85.61% | 37.87%
17 | TA | 82 | 27.67% | 87.67% | 40.31%
18 | TA | 58 | 30.93% | 89.12% | 38.11%
19 | TA | 239 | 17.54% | 78.84% | 36.22%
20 | TA | 133 | 23.32% | 83.71% | 40.41%
21 | TA | 306 | 10.30% | 70.19% | 30.49%

Supplementary Table S3. Unique imperfect SSRs presented in the nuclear genome of S. capillata.
ID | Contig number | Motif | Type | Start | End | Length
1 | contig_907 | (TTGACA)n | hexa | 214,212 | 214,373 | 164
2 | contig_1647 | (TCTGGT)n | hexa | 65,778 | 65,902 | 128
3 | contig_4419 | (TATGTC)n | hexa | 22,183 | 22,302 | 122
4 | contig_3408 | (GGAATT)n | hexa | 106,465 | 106,560 | 102
5 | contig_232 | (TTCGGGG)n | hepta | 1,513,940 | 1,514,259 | 326
6 | contig_2603 | (TAGGGTC)n | hepta | 144,295 | 144,587 | 293
7 | contig_2186 | (GGCTTAG)n | hepta | 45,772 | 46,082 | 291
8 | contig_1032 | (ACCCCGG)n | hepta | 99 | 351 | 243
9 | contig_3460 | (CCGGCCC)n | hepta | 201,077 | 201,281 | 200
10 | contig_1374 | (AGACCCT)n | hepta | 1,247 | 1,443 | 197
11 | contig_1879 | (AGGGCTC)n | hepta | 66,884 | 67,075 | 186
12 | contig_469 | (ATGGGCT)n | hepta | 376,522 | 376,703 | 182
13 | contig_65 | (CATACAA)n | hepta | 55,721 | 55,893 | 177
14 | contig_749 | (TCCGGAG)n | hepta | 206,925 | 207,093 | 166
15 | contig_1119 | (CTGCTCC)n | hepta | 210,554 | 210,719 | 162
16 | contig_111 | (TCCGACA)n | hepta | 184,518 | 184,650 | 154
17 | contig_2550 | (TTCCTCG)n | hepta | 1,601 | 1,754 | 154
18 | contig_1485 | (CAGAGCC)n | hepta | 1,262,462 | 1,262,611 | 150
19 | contig_8930 | (GTTTAAG)n | hepta | 105,811 | 105,961 | 146
20 | contig_1419 | (TGCCGGC)n | hepta | 150,723 | 150,869 | 145
21 | contig_2527 | (CGCCTGA)n | hepta | 87,059 | 87,191 | 139
22 | contig_2073 | (TATGTTA)n | hepta | 45,233 | 45,373 | 137
23 | contig_737 | (AAGCAAT)n | hepta | 6,932 | 7,071 | 134
24 | contig_3531 | (AAGGTTT)n | hepta | 109,441 | 109,576 | 133
25 | contig_8761 | (ACTAGCT)n | hepta | 2,643 | 2,773 | 132
26 | contig_3660 | (GCGTGCT)n | hepta | 174,080 | 174,180 | 128
27 | contig_779 | (TAGCACA)n | hepta | 603,552 | 603,677 | 128
28 | contig_135 | (TAACTTG)n | hepta | 326,515 | 326,635 | 125
29 | contig_1003 | (GTGGAAG)n | hepta | 5,422 | 5,544 | 119
30 | contig_3267 | (TTCTCAA)n | hepta | 43,227 | 43,335 | 115
31 | contig_2939 | (CCGGAGG)n | hepta | 69,511 | 69,627 | 112
32 | contig_836 | (CTCAACA)n | hepta | 249,536 | 249,642 | 110
33 | contig_422 | (CCCCGAG)n | hepta | 287,529 | 287,635 | 108
34 | contig_4880 | (TTAATGT)n | hepta | 38,070 | 38,177 | 106
35 | contig_1157 | (GATCTTG)n | hepta | 387,365 | 387,474 | 105
36 | contig_2842 | (GGCGGTT)n | hepta | 94,531 | 94,636 | 104
37 | contig_5593 | (TCTCTTA)n | hepta | 16,636 | 16,741 | 104
38 | contig_3011 | (GCGAGTG)n | hepta | 12,163 | 12,268 | 103
39 | contig_3225 | (TTTCATC)n | hepta | 204,660 | 204,767 | 103
40 | contig_2135 | (GCCGCCAA)n | octa | 33,373 | 33,605 | 249
41 | contig_3596 | (CCCGCCGG)n | octa | 199,857 | 200,074 | 226
42 | contig_3263 | (TTTATTAT)n | octa | 181,244 | 181,447 | 205
43 | contig_1430 | (GCATCGCC)n | octa | 46,401 | 46,529 | 137
44 | contig_2693 | (CGCCCGCT)n | octa | 418,746 | 418,850 | 119
45 | contig_1763 | (GTATGGA)n | octa | 111,986 | 112,100 | 119
46 | contig_2308 | (GTTTGTGA)n | octa | 156,752 | 156,863 | 116
47 | contig_1530 | (TTTTATCA)n | octa | 150,870 | 150,880 | 108
48 | contig_316 | (CGCGGCGC)n | octa | 326,992 | 327,093 | 105
49 | contig_3038 | (AGTTCACAC)n | nona | 162,707 | 162,927 | 215
50 | contig_5760 | (GCTATGTGA)n | nona | 103,511 | 103,659 | 144
51 | contig_2775 | (AACTGTGTG)n | nona | 55,320 | 55,440 | 118
52 | contig_700 | (CACATAGCT)n | nona | 221,391 | 221,504 | 106
53 | contig_177 | (TGTGAACTG)n | nona | 719,771 | 719,877 | 104
54 | contig_836 | (AGGTTCTGGA)n | deca | 465,813 | 465,936 | 127
55 | contig_2705 | (AACTAACCCT)n | deca | 28,052 | 28,166 | 112
56 | contig_1324 | (TCCAGAACCT)n | deca | 707,617 | 707,720 | 109
57 | contig_3610 | (GTTCCGGAAG)n | deca | 532,860 | 532,965 | 104
58 | contig_645 | (TTTTTCTGAA)n | deca | 342,606 | 342,705 | 104

Supplementary Table S4. List of samples used in the molecular analyses.
Taxon | Voucher No. (in KRA or ALTB) | Locality | Latitude | Longitude | Altitude | Date | Collector
S. capillata | 0496240 | Kyrgyzstan, central Tian Shan, ca. 47 km NNE of Chaek | N 42°4'41.29" | E 75°3'12.43" | 2387 m | 07.07.2018 | M. Nobis, E. Klichowska, A. Wróbel, A. Nowak
S. richteriana | 003756 | Kazakhstan, Moiynkum distr., between Kashkanteniz and Mynaral | N 45°30'38.88" | E 73°29'38.34" | 451 m | 22.05.2014 | M. Nobis, P. Gudkova
S. lessingiana | 003728 | Kazakhstan, Zhambyl distr., 10 km E of Targap | N 43°19'21" | E 75°55'50" | 755 m | 18.05.2014 | M. Nobis, P. Gudkova
S. heptapotamica | 003747 | Kazakhstan, Kerbulak distr., 2 km NE of Karlygash | N 44°13'40.4" | E 77°42'27.6" | 966 m | 22.05.2014 | M. Nobis, P. Gudkova
S. korshinskyi | 001767 | Russia, Altayskiy kray, Krasnoshchyokovsky distr., between Kurya and Krasnoshchekovo | N 51°36'59.88" | E 82°33'59 T | 277 m | 01.06.2016 | E. Punina

Supplementary Table S5.
Species names and GenBank accession numbers for the reference mitochondrial (mt) and chloroplast (cp) genomes used in this study.
Taxon | Type | Sequence length (bp) | GB accession number
Tripsacum dactyloides | mt | 704,100 | NC_008362.1
Hordeum vulgare | mt | 525,599 | AP017300.1
Zea mays | mt | 569,630 | NC_007982.1
Triticum aestivum | mt | 452,526 | MH051716.1
Eleusine indica | mt | 520,691 | NC_040989.1
Sorghum bicolor | mt | 468,628 | NC_008360.1
Oryza sativa | mt | 637,692 | JF281153.1
Aegilops speltoides | mt | 476,091 | NC_022666.1
Alloteropsis semialata | mt | 442,063 | MH644808.1
Lolium perenne | mt | 678,580 | JX999996.1
Saccharum officinarum, chromosome 1 | mt | 300,784 | NC_031164.1
Saccharum officinarum, chromosome 2 | mt | 144,698 | LC107875.1
Stipa arabica | cp | 137,757 | NC_037024
Stipa borysthenica | cp | 137,825 | NC_037025
Stipa capillata | cp | 137,830 | MG052598
Stipa capillata | cp | 137,835 | MG052599
Stipa caucasica | cp | 137,798 | MG052600
Stipa heptapotamica | cp | 137,829 | MH918066
Stipa hohenackeriana | cp | 137,753 | NC_037028
Stipa hymenoides | cp | 137,742 | NC_027464
Stipa jagnobica | cp | 137,827 | MG052604
Stipa lessingiana | cp | 137,829 | NC_037030
Stipa lipskyi | cp | 137,854 | KT692644
Stipa magnifica | cp | 137,848 | MG052606
Stipa narynica | cp | 137,854 | MG052607
Stipa orientalis | cp | 137,822 | NC_037033
Stipa ovczinnikovii | cp | 137,874 | NC_037034
Stipa pennata | cp | 137,825 | MG052610
Stipa purpurea | cp | 137,370 | KT983629
Stipa richteriana | cp | 137,831 | MG052612
Stipa roylei | cp | 137,606 | MT094322
Stipa tianschanica | cp | 137,847 | MG052613
Stipa × alaica | cp | 137,850 | MG052614
Stipa × brevicallosa | cp | 137,850 | MG052615
Stipa zalesskii | cp | 137,836 | MG052616

Supplementary information S1. CTAB large-scale DNA extraction protocol.

Before extraction:
1) Prepare 5 ml of fresh (no more than 1 day old) CTAB buffer:
• 100 mM Tris-HCl pH 8.0 = Add 500 µL 1 M Tris-HCl.
• 20 mM EDTA = Add 200 µL 0.5 M EDTA.
• 1.42 M NaCl = Add 415 mg NaCl.
• 55 mM CTAB = Add 100 mg CTAB.
• 0.5 mM PVP = Add 100 mg PVP.
• Add 4.3 ml water for 5.0 ml total volume.
• Directly before use, add 4 µL βME to each 2 ml of CTAB buffer.
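The amounts in the buffer recipe above follow from C1·V1 = C2·V2 for the liquid stocks and from mass = concentration × volume × molar mass for the solids. The sketch below simply re-derives them as a sanity check. The molar masses are not given in the protocol: the NaCl and CTAB values are standard figures, and the PVP value assumes PVP-40 (average about 40,000 g/mol), which is an assumption made here only because it reproduces the stated 100 mg.

```python
# Re-derive the CTAB buffer recipe for a 5 ml batch.
# Molar masses (g/mol) are NOT stated in the protocol; the PVP value
# assumes PVP-40 (~40,000 g/mol average), which is a guess.
MOLAR_MASS = {"NaCl": 58.44, "CTAB": 364.45, "PVP": 40_000.0}
FINAL_VOLUME_ML = 5.0


def stock_volume_ul(final_mm: float, stock_mm: float,
                    final_ml: float = FINAL_VOLUME_ML) -> float:
    """Volume of a liquid stock in uL, from C1*V1 = C2*V2."""
    return final_mm / stock_mm * final_ml * 1000.0


def solid_mass_mg(final_mm: float, molar_mass: float,
                  final_ml: float = FINAL_VOLUME_ML) -> float:
    """Mass of a solid in mg: (mol/L) * L * (g/mol) * 1000."""
    return (final_mm / 1000.0) * (final_ml / 1000.0) * molar_mass * 1000.0


tris_ul = stock_volume_ul(100, 1000)   # 100 mM from a 1 M stock
edta_ul = stock_volume_ul(20, 500)     # 20 mM from a 0.5 M stock
nacl_mg = solid_mass_mg(1420, MOLAR_MASS["NaCl"])
ctab_mg = solid_mass_mg(55, MOLAR_MASS["CTAB"])
pvp_mg = solid_mass_mg(0.5, MOLAR_MASS["PVP"])
# Water tops the two liquid stocks up to 5 ml (volume of solids ignored).
water_ul = FINAL_VOLUME_ML * 1000.0 - tris_ul - edta_ul

for name, value, unit in [
    ("Tris-HCl stock", tris_ul, "uL"),
    ("EDTA stock", edta_ul, "uL"),
    ("NaCl", nacl_mg, "mg"),
    ("CTAB", ctab_mg, "mg"),
    ("PVP", pvp_mg, "mg"),
    ("Water", water_ul, "uL"),
]:
    print(f"{name}: {value:.1f} {unit}")
```

Under these assumptions the computed values round to the amounts listed in the recipe (500 µL, 200 µL, 415 mg, 100 mg, 100 mg and 4.3 ml); scaling the batch only requires changing `FINAL_VOLUME_ML`.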
2) Prepare:
• 24:1 chloroform:isoamyl alcohol.
• Fresh 70% ethanol.
• Isopropanol for precipitation.
3) Place metal spatulas with dry ice to freeze. Be sure that 37°C and 65°C water baths are prepared.

Extraction steps:
1) Grind 0.2 g tissue in pre-chilled mortar and pestle under liquid nitrogen until it is a fine powder. With a cold spatula, quickly transfer the powder to a 15 ml falcon tube.
2) Add 2 ml of CTAB + βME to the tube before the tissue thaws and invert repeatedly (do not vortex).
3) Place sample in 65°C water bath and incubate for 20 minutes, inverting every 5 minutes to ensure tissue is completely homogenized and mixed with CTAB buffer.
4) Centrifuge at > 4000g for 15 minutes to pellet plant tissue, and transfer supernatant to new 15 ml tube.
5) To supernatant in new tube, add 2 ml 24:1 chloroform:isoamyl alcohol. Mix and invert continuously for 5-10 minutes.
6) Centrifuge at > 4000g for 15 minutes to create pellet. Being careful to not disrupt interface layer, transfer supernatant to new 15 ml tube.
7) Add Qiagen RNase A to the transferred supernatant to achieve 1.8% of total volume. Incubate at 37°C for 15-30 minutes (with occasional mixing).
8) After RNase digestion, add 2 ml 24:1 chloroform:isoamyl alcohol. Mix and invert continuously for 5-10 minutes.
9) Centrifuge at > 4000g for 15 minutes to create pellet. Being careful to not disrupt interface layer, transfer supernatant to a new 15 ml tube.
10) To precipitate DNA, add 2 ml of room temperature isopropanol to the surface of the supernatant and gradually rock to mix. Continue mixing gently for 5-10 minutes, at which point you should see the DNA precipitate out of solution. You can continue to precipitate DNA by placing at -20°C for 1 hour, but most of the high molecular weight DNA precipitates in the first few minutes.
11) After precipitation, centrifuge at > 4000g for 15 minutes to pellet DNA. Pour out the alcohol carefully, being sure the white pellet remains at the bottom of the tube.
12) Add 5 ml 70% ethanol, mix by inverting the tube, spin for 5 min, and pour out the alcohol.
13) Repeat the 70% ethanol wash.
14) Perform a final brief spin to collect residual ethanol. Remove this carefully with a P20 pipette.
15) Air dry the pellet for no more than 5-10 minutes on the bench top.
16) Resuspend the pellet in 200 µL of EB buffer on the bench for 1-2 hours, then overnight at 4°C.

Supplementary information S2. Software tools, versions, settings and parameters used in the study.
(1) AliView: version 1.26, default parameters;
(2) Assembly-stats: version 1.0.1, default parameters;
(3) Augustus: versions 3.2.3, 3.3.3, 3.3.4, default parameters;
(4) BEAST2: version 2.6.3, default parameters;
(5) BLAST: version 2.10.0, default parameters;
(6) BUSCO: version 4.0.6, dataset poales_odb10, default parameters;
(7) Canu: version 2.0, parameters: (minReadLength=5000 minOverlapLength=5000 genomeSize=0.01m);
(8) Circlator: version 1.5.5, default parameters;
(9) Cp-hap: default parameters;
(10) FALCON: version 0.2.5, default parameters;
(11) Flye: version 2.4 for the nuclear genome assembly, parameter: (--genome-size 600m);
(12) Flye: version 2.7.1-b1590 for the mitochondrial and chloroplast genome assemblies, parameter for the first run: (--genome-size 0.55m); parameters for the second run: (--genome-size 0.55m --trestle --min-overlap 10000);
(13) Krait: version 1.3.3, default parameters;
(14) MAFFT: version 7.471, parameters: (--ep 0 --op 12 --lexp 3 --genafpair --maxiterate 1000);
(15) Minimap2: version 2.17-r941, default parameters;
(16) Purge Haplotigs: version 1.1.1, default parameters;
(17) Qualimap: version 2.2.2, default parameters;
(18) RaGOO: version 1.1, default parameters;
(19) RepeatMasker: version 4.1.0, custom library, default parameters;
(20) RepeatModeler: version 2.0.1, default parameters;
(21) Samtools: version 1.9, parameters: (view -F 4 -q 20);
(22) SequelQC: version 1.1.0, parameters: (-k -p a);
(23) SPAdes: version 3.14.1, parameter: (-k 95);
(24) Unicycler: version 0.4.8, default parameters.

Chapter 3: Evidence for extensive hybridisation and past introgression events in feather grasses using genome-wide SNP genotyping

The chapter includes the third published article. The research reports the first large-scale study on hybridisation and introgression within five feather grass species across several hybrid zones in Russia and Central Asia. The novelty of the study lies in its use of the knowledge gained from the draft genome project (Article 2). Specifically, the research reveals the potential age of origin of the studied species as well as the time of diversification within these taxa. Additionally, the work detects past reticulation events between some of the studied species. Moreover, the article demonstrates that feather grasses may be a suitable genus for studying hybridisation and introgression events in nature.

Article 3
Title: Evidence for extensive hybridisation and past introgression events in feather grasses using genome-wide SNP genotyping
Journal: BMC Plant Biology, volume 21, Article number: 505 (2021)
2-year impact factor: 4.215
5-year impact factor: 4.960
The Ministry of Science and Higher Education of Poland: 140 points
DOI: https://doi.org/10.1186/s12870-021-03287-w

Baiakhmetov et al. BMC Plant Biology (2021) 21:505
https://doi.org/10.1186/s12870-021-03287-w
RESEARCH Open Access

Evidence for extensive hybridisation and past introgression events in feather grasses using genome-wide SNP genotyping
Evgenii Baiakhmetov1,2*, Daria Ryzhakova2,3, Polina D. Gudkova2,3 and Marcin Nobis1,2*

Abstract
Background: The proper identification of feather grasses in nature is often limited due to phenotypic variability and high morphological similarity between many species. Among plausible factors influencing this issue are hybridisation and introgression recently detected in the genus.
Nonetheless, to date, only a limited set of taxa has been investigated using integrative taxonomy combining morphological and molecular data. Here, we report the first large-scale study on five feather grass species across several hybrid zones in Russia and Central Asia. In total, 302 specimens were sampled in the field and classified based on the current descriptions of these taxa. They were then genotyped with high-density genome-wide markers and measured based on a set of morphological characters to delimit species and assess levels of hybridisation and introgression. Moreover, we tested species for past introgression and estimated divergence times between them.

Results: Our findings demonstrated that 250 specimens represent five distinct species: S. baicalensis, S. capillata, S. glareosa, S. grandis and S. krylovii. The remaining 52 individuals provided evidence for extensive hybridisation between S. capillata and S. baicalensis, S. capillata and S. krylovii, S. baicalensis and S. krylovii, as well as, to a lesser extent, between S. grandis and S. krylovii, and S. grandis and S. baicalensis. We detected past reticulation events between S. baicalensis, S. krylovii and S. grandis, and inferred that diversification within the species S. capillata, S. baicalensis, S. krylovii and S. grandis started ca. 130-96 kya. In addition, the assessment of genetic population structure revealed signs of contemporary gene flow between populations across species from the section Leiostipa, despite significant geographical distances between some of them. Lastly, we concluded that only 5 out of 52 hybrid taxa were properly identified solely based on morphology.

Conclusions: Our results support the hypothesis that hybridisation is an important mechanism driving evolution in Stipa. As an outcome, this phenomenon complicates identification of hybrid taxa in the field using morphological characters alone.
Thus, integrative taxonomy seems to be the only reliable way to properly resolve the phylogenetic issue of Stipa. Moreover, we believe that feather grasses may be a suitable genus to study hybridisation and introgression events in nature.

Keywords: Feather grasses, Hybridisation, Introgression, Integrative taxonomy, Genome-wide genotyping, DArTseq, Divergence-time estimation, Population structure

*Correspondence: evgeniibaiakhmetov@doctoral.uj.edu.pl; m.nobis@uj.edu
Research laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave., 634050 Tomsk, Russia
Full list of author information is available at the end of the article

© The Author(s) 2021, corrected publication 202 Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Background

The proper delimitation of species plays an important role in taxonomy as well as in studies related to speciation, biogeography and ecology, leading to effective conservation and management of biodiversity. In the last two decades, traditional approaches relying mostly on morphological features have been supplemented by molecular data that boosted the discovery of new species. Although one estimate of the number of plant species is around 298,000 [1], recently it has been shown that the plant kingdom comprises at least 374,000 taxa [2]. Nowadays, many systematists emphasise the need to apply multidisciplinary data, so-called integrative approaches or integrative taxonomy [3]. For instance, information from a variety of disciplines, e.g., morphology, biochemistry, cytogenetics and 'omics studies, increases the reliability and validity of taxon identification [4-6]. To date, among the molecular methods, DNA barcoding has been a widely utilised tool to identify taxa at different levels, aiming not only to facilitate revisionary taxonomy, but also to broaden our understanding of molecular phylogenetics and population-level variations [7-9]. Among standard plant markers, the chloroplast regions rbcL and matK and the nuclear internal transcribed spacer (ITS) locus have been proposed for DNA barcoding of land plants [10, 11]. Additionally, several non-coding plastid regions have been suggested as supplementary loci where further resolution is required [10].

Stipa is one of the largest genera in the grass subfamily Pooideae (Poaceae), currently comprising nearly 150 cool-season species with C3 photosynthesis common in Eurasia and North Africa [12, 13]. Based on ITS and the plastid trnK region, the genus has been proven to be monophyletic [14, 15]. Nonetheless, the traditional barcodes are not able to validate the sectional subdivision in Stipa proposed, e.g., by Tzvelev [16, 17] or Freitag [18].
Recently, the nuclear intergenic spacer (IGS) region [19] and marker sets derived from whole chloroplast genomes of 19 species [20] were proposed for phylogenetic studies of feather grasses. Although these markers are more phylogenetically informative than the previously used barcodes, they are still unable to discriminate all taxa, causing unresolved nodes in the reconstructed trees [19, 20].

One of the plausible explanations for this unresolved branching in Stipa is that many feather grasses are of hybrid origin [16, 21-23]. Presently, hybridisation is considered to be widespread among at least 25% of plant taxa, mostly the youngest species [24]. This phenomenon is often accompanied by introgression via repeated back-crossing to one or both parental species, which can lead to diversification and speciation [25, 26]. In grasses, hybrid speciation has been explicitly studied at the genome level, e.g., in Triticum [27] and Brachypodium [28]. Nonetheless, many hybrids and introgressed individuals were previously characterised exclusively based on morphology, which limited their successful identification in nature [29, 30]. In addition, hybridisation may produce not only organisms with traits intermediate between the parental species but also extreme, or transgressive, phenotypes [31] that complicate their proper taxonomic treatment. In feather grasses, the hypothesis of hybrid origin of some species was initially tested using multivariate morphological analyses [23, 32] and, more recently, by applying molecular markers among genetically closely related species in the Stipa heptapotamica complex [33] as well as between two genetically distant species, S. krylovii and S. bungeana [34]. Furthermore, integrative taxonomic approaches have shown that some Stipa taxa, previously assigned to S. richteriana and S. grandis, are cryptic species [33, 35]. Thus, taking into account that ca.
30% of feather grasses could be of hybrid origin [13] and that cryptic species are present, integrative taxonomy seems to be the only reliable way to properly resolve the phylogenetic issue of Stipa.

Importantly, the advent of next generation sequencing technologies, primarily Illumina, and the continuously decreasing sequencing cost have facilitated the implementation of genomic data in studies of non-model organisms. For instance, high-throughput techniques based on restriction enzymes, e.g., RADseq [36] and genotyping-by-sequencing [37], which were first used mainly in agricultural species [38], are currently widely applied in phylogenetics and studies related to hybridisation in many wild plant genera with little or no previous genomic information [39-41]. Recently, a promising result has also been demonstrated in Stipa, where the use of the DArTseq technique yielded a number of markers several hundred-fold higher than in the previous genomic studies [34].

During field studies on steppe communities, we observed high morphological variability in populations of genetically related plants. In particular, we noticed that some specimens of feather grasses representing S. capillata, S. grandis and S. krylovii seemingly share mixed morphological characteristics of these species, while the taxon S. baicalensis is frequently observed within populations of the aforesaid taxa and resembles an intermediate phenotype between S. grandis and S. krylovii or S. capillata and S. krylovii. The above-mentioned species have wide distribution ranges (Fig. 1 and Supplementary Table S1). Specifically, S. capillata is the most widespread taxon within the genus; it grows on dry grasslands, is common in Siberia and Western Asia, and is also present in a limited number of refugia in Europe and North Africa. Two species, S. baicalensis and S.
grandis, share similar ranges in southwestern Siberia, in the Baikal region, in the southern part of Zabaykalsky Krai, in Mongolia and in northeastern China; S. baicalensis is also present in the southern part of the Russian Far East, whereas S. grandis occurs in Central China and Tibet. Finally, S. krylovii grows in Siberia, Mongolia, China, Northern Nepal, Southern Tajikistan, Eastern Kazakhstan and Eastern Kyrgyzstan [13, 42].

We hypothesise that the observed variability in S. baicalensis, S. capillata, S. grandis and S. krylovii is due to the presence of interspecific hybrids, which may lead to species misidentification based on the current descriptions of these taxa [16, 43-45]. Thus, in the current study, we aim to use an integrative taxonomy approach to (1) delimit species and test whether S. baicalensis is a hybrid S. grandis × S. krylovii or S. capillata × S. krylovii; (2) assess levels of hybridisation and introgression (if present) between the examined taxa and populations at the molecular level; (3) estimate divergence times between the studied taxa; (4) obtain insight into the extent of hybridisation between these species at the morphological level and (5) assess whether morphological characters can be used to identify hybrids in the field.

Fig. 1 The general distribution map of (a) S. baicalensis (yellow), S. capillata (red), S. grandis (green), S. krylovii (blue) and sampling locations (b) in East Kazakhstan and southwestern Siberia (Russia), (c) in southeastern Siberia and (d) in Eastern Kyrgyzstan. The dashed lines indicate hypothetical borders. The coloured circles depict species found in the numbered locations. The exact coordinates of the locations are presented in Supplementary Table S1
Results

DNA-based species delimitation

The DArTseq technique was applied to obtain a total of 8660 SNP markers to infer the genetic structure of 302 Stipa specimens. Firstly, analyses of genetic clustering with an unweighted pair group method using arithmetic average (UPGMA) and fastSTRUCTURE revealed five major clades corresponding to the morphospecies S. glareosa, S. capillata, S. grandis, S. krylovii and S. baicalensis (Fig. 2). According to the fastSTRUCTURE analysis, the first and the fourth clades consisted exclusively of pure specimens of S. glareosa and S. krylovii, respectively. The remaining clades included hybrid individuals besides pure specimens of S. capillata, S. grandis and S. baicalensis. In particular, the second clade comprised pure specimens of S. capillata and the admixed individuals S. capillata × S. baicalensis and S. capillata × S. krylovii. The third clade consisted of pure specimens of S. grandis and the hybrids S. grandis × S. krylovii and S. grandis × S. baicalensis. The fifth clade included pure specimens of S. baicalensis and the admixed individuals S. baicalensis × S. krylovii. In total, fastSTRUCTURE inferred 52 individuals with an admixture of two genetic clusters, with the exception of one S. capillata × S. baicalensis specimen (0454631) that also had a minor proportion (0.02) of a third cluster representing S. krylovii. Among pure individuals, only one specimen of S. krylovii (0454646) had a negligible admixed proportion (0.03) of S. grandis. Notably, the vast majority of admixed individuals (49, or 94%) had a proportion of membership in the range from 0.46 to 0.54, indicating F1 hybrids or later-generation hybrids without back-crossing to the parental species. The remaining admixed samples were as follows: (1) one individual (0477009) was formed by a 0.78-0.22 admixture between S. baicalensis and S. krylovii, evidencing a first-generation backcross (F1 × S. baicalensis), (2) one individual (000948) was shared between S. grandis (0.88) and S.
krylovii (0.12), indicating a second-generation backcross (first-generation backcross × S. grandis), (3) one individual (000956) was admixed between S. grandis (0.64) and S. krylovii (0.36), which may suggest a first-generation backcross (F1 × S. grandis) or a more complex backcross to S. grandis via different intermediate combinations.

A consistent result was also found with a principal coordinates analysis (PCoA). The first three axes explained 29.6, 19.9 and 19.2% of the total genetic divergence within the studied taxa, respectively. According to the PCoA, pure individuals were grouped into five markedly differentiated groups corresponding to their taxonomic classifications (Fig. 3; an interactive version of the three-dimensional plot can be accessed via https://plot.ly/~eugenebayahmetov/40/). The F1 hybrids had intermediate positions between their parental species. Two hybrid individuals S. grandis × S. krylovii (000948 and 000956) were grouped closer to S. grandis, reflecting the higher proportion of membership in the first taxon established earlier with fastSTRUCTURE. Similarly, an admixed individual S. baicalensis × S. krylovii (0477009) with proportions of 0.78 and 0.22 was closer to S. baicalensis.

Fig. 2 The UPGMA dendrogram (at the top) aligned with the best supported fastSTRUCTURE model K=5 (at the bottom). The genetic distance was calculated using the Jaccard Similarity Coefficient (y-axis, top). Individuals are represented by coloured bars according to the proportion of membership (y-axis, bottom) of a genotype to the respective cluster

Hybrid generation identification

The NewHybrids analysis revealed a more complex pattern of hybridisation than was inferred with fastSTRUCTURE. Among 16 admixed specimens of S. baicalensis × S.
krylovii, previously assigned as F1, six individuals with posterior probabilities (PP) in the range of 0.84-1.00 were identified as F2 (F1 × F1) hybrids, suggesting that F1 hybrids are able to reproduce further. One specimen, S. baicalensis × S. krylovii (0477009), was shown to be a first-generation backcross (F1 × S. baicalensis) with a PP of 0.81. In addition, five mixed individuals had PP split between two categories (F1 and F2 hybrids) in a range of 0.22-0.78, suggesting uncertainty in the assignment (Fig. 4a and Supplementary Table S2). These mixed-assignment individuals may represent a more advanced hybrid generation than can be detected by NewHybrids. Within 14 hybrids of S. capillata × S. krylovii, previously assigned to F1 hybrids, we detected six F1 individuals (PP of 0.87-1.00), and one specimen was identified as an F2 hybrid with a PP of 0.83. The remaining seven individuals had mixed assignments in a range of 0.39-0.73 for the F1 class and 0.27-0.61 for the F2 class, respectively (Fig. 4b). The analysis also demonstrated that 13 out of 14 hybrids of S. capillata × S. baicalensis were F1 (PP of 0.86-1.00), and one individual remained unclassified, sharing PP (0.54 and 0.46) between the F1 and F2 categories (Fig. 4c). Among S. grandis × S. krylovii, only one F1 hybrid was detected (PP of 0.91), two individuals were assigned to the F2 class (PP of 0.83 and 1.00) and two specimens were classified as first-generation backcrosses (F1 × S. grandis) with PP of 0.88 and 0.99. Although specimen 000948 was inferred to be a first-generation backcross, it more plausibly represents a second-generation backcross, as established by fastSTRUCTURE, because NewHybrids is unable to detect backcrosses more advanced than the first generation. Additionally, one individual had mixed assignments between the F1 (PP of 0.27) and F2 (PP of 0.72) classes (Fig. 4d). Finally, the only hybrid detected for S. grandis × S. baicalensis appeared to be a first-generation hybrid with a PP of 1.00 (Fig. 4e).

Fig. 3 The PCoA plot based on genetic distances between samples. a The plot of the two principal axes. b The plot of the three principal axes. The pie charts represent the proportions of membership established by fastSTRUCTURE for the best K=5

Fig. 4 The assignment of Stipa taxa into four hybrid classes according to the posterior probabilities (y-axis) inferred in NewHybrids. (a) S. baicalensis × S. krylovii, (b) S. capillata × S. krylovii, (c) S. capillata × S. baicalensis, (d) S. grandis × S. krylovii, (e) S. grandis × S. baicalensis. Hybrid classes are coloured black (F1 hybrid), grey (F2), cyan (backcross to the first parental species, BC to parent 1) and pink (backcross to the second parental species, BC to parent 2)

Testing for introgression

A total of 6894 SNP markers were used to test for reticulation events between the studied taxa. Because fastSTRUCTURE and NewHybrids detected five different admixed combinations, we tested all possible four-species combinations regardless of their phylogenetic positions (Fig. 2). The results of the f4 statistic suggest no gene flow between S. capillata and the remaining species, because of negligible deviations from the expected 50/50 ratio of BABA/ABBA patterns and the lowest absolute Z-scores of any tests (Table 1). This finding disagrees with the presence of contemporary hybrids S. capillata × S. krylovii and S. capillata × S. baicalensis inferred with fastSTRUCTURE and NewHybrids. Nonetheless, it can be explained by the fact that all identified admixed individuals were excluded from this analysis. On the other hand, introgression events were suggested between S. grandis and S. baicalensis (combinations 5 and 11), S. grandis and S. krylovii (combinations 4, 6, 7 and 8), and S. krylovii and S. baicalensis (combinations 9 and 12). Additionally, when S. grandis, S. krylovii and S. baicalensis were analysed together (combinations 4, 7 and 10), the numbers of BABA and ABBA patterns were either almost equal (combination 10) or less skewed (combinations 4 and 7) than in the other tests that indicated gene flow among these species. One potential explanation is that these species are involved in introgression at the same rate, so that the signals theoretically cancel each other out.

Table 1 Test for introgression between the studied species using 6894 SNPs

No  B               C               D               nBABA  nABBA  f4         Z-score
1   S. capillata    S. grandis      S. krylovii     11     11     -0.000094  -0.187
2   S. capillata    S. grandis      S. baicalensis  15     17     -0.000284  -0.494
3   S. capillata    S. krylovii     S. baicalensis  13     15     -0.000190  -0.293
4   S. grandis      S. krylovii     S. baicalensis  23     64     -0.006000  -4.570
5   S. grandis      S. capillata    S. baicalensis  130    17     0.016400   10.000
6   S. grandis      S. krylovii     S. capillata    11     166    -0.022400  -9.090
7   S. krylovii     S. baicalensis  S. grandis      64     30     0.004870   3.530
8   S. krylovii     S. capillata    S. grandis      166    11     0.022500   8.910
9   S. krylovii     S. baicalensis  S. capillata    15     136    -0.017600  -10.200
10  S. baicalensis  S. krylovii     S. grandis      23     30     -0.001130  -1.260
11  S. baicalensis  S. capillata    S. grandis      130    15     0.016700   10.400
12  S. baicalensis  S. krylovii     S. capillata    13     136    -0.017800  -10.800

Outgroup (A) for all tests was S. glareosa; nBABA, number of BABA patterns; nABBA, number of ABBA patterns. Standard error in all tests was <0.01. Negative f4 and Z-score < -3 indicate gene flow between B and C; positive f4 and Z-score > 3 suggest reticulation events between B and D.

Population differentiation

A total of 3483 SNP markers were used to investigate the genetic differentiation in populations of S. baicalensis, 6288 SNPs in S. capillata, 4635 SNPs in S. grandis and 6912 SNPs in S. krylovii (Supplementary Fig. S1). The pairwise Fst values demonstrated strong differentiation among four populations of S.
baicalensis, while the results of STRUCTURE and PCoA revealed two and three genetic clusters, respectively (Fig. 5a and Supplementary Fig. S2), where populations 1 and 4 are merged (Fst of 0.32, Supplementary Table S3) even though the distance between them is more than 1000 km. Additionally, the second most likely K according to STRUCTURE was K = 3, indicating that this number of clusters is also a likely option. Relatively strong differentiation was also shown for populations of S. capillata, with the exception of populations 5 and 6, which had a moderate Fst value of 0.13, suggesting potential gene flow. According to the PCoA, almost all individuals were grouped together, excluding population 3 and a few specimens of populations 2 and 5. Nonetheless, the first two axes of the PCoA explained only 18% of the total genetic divergence within the specimens. On the other hand, STRUCTURE supported K = 4 as the best fitting model, while two and nine clusters were also among the probable options (Fig. 5b and Supplementary Fig. S2).

Fig. 5 PCoA plots, best supported STRUCTURE models and localities of the studied populations across four species. a S. baicalensis. b S. capillata. c S. grandis. d S. krylovii

Within S. grandis, evidence for weak differentiation was shown for the geographically close populations 5 and 6 (Fst of 0.10) as well as for 7 and 8 (Fst of 0.08), while the first two axes of the PCoA explained 23.3% of the variation and revealed a close genetic relationship between the geographically distant populations 1 and 2 (Fst of 0.42). In addition, based on the second axis of the PCoA, populations 3 and 4 were distant from each other as well as from the remaining populations (Fig. 5c). Further, the STRUCTURE analysis suggested that K = 2 was the most probable number of separate clusters within S. grandis (Supplementary Fig.
S2), where one pure cluster represented individuals from populations 7 and 8, while 9 out of 13 specimens of populations 3 and 4 constituted the second pure group. The rest of the S. grandis specimens were admixed between these pure clusters.

Lastly, based on the PCoA, individuals of S. krylovii were clustered into two main groups representing population 8 from Kyrgyzstan and the remaining populations from Russia (Fig. 5d). Although a few specimens of population 1 were genetically more distant from the other individuals, the second axis of the PCoA explained only 5.4% of the total genetic divergence. The pairwise Fst values also supported the division within S. krylovii into two groups, indicating strong differentiation of population 8 (Fst in a range of 0.37-0.46) and moderate or near-moderate differentiation among the remaining populations (Fst in a range of 0.06-0.19). Similarly, the STRUCTURE analysis revealed two genetic clusters (Supplementary Fig. S2) that best described the population structure: the first one represented population 8 from Kyrgyzstan, while the second cluster comprised populations from Russia.

Divergence-time estimation

The SNAPP phylogeny based on 2717 SNP markers among pure individuals revealed largely the same topology as the UPGMA dendrogram, except for the pair S. grandis and S. krylovii, which were grouped together, while S. baicalensis was a sister taxon (Fig. 6 and Supplementary Fig. S3). The result suggests that the potential split between S. capillata and three species, namely S. grandis, S. krylovii and S. baicalensis, took place approximately 1.07 Mya with a 95% Highest Posterior Density interval (HPD) of 1.51-0.71 Mya. The most recent common ancestor of S. grandis, S. krylovii and S. baicalensis was inferred to be 0.79 Mya (95% HPD: 1.12-0.53 Mya), whereas the lowest divergence time of 0.73 Mya was registered for S. grandis and S. krylovii (95% HPD: 1.02-0.48 Mya). The chronogram also indicates that diversification within S.
capillata, S. krylovii and S. baicalensis started at relatively the same time, ca. 130-114 kya (95% HPD: 181-74 kya), while S. grandis was established to be the youngest species (96 kya, 95% HPD: 137-63 kya).

Fig. 6 Phylogeny and divergence date estimates inferred by SNAPP. Blue coloured trees represent the most probable topology. Numbers at each node represent mean ages of divergence time estimates and the 95% HPD intervals (in the brackets). The black rectangles on the nodes indicate the 95% HPD intervals of the estimated posterior distributions of the divergence times. The red circle indicates the presumed divergence time split set as a reference. The Bayesian posterior probabilities were 1.00 for the nodes with the shown 95% HPD intervals. The scale shows divergence time in Mya

Lastly, although the topologies within the species had large uncertainty, some nodes had comparatively high Bayesian posterior probabilities (BPP > 0.80; Supplementary Fig. S3). In particular, individuals of S. capillata from Kyrgyzstan started to differentiate 97 kya (95% HPD: 137-58 kya), while the well-supported split (BPP of 0.92) between specimens from localities 5 (Russia, the Republic of Khakassia) and 19 (Russia, the Republic of Buryatia) took place 86 kya (95% HPD: 122-53 kya). The most recent common ancestor of individuals of S. krylovii from Kyrgyzstan was inferred to be 62 kya (95% HPD: 94-35 kya), whereas specimens from localities 22 (Mongolia) and 24 (Russia, Zabaykalsky Krai) had a potential split 68 kya (95% HPD: 99-40 kya). Interestingly, specimens of S. baicalensis from localities 11 and 13 (both from Russia, the west shore of Lake Baikal) and individuals from the remaining localities had nearly the same divergence times of 97 kya (95% HPD: 140-60 kya) and 93 kya (95% HPD: 132-58 kya), respectively. Additionally, the potential split between populations of S.
grandis from the Republic of Khakassia (locality 5) and the Republic of Buryatia (locality 18) took place 65 kya (95% HPD: 91-39 kya), while the well-supported split (BPP of 0.95) between specimens from localities 23 and 24 (both from Russia, Zabaykalsky Krai) took place 58 kya (95% HPD: 86-34 kya).

Morpho-molecular analysis

As non-parametric Spearman correlation coefficients did not demonstrate any strong correlation (> 0.90) between the measured variables (Supplementary Fig. S4), we retained all morphological characters (Table 3) for a factor analysis of mixed data (FAMD) and analyses of notch plots. Subsequently, to investigate whether the observed phenotypes are congruent with the molecular data, we supplemented the result of the FAMD analysis with the genetic clusters inferred by fastSTRUCTURE for the best K = 5. As a result, the FAMD revealed five markedly differentiated groups (Fig. 7) of operational taxonomic units (OTUs) in accordance with the clusters detected using the SNP data (Figs. 2, 3 and 4). The first four dimensions explained 31.0, 18.9, 10.9 and 8.1% of the total variability, respectively. The first three axes are mainly shaped by the contributions of 11 quantitative and four qualitative variables (Supplementary Table S4). Importantly, with the genetic clusters assigned by fastSTRUCTURE overlaid, the two-dimensional plot revealed a slight overlap of OTUs belonging to the pure species S. baicalensis and S. krylovii, whereas OTUs representing the admixed specimens S. baicalensis × S. krylovii were present in both clouds of the parental taxa (Fig. 7a). Furthermore, hybrids S. capillata × S. baicalensis were also present in both clouds of the pure species. On the other hand, all the admixed individuals S. capillata × S. krylovii were grouped together with only one parental species, S. capillata. Interestingly, hybrids between S. grandis and S. krylovii were mainly grouped together with OTUs of S. baicalensis, with the exception of the first- and the second-generation backcrosses. The only hybrid detected between S. grandis and S. baicalensis was clustered together with the pure individuals of the former taxon. A clearer separation of the pure species can be seen in the three-dimensional plot, where differences between the studied species are explained by the third principal axis (Fig. 7b; an interactive version of the plot is available at https://plot.ly/~eugenebayahmetov/42/). Additionally, all combinations of two-axis plots for the four dimensions are presented in Supplementary Fig. S5.

Fig. 7 The factor analysis of mixed data performed on 17 quantitative and six qualitative characters of the five examined species of Stipa. a Plot of the two principal axes. b Plot of the three principal axes. The pie charts represent the proportions of membership established by fastSTRUCTURE for the best K=5

The results of the FAMD and notch plots of the quantitative variables demonstrated that the studied species can be differentiated morphologically mainly based on the length of the lower segment of the awn (Col1L), the distance from the end of the dorsal line of hairs to the top of the lemma (DDL), the length of the anthecium (AL), the length of the middle segment of the awn (Col2L), the length of the callus (CL), the length of ligules of the internal vegetative shoots (LigIV), the length of ligules of the middle cauline leaves (LigC) and the length of hairs on the top of the lemma (LHTA). In addition, the length of hairs on the lower segment of the awn (HLCol1), the length of hairs on the middle segment of the awn (HLCol2) and the length of the callus base (CBL) can help to distinguish S. glareosa from the remaining taxa (Supplementary Fig. S6).
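The notch comparison used for the quantitative characters can be illustrated with a short sketch. This is not the code used in the study: it assumes the conventional notch definition (median ± 1.57 × IQR / √n, while some software uses a constant of 1.58), and the function names and measurement values are hypothetical and serve only to show the principle that non-overlapping notches give evidence of differing medians.

```python
import math
import statistics


def notch_interval(values, constant=1.57):
    """Return (lower, median, upper) of the notch around the median.

    Assumption: notch half-width = constant * IQR / sqrt(n), the usual
    rule of thumb for an approximate 95% interval around a median.
    """
    if len(values) < 4:
        raise ValueError("need at least four measurements")
    # Quartiles computed with the inclusive method of the standard library.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = constant * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median, median + half_width


def notches_overlap(sample_a, sample_b):
    """True if the notch intervals of the two samples overlap."""
    low_a, _, high_a = notch_interval(sample_a)
    low_b, _, high_b = notch_interval(sample_b)
    return low_a <= high_b and low_b <= high_a


if __name__ == "__main__":
    # Illustrative values only (not measurements from the study).
    taxon_a = [20.1, 21.4, 22.0, 22.3, 23.5, 24.0, 24.2, 25.1]
    taxon_b = [30.2, 31.0, 31.8, 32.5, 33.1, 33.9, 34.4, 35.0]
    print(notch_interval(taxon_a))
    print(notches_overlap(taxon_a, taxon_b))
```

Overlapping notches do not prove that two medians are equal; they only mean that the data give no strong evidence of a difference for that character.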
For instance, the notch plot of Col1L showed significant differences between means and strong evidence of differing medians among the pure species, except for the pair S. baicalensis and S. capillata, while the hybrid individuals mostly had intermediate positions between the parental species, except for specimens of S. capillata x S. baicalensis, which were significantly different from the samples of the pure taxa. The differences across all quantitative variables between individuals of the pure species and the hybrids can be better evaluated in the interactive box plots presented in Supplementary File 1. Among the qualitative variables, the main contributions to the axes of the FAMD came from the abaxial surface of vegetative leaves (AbSVL), the type of hairs on the top of the anthecium (HTTA), the type of the awn geniculation (AG) and the presence of hairs below nodes (PHBN) (Supplementary Table S4). For instance, vegetative leaves with prickles were common in S. glareosa (all samples), S. capillata (61 out of 66 samples), S. capillata x S. krylovii (12 out of 14 samples) and S. capillata x S. baicalensis (10 out of 14 samples), less frequent in S. grandis (2 out of 54 samples), S. krylovii (5 out of 81 samples) and S. baicalensis x S. krylovii (2 out of 14 samples), and totally absent in S. baicalensis, S. grandis x S. krylovii and S. grandis x S. baicalensis (Supplementary Fig. S7).

Thus, based on the results of the molecular analyses and the FAMD combining both phenotypic and SNP data, we were able to differentiate the pure species and the hybrid individuals at the morphological level. We established that, using the traditional identification keys [16, 43-45], 71 out of 302 specimens had been misidentified, mostly due to their hybrid nature (Supplementary Table S5). In particular, 47 samples previously identified as pure species appeared to be hybrids. The remaining 24 specimens had earlier been classified either as hybrids (15 samples) or misleadingly assigned to S.
baicalensis (9 samples). Interestingly, in the latter case, all individuals were previously reported from the northeastern part of Kazakhstan [46]. In general, S. baicalensis was the most problematic species for taxonomic identification, comprising 54 doubtful samples. Specifically, the above-mentioned specimens from Kazakhstan were genetically proven to be pure S. capillata, 10 specimens appeared to be hybrids S. capillata x S. baicalensis, 8 were S. baicalensis x S. krylovii and 10 were hybrids between S. capillata and S. krylovii. On the other hand, specimens previously identified as hybrids S. baicalensis x S. krylovii (3 samples) and S. grandis x S. krylovii (2 samples) were genetically assigned to the pure species S. baicalensis. Conversely, specimens morphologically identified as S. krylovii (8 samples), S. grandis (1 sample) and S. capillata (1 sample) appeared to be hybrids S. baicalensis x S. krylovii (6 samples) and S. capillata x S. baicalensis (2 samples), S. grandis x S. baicalensis and S. capillata x S. baicalensis, respectively. Additionally, one specimen identified as S. capillata x S. grandis was proven to be S. capillata x S. baicalensis. Lastly, one individual identified as S. capillata x S. baicalensis was genetically verified to be the pure species S. capillata. The remaining doubtful samples were morphologically misassigned either as pure species or as taxa of hybrid nature. For instance, 4 specimens of S. capillata were hybrids between S. capillata and S. krylovii, while specimens of S. grandis (2 samples) and S. krylovii (2 samples) were shown to be genetically admixed as S. grandis x S. krylovii. In contrast, 9 individuals identified as S. capillata x S. krylovii appeared to be the pure species S. capillata. Interestingly, only 5 out of 52 hybrid taxa were properly identified based on morphology, including S. baicalensis x S. krylovii (3 samples) and S. grandis x S. krylovii (2 samples). The only taxon without any questionable individuals was S.
glareosa representing the section Smirnovia.

Discussion

The current understanding of taxonomy and species limits in Stipa is still largely based on morphological characters. Our study highlights the necessity of using molecular tools to properly identify taxa and detect processes underlying speciation. This is of particular relevance in hybrid zones, where ongoing hybridisation and introgression may give admixed individuals phenotypes similar to one of the parental species, complicating the identification of such taxa using morphological characters alone. Furthermore, integrative studies suggest that apparently intermediate phenotypes between two species are not necessarily hybrids [47]. On the other hand, as indicated in the present work, although some interspecific hybrids have intermediate characters between parental taxa, their phenotypic traits can also overlap with non-parental species, leading to misidentification. Given the circumstances, here we utilised an integrative approach combining genome-wide data and morphology to delimitate species and ascertain the extent of hybridisation in feather grasses.

The study clearly illustrates that a molecular-based analysis, such as fastSTRUCTURE, combined with a factor analysis of mixed data utilising both quantitative and qualitative variables, can largely resolve the problem of species identification in the face of ongoing hybridisation and introgression. Primarily, such an approach makes it possible to visualise and easily trace whether the observed phenotypes are congruent with molecular data. Besides that, this approach may aid in selecting a set of traits that can be useful for species identification in the field. Our findings clearly demonstrate that the studied individuals can be clustered into five species groups representing S. capillata, S. baicalensis, S. glareosa, S. grandis and S. krylovii. Thus, here we found no evidence that S.
baicalensis is of hybrid origin from S. grandis x S. krylovii or S. capillata x S. krylovii; it is instead a genetically distinct species. The general branching of the phylogenetic trees is in good agreement with the current taxonomic classification. In particular, a representative of the section Smirnovia, S. glareosa, was genetically distant from the remaining species of the section Leiostipa. Nevertheless, our result contradicts previous research, where S. capillata, S. krylovii and S. baicalensis represented one clade and S. grandis was a sister taxon [19], while here such a sister taxon was S. capillata. The current result is likely more accurate because it relies on several thousand SNPs across the genome, in comparison to only one nuclear locus in the above-mentioned study. Additionally, we demonstrated that the potential split between S. capillata and the remaining representatives of Leiostipa took place approximately 1.07 Mya, which is similar to our previous estimate of 1.73 Mya based on the nucleolar organising regions [48]. Here, we also reported for the first time that diversification within the species S. capillata, S. baicalensis, S. krylovii and S. grandis started ca. 130-96 kya (95% HPD: 181-63 kya). These ages may correspond to the potential window of time between the Last Interglacial period, which began around 130 kya, and the Last Glacial Period (LGP), which started about 110 kya. Thus, the observed pattern is similar to dispersal events reported for different taxa across the plant kingdom [49, 50], suggesting climatic changes as a feasible factor in the current distribution of feather grasses. Of note, because the divergence times were inferred in SNAPP, which uses the multi-species coalescent model and ignores possible introgression, the confidence may be exaggerated.
On the other hand, although introgression does cause biased errors in coalescent-based species tree inference [51-53], it should not affect the estimates in the present study, since all admixed individuals were excluded from the analysis.

Importantly, the results of the molecular analyses were congruent and provided evidence for hybridisation in the pairs S. capillata x S. baicalensis, S. capillata x S. krylovii, S. grandis x S. krylovii, S. grandis x S. baicalensis and S. baicalensis x S. krylovii. The presence of F2 or more advanced hybrid generations suggests that F1 individuals are able to reproduce further. This observation is in agreement with our previous findings on hybridisation within the genus [32, 33], where a direct approach proved that the hybrid taxon S. heptapotamica produces fertile pollen grains and is capable of backcrossing, primarily to one parental species [33]. Indeed, here we detected backcrosses of S. baicalensis x S. krylovii and S. grandis x S. krylovii to their former parental species. Moreover, the analysis of BABA/ABBA patterns among species revealed signs of past introgression between S. baicalensis and S. krylovii, S. baicalensis and S. grandis, and S. krylovii and S. grandis. Taking into account the diversification times of these species, we can hypothesise that, if such gene flow occurred, it was relatively recent in evolutionary terms and is seemingly still present between S. baicalensis and S. krylovii, as well as in the pair S. grandis and S. krylovii. Nevertheless, we treat our BABA/ABBA analysis with caution, because the test was originally applied in human studies, where the genome is available at the chromosome level and the number of SNPs was remarkably higher than that used here. Thus, we intend to reassess the past gene flow within these taxa once a more contiguous genome assembly is obtained. Additionally, although here we did not detect any backcrosses between S. baicalensis and S.
grandis, it cannot be excluded that such a combination exists in nature, especially since both species are most common in Mongolia and China; however, specimens from these regions were not represented in the study. Lastly, our study revealed only unidirectional backcrossing of hybrid taxa either to S. baicalensis or S. grandis, but not to S. krylovii. Similarly, due to the relatively small sample size and the limited number of localities used here, we cannot draw any reliable conclusions concerning possible barriers to gene flow to the latter taxon.

According to our results, the population expansion seemingly started during the LGP. Nonetheless, the assessment of genetic structures revealed signs of contemporary gene flow between populations across all species, despite significant geographical distances between some of them. For instance, populations of S. baicalensis, S. capillata and S. grandis from the eastern part of Khakassia and the southwestern area of Buryatia represented either one genetic cluster or were admixed between two. Potential explanations include shifts in distribution in response to climate change, seasonal migration of wild animals and livestock grazing. While we currently do not possess enough data to verify the first assumption, seeds of feather grasses are usually spread naturally by wind or water. On the other hand, they can also be frequently dispersed in the wool of mammals including, e.g., sheep, goats and horses [16, 18]. Although this study was not intended to explore population differences in detail, we believe that our findings regarding gene flow merit further study in order to better understand the intraspecific variation and relationships among populations, as well as to discuss the potential consequences of such events.

Our results also illustrated a complex association between species at the morphological level.
There are usually few phenotypic characters differentiating hybridising species, and these characters are often functionally or developmentally correlated [54]. In the present case, although the studied species had a set of distinctive characters, the current identification keys do not provide a means of distinguishing admixed individuals in the section Leiostipa. As a result, only 5 out of 52 hybrid taxa were properly identified based on morphology. Furthermore, the results demonstrated that the most problematic taxon is S. baicalensis, which was frequently confused with either S. capillata or the cross S. capillata x S. krylovii, while a few individuals were misleadingly assigned to S. grandis x S. krylovii. Additionally, several hybrids between S. baicalensis and S. capillata were determined as pure S. krylovii. Therefore, we believe that the identification keys should be revised in order to properly delimitate pure species and to propose a taxonomic treatment for the hybrid taxa identified in the study. Moreover, for a more comprehensive morphological assessment, we suggest using scanning electron microscopy, which can assist in finding unique ultrastructures among pure and admixed individuals.

Although here we highlight that morphological characters alone cannot be used to properly identify hybrids and backcrossed individuals in the field, this is a common issue in plants rather than an exception in feather grasses. For instance, in a study on three species of willows it was shown that, based on phenotype, only 5% of specimens were classified as introgressed individuals, which was much less than the 19% detected using SNP data [55]. Another study on tropical trees demonstrated that even limited genomic sampling, when combined with morphology and geography, can greatly improve estimates of species diversity for clades where hybridisation contributes to taxonomic difficulties [56].
Recently, an investigation of several pine species using SNP data derived from the DArTseq pipeline revealed that one species, previously considered of hybrid origin, is a genetically distinct species, and provided insights into the challenges of solely using morphological traits when identifying taxa with cryptic hybridisation and variable morphology [57]. In grasses, hybridisation and introgression phenomena are still mainly studied in crop species, e.g., rice [58], wheat [59] and sugarcane [60]. To date, we have detected hybrids not only between genetically closely related species [33], but also among genetically distant Stipa taxa [13, 34]. The results presented here and our previous findings help us to shift toward thinking of the Stipa phylogeny as a reticulate web rather than a strictly bifurcating tree.

Nonetheless, studying hybridisation in feather grasses is not only of interest to plant taxonomists. The presence of parental species, multiple-generation hybrids and backcrosses in different proportions in a hybrid zone may indicate renewed sympatry, providing important data for studying species boundaries and patterns of speciation [61]. From this point of view, Stipa may be a suitable genus to study these phenomena. Despite the increasing interest in feather grasses at the molecular level [19, 20, 48, 62-65], there is still a lack of substantial knowledge regarding, e.g., chromosome numbers of admixed and pure taxa in hybrid zones, fertility of pollen grains in F1 and later-generation hybrids and backcrosses, and genomic information related to specific loci contributing to reproductive barriers. We believe that only an integrative approach combining the aforesaid data can properly interpret evolutionary patterns and processes in feather grasses.

Conclusions

In the current study we revealed a complex taxonomic issue in feather grasses with variable morphology exhibited due to extensive hybridisation.
Based on SNPs derived from genome-wide genotyping, we detected five genetic groups representing separate morphospecies and showed that S. baicalensis is a genetically distinct species rather than a taxon of hybrid origin, as was previously hypothesised. We demonstrated the presence of F1 hybrids in S. capillata x S. baicalensis, S. capillata x S. krylovii, S. grandis x S. krylovii, S. grandis x S. baicalensis and S. baicalensis x S. krylovii, and of F2 individuals in S. capillata x S. krylovii, S. grandis x S. krylovii and S. baicalensis x S. krylovii, indicating low levels of reproductive isolation in these species. We also discovered a few backcrosses of S. baicalensis x S. krylovii and S. grandis x S. krylovii to their former parental species, suggesting possible introgression among the taxa. Furthermore, we detected reticulation events between S. baicalensis and S. krylovii, S. baicalensis and S. grandis, and S. krylovii and S. grandis. On the other hand, we revealed signs of contemporary gene flow between populations of the species from the section Leiostipa. Another important outcome of the research is the divergence date estimates inferred at the species and population levels. Here we deduce that diversification within the studied species started ca. 130-96 kya and hypothesise that climatic changes during the LGP were a driving force behind the current distribution of feather grasses. Importantly, here we also emphasise the usefulness of applying integrative approaches combining molecular and morphological data to delimitate species and detect hybridisation and introgression events in feather grasses. Finally, we conclude that Stipa may be a suitable genus to study these phenomena.

Methods

Plant material

In total, 302 fully developed Stipa samples were used for molecular and morphological studies. We gathered individuals representing the section Leiostipa (S. baicalensis, S. capillata, S. grandis and S.
krylovii) from localities where these taxa grow together as well as from areas where they grow separately from each other (Fig. 1, Supplementary Table S1). Additionally, we included two populations of S. baicalensis previously reported from the northeastern part of Kazakhstan [46] and 13 specimens of S. glareosa belonging to the section Smirnovia that were found in localities 10, 12, 13 and 16. All voucher specimens used in the study are preserved at TK and KRA. All maps were visualised using ArcGIS Pro 2.7.1 (ESRI, Redlands, USA). The species distribution ranges were established based on the revision of herbarium specimens preserved at AA, ALTB, B, BM, BRNU, COLO, E, FR, FRU, G, GAT, GFW, GOET, IFP, K, KAS, KFTA, KHOR, KRA, KRAM, KUZ, L, LE, LECB, M, MO, MSB, MW, NY, P, PE, PR, PRC, TAD, TASH, TK, UPS, W, WA, WU and Z.

DNA extraction, amplification and DArT sequencing

These steps were performed according to the previously reported procedures [34]. In brief, genomic DNA was isolated from dried leaf tissues using a Genomic Mini AX Plant Kit (A&A Biotechnology, Poland) and sent to Diversity Arrays Technology Pty Ltd. (Canberra, Australia) for genome complexity reduction using restriction enzymes and high-throughput polymorphism detection [66]. All DNA samples were processed in digestion/ligation reactions as described previously [66], but replacing a single PstI-compatible adaptor with two different adaptors corresponding to two different restriction enzyme overhangs. The PstI-compatible adaptor was designed to include an Illumina flowcell attachment sequence, a sequencing primer sequence and a "staggered" barcode region of varying length, similar to the sequence previously reported [37]. The reverse adaptor contained a flowcell attachment region and a MseI-compatible overhang sequence.
Only "mixed fragments" (PstI-MseI) were effectively amplified by PCR using an initial denaturation step of 94 °C for 1 min, followed by 30 cycles with the following temperature profile: denaturation at 94 °C for 20 s, annealing at 58 °C for 30 s and extension at 72 °C for 45 s, with an additional final extension at 72 °C for 7 min. After PCR, equimolar amounts of amplification products from each sample of the 96-well microtiter plate were bulked and applied to c-Bot (Illumina, USA) bridge PCR, followed by sequencing on a HiSeq 2500 (Illumina, USA). The single-read sequencing was performed for 77 cycles. Sequences generated from each lane were processed using proprietary DArT analytical pipelines. In the primary pipeline, the fastq files were first processed to filter away poor-quality sequences, applying more stringent selection criteria to the barcode region than to the rest of the sequence. In that way, the assignments of the sequences to specific samples carried out in the "barcode split" step were reliable. Approximately 2.5 million sequences per barcode/sample were identified and used in marker calling.

DArTseq data analysis

For the downstream analyses, we applied co-dominant single nucleotide polymorphism (SNP) markers processed in the R-package dartR v.1.5.5 [67] with the following parameters: (1) a scoring reproducibility of 100%, (2) SNP loci with read depth <5 or >50 were removed, (3) at least 95% loci called (the respective DNA fragment had been identified in greater than 95% of all individuals), (4) monomorphic loci were removed, (5) SNPs that shared secondaries (had more than one sequence tag represented in the dataset) were randomly filtered out to keep only one random sequence tag.

DNA-based species delimitation

Five approaches were used to analyse the genetic structure at the species level: (1) Unweighted Pair Group Method with Arithmetic Mean (UPGMA), (2) fastSTRUCTURE analysis, (3) Principal Coordinates Analysis (PCoA), (4) NewHybrids analysis, (5) calculation of the f4 statistic. In addition, to assess the genetic differentiation at the population level within S. baicalensis, S. capillata, S. grandis and S. krylovii, we performed PCoA and STRUCTURE analyses and calculated FST. Furthermore, we used SNAPP to estimate divergence times within the studied species and populations.

Firstly, a UPGMA cluster analysis based on Jaccard's distance matrix was performed using the R-package dartR and visualised with stats v.3.6.2 [68]. Next, the genetic structure was investigated using fastSTRUCTURE v.1.0, which implements the Bayesian clustering algorithm STRUCTURE, assuming Hardy-Weinberg equilibrium between alleles, in a fast and resource-efficient manner [69]. A number of clusters (K-values) ranging from 1 to 10 were tested using the default parameters with ten replicate runs per dataset. The most likely K-value was estimated with the best choice function implemented in fastSTRUCTURE. The output matrices for the best K-values were reordered and plotted using the R package pophelperShiny v.2.1.0 [70]. We applied the threshold of 0.10 [...] 0.9 being pure species and 0.45 [...] 0.8 was set for the assignment into a genetic category. The calculated posterior probabilities for the assigned categories were visualised using the R package pophelperShiny.

Testing for introgression

Further, to calculate the f4 statistic we retained only pure individuals determined via the UPGMA, fastSTRUCTURE and PCoA analyses. All filtering steps were conducted using the R-package dartR with the above-mentioned sequence. Next, the processed data were converted into the EIGENSTRAT format using the R-package dartR. The subsequent calculation of f4 statistics was performed in ADMIXTOOLS v.
7.0.1 [77] using the R package admixr v.0.9.1 [78]. In brief, the f4 statistic [79] is similar to the D statistic [80, 81] and measures the average correlation in allele frequency differences between four populations, e.g., A, B, C and D [82]. If a divergent outgroup is provided as population A, we can test for gene flow between B and C (if the statistic is negative and Z-score < -3) or B and D (if the statistic is positive and Z-score > 3). Because such a test was originally applied to continuous genome-wide data, it is important to mention some limitations of the analysis when working with non-model organisms. If a draft genome is available, it is possible to specify the positions of SNPs along contigs or scaffolds with a special parameter "blockname" implemented in ADMIXTOOLS. Nonetheless, the package currently does not support more than 600 contigs/scaffolds [83], decreasing the potential number of usable SNPs for very fragmented genomes. That was the case for Stipa, where only one draft genome comprising 5931 contigs is currently available [48]. Additionally, SNP positions and genetic distances are used in a block jackknife method to test for a significant deviation from the null expectation of the f4 statistic. Thus, in the absence of a reference genome or in the presence of a very fragmented genome, the f4/D statistics should be treated with caution. Here we consider signatures of past gene flow events only if BABA or ABBA patterns are greater than 50. All possible four-species combinations were tested, while one species, S. glareosa, was selected as an outgroup (A) for all runs.

Population differentiation

To perform PCoA and STRUCTURE analyses and calculate FST at the population level, we retained only pure species and kept populations with more than three individuals per population.
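Returning to the f4 test described under Testing for introgression, the logic of the statistic and its Z-score can be illustrated with a short sketch. This is not the ADMIXTOOLS/admixr implementation used in the study: it is a simplified, hypothetical example that takes per-SNP allele frequencies for four populations, computes f4(A, B; C, D) as the mean of (pA - pB)(pC - pD), and estimates a Z-score with a delete-one-block jackknife over equal-sized contiguous blocks (an assumption; real analyses weight blocks by SNP content and genomic position). All function names and frequencies are invented for illustration.

```python
import math


def f4(p_a, p_b, p_c, p_d):
    """Mean of (pA - pB) * (pC - pD) across SNPs."""
    n = len(p_a)
    if n == 0 or not (len(p_b) == len(p_c) == len(p_d) == n):
        raise ValueError("all frequency lists must be non-empty and equal in length")
    return sum((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


def f4_block_jackknife(p_a, p_b, p_c, p_d, n_blocks):
    """Return (estimate, standard_error, z) from a delete-one-block jackknife.

    Simplification: SNPs are split into n_blocks contiguous blocks of equal
    size; trailing SNPs that do not fill a block are dropped.
    """
    size = len(p_a) // n_blocks
    if n_blocks < 2 or size == 0:
        raise ValueError("need at least two non-empty blocks")
    used = size * n_blocks
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(p_a[:used], p_b[:used], p_c[:used], p_d[:used])]
    total = sum(products)
    estimate = total / used
    # Recompute the statistic with each block left out in turn.
    leave_one_out = [
        (total - sum(products[i * size:(i + 1) * size])) / (used - size)
        for i in range(n_blocks)
    ]
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((x - mean_loo) ** 2 for x in leave_one_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else math.nan
    return estimate, se, z


def interpret(estimate, z, threshold=3.0):
    """Apply the decision rule from the text, with A as the outgroup."""
    if estimate < 0 and z < -threshold:
        return "gene flow between B and C"
    if estimate > 0 and z > threshold:
        return "gene flow between B and D"
    return "no significant signal"


# Hypothetical allele frequencies at four SNPs (A is the outgroup).
freq_a = [0.0, 0.0, 0.0, 0.0]
freq_b = [0.1, 0.9, 0.2, 0.8]
freq_c = [0.1, 0.9, 0.2, 0.8]   # C tracks B, mimicking B-C gene flow
freq_d = [0.5, 0.5, 0.5, 0.5]

estimate, se, z = f4_block_jackknife(freq_a, freq_b, freq_c, freq_d, n_blocks=2)
verdict = interpret(estimate, z)
```

With only four SNPs and two blocks the jackknife is purely illustrative; the caution expressed in the text about fragmented genomes applies precisely because block definitions determine the standard error.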
To maintain as many populations as possible, we merged several individuals growing relatively close to each other (Table 2 and Supplementary Table S1). All filtering steps were conducted using the R-package dartR with the above-mentioned sequence. PCoA analyses were also performed according to the aforesaid workflow. Next, we used STRUCTURE v.2.3.4 [84] instead of fastSTRUCTURE, because the former software is markedly superior to the latter under weak population differentiation [85]. To overcome the relatively slow speed of the analysis, we applied parallel computing using StrAuto v.1.0 [86]. Five replicate runs were performed for each number of clusters (K) from one to ten, with a burn-in of 50,000 iterations followed by 500,000 MCMC iterations. The optimal K value was identified based on Evanno's method of the ΔK statistic [87] as implemented in Structure Harvester [88]. The calculated posterior probabilities for the assigned categories were visualised using the R package pophelperShiny.

Table 2 Analysed populations. Localities are numbered according to Fig. 1 and Supplementary Table S1

Species          Population (localities)                    Number of individuals
S. baicalensis   Population 1 (localities 5, 6 and 7)       5
                 Population 2 (locality 11)                 8
                 Population 3 (locality 13)                 15
                 Population 4 (locality 16)                 8
S. capillata     Population 1 (locality 1)                  3
                 Population 2 (locality 3)                  7
                 Population 3 (locality 4)                  7
                 Population 4 (locality 5)                  6
                 Population 5 (locality 8)                  19
                 Population 6 (locality 9)                  9
                 Population 7 (locality 19)                 6
                 Population 8 (locality 21)                 5
                 Population 9 (localities 25, 29 and 30)    4
S. grandis       Population 1 (locality 5)                  3
                 Population 2 (locality 11)                 7
                 Population 3 (locality 14)                 7
                 Population 4 (locality 16)                 6
                 Population 5 (localities 15 and 17)        8
                 Population 6 (locality 18)                 8
                 Population 7 (locality 23)                 7
                 Population 8 (locality 24)                 7
S. krylovii      Population 1 (locality 11)                 15
                 Population 2 (locality 15)                 7
                 Population 3 (locality 17)                 6
                 Population 4 (locality 18)                 8
                 Population 5 (locality 20)                 7
                 Population 6 (locality 21)                 10
                 Population 7 (locality 24)                 6
                 Population 8 (localities 26, 27 and 28)    20

Then, the fixation index FST was calculated using the R-package dartR with 1000 bootstraps to obtain p-values. This measure assesses genetic differentiation among populations, where values of 0.00-0.05 indicate low differentiation, 0.05-0.15 indicate moderate differentiation, and FST > 0.15 indicates high levels of differentiation [89].

Divergence-time estimation

Finally, to estimate divergence times between the studied taxa, we retained only one pure individual per species and population using dartR and followed the above-mentioned sequence of filtering steps, with the exception of the called loci (100% instead of 95%). Then, the SNP data were converted to the Nexus format using dartR. Subsequently, we created an XML file using the SNAPP-specific template provided in Stange et al., 2018. SNAPP v.1.5.1 [90] utilises the multi-species coalescent approach and is well suited for analyses of genome-wide data, deducing the species tree and divergence times directly from SNPs [91]. We applied one time calibration, setting a log-normal distribution with a mean of 4.39 Mya and a standard deviation (SD) of 0.18 for the split between S. glareosa (section Smirnovia) and S. capillata (section Leiostipa), as inferred in our previous study [48]. The analysis was performed three times independently, with 1.25 million MCMC generations for each run, using BEAST2 v.2.6.3 [92]. Tracer v.1.7.1 [93] was used to visually check the combined log file regarding Effective Sample Size (ESS) values.
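To give a feel for what the time calibration described above implies, the following sketch computes the median and a central 95% interval of a log-normal prior. It rests on an assumption that is not stated in the text: that the mean of 4.39 Mya is given in real space while the SD of 0.18 is the standard deviation of the underlying normal distribution in log space. Under a different parameterisation the numbers would change, so this is an illustration rather than a description of the actual SNAPP configuration; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_prior_summary(mean_real, sigma_log, coverage=0.95):
    """Summarise a log-normal prior given its real-space mean and log-space SD.

    Assumption of this sketch: mean_real = exp(mu + sigma^2 / 2), so the
    log-space location is mu = ln(mean_real) - sigma^2 / 2.
    """
    if mean_real <= 0 or sigma_log <= 0:
        raise ValueError("mean and SD must be positive")
    mu_log = math.log(mean_real) - sigma_log ** 2 / 2.0
    # Standard normal quantile for the requested central coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return {
        "median": math.exp(mu_log),
        "lower": math.exp(mu_log - z * sigma_log),
        "upper": math.exp(mu_log + z * sigma_log),
    }


# Calibration values quoted in the text (in Mya).
summary = lognormal_prior_summary(mean_real=4.39, sigma_log=0.18)
```

Because the log-normal distribution is right-skewed, the median under this parameterisation lies slightly below the mean, and the interval is wider above the median than below it.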
As all ESSs exceeded 200, we combined tree files using LogCombiner v.2.6.3 (a part of the BEAST package) with the first 10% discarded as burn-in from each run. The final maximum clade credibility tree was summarised in TreeAnnotator v.2.6.3 (a part of the BEAST package). Lastly, we visualised a pattern across all of the posterior trees via DensiTree v.2.01 [94], while FigTree v.1.4.4 [95] was used to inspect the Bayesian posterior probabilities and the 95% credibility intervals of the final tree.

Morphological analysis

A total of 302 specimens were examined under a light microscope SMZ800 (Nikon, Japan) across the 17 most informative quantitative and six qualitative morphological characters commonly used in keys and taxonomic descriptions of Stipa (Table 3). Firstly, the Shapiro-Wilk test was used in the R-package MVN v.5.8 [96] to assess the normality of the distribution of each characteristic. Secondly, the non-parametric Spearman's correlation test was applied using the R-packages stats and Hmisc v.4.3-1 [97] to examine relations between the studied characters.

Table 3 Morphological characters used in the present study

Abbreviation   Character
Quantitative characters (mm)
WVS            Width of blades of vegetative shoots
LigC           Length of ligules of the middle cauline leaves
LigIV          Length of ligules of the internal vegetative shoots
LG             Length of the lower glume
AL             Length of the anthecium
CL             Length of the callus
CBL            Length of the callus base
LHD            Length of hairs on the dorsal line on the lemma
LHV            Length of hairs on the ventral line on the lemma
DDL            Distance from the end of the dorsal line of hairs to the top of the lemma
DVL            Distance from the end of the ventral line of hairs to the top of the lemma
LHTA           Length of hairs on the top of the lemma
Col1L          Length of the lower segment of the awn
Col2L          Length of the middle segment of the awn
SL             Length of the seta
HLCol1         Length of hairs on the lower segment of the awn
HLCol2         Length of hairs on the middle segment of the awn
Qualitative characters
AbSVL          Character of the abaxial surface of vegetative leaves (glabrous, with prickles)
AdSVL          Character of the adaxial surface of vegetative leaves (short hairs, long hairs, mixed)
AG             Type of the awn geniculation (single, double)
CN             Character of nodes (glabrous, with hairs)
HTTA           Type of hairs on the top of the anthecium (glabrous, poorly developed, well developed)
PHBN           Presence of hairs below nodes (glabrous, with hairs)

The combined correlogram with the significance test was visualised using the R-package corrplot v.0.84 [98]. Next, a Factor Analysis of Mixed Data (FAMD) [99] was carried out using the R-package FactoMineR v.2.3 [100] to characterise the variation within and among groups of taxa without a priori taxonomic classification and to extract the variables that best identified them. The number of principal components used in the analysis was chosen based on Cattell's scree test [101]. R-packages factoextra v.
1.0.6 [102] and plotly were used to visualise the first two and the first three principal components, respectively. Subsequently, the plots were supplemented with the result of the fastSTRUCTURE analysis for the best K = 5. Additionally, to evaluate distributional relationships between each response variable and the studied taxa, notch plots and interactive box plots were created using the R-packages ggplot2 v.3.3.0 and plotly v.4.9.2, respectively. The notched box plots display a confidence interval around the median, which is normally based on the median ± 1.57 × interquartile range / √n. According to this graphical method of data analysis, if the notches of two boxes do not overlap, there is "strong evidence" (95% confidence) that their medians differ. In addition, to reveal significant differences between means of particular characters across all examined taxa, the non-parametric Kruskal-Wallis test followed by the post-hoc Wilcoxon test with Bonferroni correction was performed using the R-package stats v.3.6.2.

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s12870-021-03287-w.

Additional file 1. Interactive box plots.

Additional file 2: Supplementary Table S1 List of samples used in the study.
Table S2 Average posterior probabilities inferred in NewHybrids for first (F1) and second (F2) generation hybrids and backcrosses (F1 x P1 and F1 x P2).
Table S3 Pairwise FST values for population differentiation across the four studied species. FST > 0.15, indicating high levels of differentiation, are in bold type.
Table S4 Contribution (%) by dimension of each character (abbreviations according to Table 3) in FAMD. The first five characters contributing the most are in bold type. Abbreviations of the qualitative variables and their contributions to the principal axes are underlined.
Table S5. The assigned species names based on morphological and molecular data.
Mismatches are shown in bold type Supplementary Figure SI Venn diagram representing polymorphic SNPs among four pure Stipa species. The admixed individuals and S. glareosa, which did not show patterns of hybridisation, were omitted in the metric's calculation. 82 Baiakhmetov etal. BMC Plant Biology (2021)21:505 Page 17 of 20 Supplementary Figure S2 Delta K values calculated by Evanno's method across four species, (a) S. baicolensis. (b) S. capillata. (c) S. grandis. (d) S. krylovii. Supplementary Figure S3 Phylogeny (at the top) and divergence date estimates at the species level (on the bottom) inferred by SNAPP.The scale shows divergence time in Mya. The red circles indicate nodes with the Bayesian posterior probabilities (BPP) > 0.8. The lower-case letters refer to the embedded table containing data regarding the exact estimates of the divergence times (in kya), BPPs and 95% HPD intervals. Supplementary Figure S4 Correlation matrix of the studied morphological characters (abbreviations according to Table 3). Colour intensity and the size of the circle are proportional to the correlation coefficients (displayed in the circle). Positive correlations are blue while negative are red. All p-values of Pearson correlations were <0.01. Supplementary Figure SS Factor analysis of mixed data performed on 17 quantitative and six qualitative characters of the five examined species of Stipo. (a) Plot of the principal axes one and two. (b) Plot of the principal axes one and three, (c) Plot of the principal axes one and four, (d) Plot of the principal axes two and three. (e) Plot of the principal axes two and four, (f) Plot of the principal axes three and four. The pie charts represent the proportions of membership established by fastSTRUCTURE for the best K=5. 
Supplementary Figure S6 Notched boxplot demonstrating the mean (white circle), the median (dark black line), 95% confidence interval around the median (notch), inter-quartile ranges (25 to 75%), whiskers (5 and 95%) and minimum and maximum measurements (crosses) of quantitative characters (a-q) for the studied species. Statistical significance was tested by Wilcoxon rank-sum test for post hoc group comparisons with 8onferroni correction, p < 0.001, p < 0.01, p < 0.05, p<0.tandp<1 noted as'"“; '•*; 'and no symbol, respectively. Due to the small sample size, p-values cannot be properly estimated for S. grandis x S. krylovii and S. grandis x S. baicolensis. Each dot represents an observation. Supplementary Figure S7 Bar charts displaying frequencies of the qualitative characters, (a) AbSVL (b) AdSVL (c) HTTA (d) AG. (e) CN. (f) PHBN. Acknowledgements We would like to express our gratitude to two anonymous reviewers for providing valuable comments on the manuscript. Authors' contributions E.B., M N, RD.G. supervised the study. PD.G, M.N. planned the field studies, carried out the sampling, revised herbarium materials and performed the taxonomic identification. RD.G, M.N. established the general distribution of the species and D.R. created the map. DR, EB, RD.G. conducted the morphological measurements. EB. performed all the analyses and wrote the manuscript. All authors revised the draft, provided comments and approved the final version of the manuscript. Funding The study was supported by a RSF grant (project no. 19-74-10067), partially by a DS grant of the Jagiellonian University (446.31150.2.2020) and by the National Science Centre, Poland (grant no. 2018/29/B/NZ9/00313).Theopen-access publication of this article was funded by the BioS Priority Research Area under the program 'Excellence Initiative - Research University' at the Jagiellonian University In Krakow. 
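The notch interval quoted in the Methods for the notched box plots (median ± 1.57 × interquartile range / √n) can be illustrated with a short calculation. The Python sketch below is not the code used in the study, which relied on ggplot2 in R. The function names `notch_interval` and `notches_overlap` and the measurement values are hypothetical, and the quartiles are taken with the inclusive method of Python's `statistics.quantiles`, which may differ slightly from the quartile definition applied by the R routines.

```python
import math
import statistics


def notch_interval(values):
    """Return (lower, upper) notch bounds: median +/- 1.57 * IQR / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    # Quartiles via the inclusive method (an assumption; R may use another definition).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / math.sqrt(n)
    med = statistics.median(values)
    return med - half_width, med + half_width


def notches_overlap(a, b):
    """True if the notch intervals of samples a and b share at least one point."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical measurements (mm) for two groups; not data from the study.
group_a = [10.0, 11.0, 12.0, 13.0, 14.0]
group_b = [20.0, 21.0, 22.0, 23.0, 24.0]

if __name__ == "__main__":
    print("notch A:", notch_interval(group_a))
    print("notch B:", notch_interval(group_b))
    print("notches overlap:", notches_overlap(group_a, group_b))
```

Under this rule, non-overlapping notches are read as informal evidence that two medians differ; the formal comparison in the study remains the Kruskal-Wallis test with Bonferroni-corrected post-hoc Wilcoxon tests.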
Availability of data and materials
The SNP dataset derived from the DArTseq pipeline in the genlight format is available via Figshare repository, https://doi.org/10.6084/m9.figshare.14461802.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.

Author details
1 Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa 3, 30-387 Kraków, Poland. 2 Research laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave., 634050 Tomsk, Russia. 3 Department of Biology, Altai State University, Lenin 61 Ave., 656049 Barnaul, Russia.

Received: 23 April 2021 Accepted: 20 October 2021
Published online: 01 November 2021

References
1. Mora C, Tittensor DP, Adl S, Simpson AG, Worm B. How many species are there on earth and in the ocean? PLoS Biol. 2011;9(8):e1001127. https://doi.org/10.1371/journal.pbio.1001127.
2. Christenhusz MJM, Byng JW. The number of known plants species in the world and its annual increase. Phytotaxa. 2016;261(3):201-217. https://doi.org/10.11646/phytotaxa.261.3.1.
3. Dayrat B. Towards integrative taxonomy. Biol J Linn Soc. 2005;85:407-415. https://doi.org/10.1111/j.1095-8312.2005.00503.x.
4. Raupach MJ, Amann R, Wheeler Q, Roos C. The application of "-omics" technologies for the classification and identification of animals. Org Divers Evol. 2016;16:1-12. https://doi.org/10.1007/s13127-015-0234-6.
5. Massó S, López-Pujol J, Vilatersana R. Reinterpretation of an endangered taxon based on integrative taxonomy: the case of Cynara baetica (Compositae). PLoS One. 2018;13(11):e0207094. https://doi.org/10.1371/journal.pone.0207094.
6. Lovrenčić L, Bonassin L, Boštjančić LL, Podnar M, Jelić M, Klobučar G, et al. New insights into the genetic diversity of the stone crayfish: taxonomic and conservation implications. BMC Evol Biol. 2020;20:146. https://doi.org/10.1186/s12862-020-01709-1.
7. Hajibabaei M, Singer GAC, Hebert PDN, Hickey DA. DNA barcoding: how it complements taxonomy, molecular phylogenetics and population genetics. Trends Genet. 2007;23(4):167-172. https://doi.org/10.1016/j.tig.2007.02.001.
8. Kress WJ. Plant DNA barcodes: applications today and in the future. J Syst Evol. 2017;55(4):291-307. https://doi.org/10.1111/jse.12254.
9. Tizard J, Patel S, Waugh J, Tavares E, Bergmann T, Gill B, et al. DNA barcoding a unique avifauna: an important tool for evolution, systematics and conservation. BMC Evol Biol. 2019;19:52. https://doi.org/10.1186/s12862-019-1346-y.
10. CBOL Plant Working Group. A DNA barcode for land plants. P Natl Acad Sci USA. 2009;106(31):12794-12797. https://doi.org/10.1073/pnas.0905845106.
11. Li DZ, Gao LM, Li HT, Wang H, Ge XJ, Liu JQ, et al. Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. P Natl Acad Sci USA. 2011;108(49):19641-19646. https://doi.org/10.1073/pnas.1104551108.
12. Nobis M. Taxonomic revision of the central Asiatic Stipa tianschanica complex (Poaceae) with particular reference to the epidermal micromorphology of the lemma. Folia Geobot. 2014;49:283-308. https://doi.org/10.1007/s12224-013-9164-2.
13. Nobis M, Gudkova P, Nowak A, Sawicki J, Nobis A. A revision of the genus Stipa (Poaceae) in middle Asia, including a key to species identification, an annotated checklist and phytogeographic analysis. Ann Mo Bot Gard. 2020;105:1-63. https://doi.org/10.3417/2019378.
14. Hamasha HR, von Hagen KB, Röser M. Stipa (Poaceae) and allies in the Old World: molecular phylogenetics realigns genus circumscription and gives evidence on the origin of American and Australian lineages. Plant Syst Evol. 2012;298:351-367. https://doi.org/10.1007/s00606-011-0549-5.
15. Romaschenko K, Peterson PM, Soreng RJ, Garcia-Jacas N, Futorna O, Susanna A. Systematics and evolution of the needle grasses (Poaceae: Pooideae: Stipeae) based on analysis of multiple chloroplast loci, ITS, and lemma micromorphology. Taxon. 2012;61:18-44. https://doi.org/10.1002/tax.611002.
16. Tzvelev NN. Zlaki USSR. Leningrad: Nauka Press; 1976.
17. Tzvelev NN. Notes on the tribe Stipeae Dumort. (Poaceae). Novosti Sist Vyssh Rast. 2012;43:20-9.
18. Freitag H. The genus Stipa (Gramineae) in southwest and South Asia. Notes Roy Bot Gard Edinburgh. 1985;42:355-489.
19. Krawczyk K, Nobis M, Nowak A, Szczecińska M, Sawicki J. Phylogenetic implications of nuclear rRNA IGS variation in Stipa L. (Poaceae). Sci Rep. 2017;7:11506. https://doi.org/10.1038/s41598-017-11804-x.
20. Krawczyk K, Nobis M, Myszczyński K, Klichowska E, Sawicki J. Plastid superbarcodes as a tool for species discrimination in feather grasses (Poaceae: Stipa). Sci Rep. 2018;8:1924. https://doi.org/10.1038/s41598-018-20399-w.
21. Smirnov PA. Stiparum Armeniae minus cognitarum descriptiones. Byull Moskovsk Obshch Isp Prir Otd Biol. 1970;75:113-5.
22. Kotukhov YA. Synopsis of feather grass (Stipa L.) and false needlegrasses (Ptilagrostis Griseb.) the eastern of Kazakhstan (the Kazakh Altai, Zaisan valley and Prialtayskie ranges). Bot Issl Sib Kazakhst. 2002;8:3-16.
23. Nobis M. Taxonomic revision of the Stipa lipskyi group (Poaceae: Stipa section Smirnovia) in the Pamir Alai and Tian-Shan Mountains. Plant Syst Evol. 2013;299:1307-1354. https://doi.org/10.1007/s00606-013-0799-5.
24. Mallet J. Hybridization as an invasion of the genome. Trends Ecol Evol. 2005;20:229-237. https://doi.org/10.1016/j.tree.2005.02.010.
25. Hegarty MJ, Hiscock SJ. Hybrid speciation in plants: new insights from molecular studies. New Phytol. 2005;165:411-423. https://doi.org/10.1111/j.1469-8137.2004.01253.x.
26. Abbott R, Albach D, Ansell S, Arntzen JW, Baird SJ, Bierne N, et al. Hybridization and speciation. J Evol Biol. 2013;26:229-246. https://doi.org/10.1111/j.1420-9101.2012.02599.x.
27. Matsuoka Y, Takumi S, Nasuda S. Genetic mechanisms of allopolyploid speciation through hybrid genome doubling: novel insights from wheat (Triticum and Aegilops) studies. Int Rev Cel Mol Bio. 2014;309:199-258. https://doi.org/10.1016/b978-0-12-800255-1.00004-1.
28. Gordon SP, Contreras-Moreira B, Levy JJ, Djamei A, Czedik-Eysenberg A, Tartaglio VS, et al. Gradual polyploid genome evolution revealed by pan-genomic analysis of Brachypodium hybridum and its diploid progenitors. Nat Commun. 2020;11:3670. https://doi.org/10.1038/s41467-020-17302-5.
29. Rieseberg LH, Ellstrand NC. What can molecular and morphological markers tell us about plant hybridization. Crit Rev Plant Sci. 1993;12:213-241. https://doi.org/10.1080/07352689309701902.
30. Hardig TM, Brunsfeld SJ, Fritz RS, Morgan M, Orians CM. Morphological and molecular evidence for hybridization and introgression in a willow (Salix) hybrid zone. Mol Ecol. 2000;9:9-24. https://doi.org/10.1046/j.1365-294x.2000.00757.x.
31. Rieseberg L, Archer M, Wayne R. Transgressive segregation, adaptation and speciation. Heredity. 1999;83:363-372. https://doi.org/10.1038/sj.hdy.6886170.
32. Nobis M, Nowak A, Nobis A, Nowak S, Zabicka J, Zabicki P. Stipa ×fallax (Poaceae: Pooideae: Stipeae), a new natural hybrid from Tajikistan, and a new combination in Stipa drobovii. Phytotaxa. 2017;303:141-154. https://doi.org/10.11646/phytotaxa.303.2.4.
33. Nobis M, Gudkova PD, Baiakhmetov E, Zabicka J, Krawczyk K, Sawicki J. Hybridisation, introgression events and cryptic speciation in Stipa (Poaceae): a case study of the Stipa heptapotamica hybrid-complex. Perspect Plant Ecol Evol Syst. 2019;39:125457. https://doi.org/10.1016/j.ppees.2019.05.001.
34. Baiakhmetov E, Nowak A, Gudkova PD, Nobis M. Morphological and genome-wide evidence for natural hybridisation within the genus Stipa (Poaceae). Sci Rep. 2020;10:13803. https://doi.org/10.1038/s41598-020-70582-1.
35. Nie B, Jiao BH, Ren LF, Gudkova PD, Chen WL, Zhang WH. Integrative taxonomy recognized a new cryptic species within Stipa grandis from loess plateau of China. J Syst Evol. 2020. https://doi.org/10.1111/jse.12714.
36. Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One. 2008;3:e3376. https://doi.org/10.1371/journal.pone.0003376.
37. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One. 2011;6:e19379. https://doi.org/10.1371/journal.pone.0019379.
38. Poland JA, Rife TW. Genotyping-by-sequencing for plant breeding and genetics. Plant Genome. 2012;5:92-102. https://doi.org/10.3835/plantgenome2012.05.0005.
39. Schilling MP, Gompert Z, Li FW, Windham MD, Wolf PG. Admixture, evolution, and variation in reproductive isolation in the Boechera puberula clade. BMC Evol Biol. 2018;18:61. https://doi.org/10.1186/s12862-018-1173-6.
40. Wagner ND, Gramlich S, Hörandl E. RAD sequencing resolved phylogenetic relationships in European shrub willows (Salix L. subg. Chamaetia and subg. Vetrix) and revealed multiple evolution of dwarf shrubs. Ecol Evol. 2018;8(16):8243-8255. https://doi.org/10.1002/ece3.4360.
41. Hodkinson TR, Perdereau A, Klaas M, Cormican P, Barth S. Genotyping by sequencing and plastome analysis finds high genetic variability and geographical structure in Dactylis glomerata L. in Northwest Europe despite lack of ploidy variation. Agronomy. 2019;9:342. https://doi.org/10.3390/agronomy9070342.
42. Gudkova PD, Olonova MV, Feoktisov DS. The comparison of ecologo-climatic niches of two species feather grass Stipa sareptana A.K. Becker and S. krylovii Roshev. (Poaceae). Ukr J Ecol. 2017;7:263-269. https://doi.org/10.15421/2017_115.
43. Tzvelev NN. Gramineae. In: Grubov VI, editor. Plantae Asiae Centralis (secus materies Instituti Botanici nomine V. L. Komarovii). Leningrad: Nauka; 1968. p. 1-243.
44. Wu ZL, Phillips SM. Stipa. In: Wu ZY, et al., editors. Flora of China. Beijing: Science Press; 2006. p. 196-203.
45. Tzvelev NN, Probatova NS. Grasses of Russia. Moscow: KMK Scientific Press; 2019.
46. Kotukhov YA, Anufrieva OA. Addition to the flora of Kazakhstan. Terra. 2008:48-55.
47. Curtu AL, Gailing O, Finkeldey R. Evidence for hybridization and introgression within a species-rich oak (Quercus spp.) community. BMC Evol Biol. 2007;7:218. https://doi.org/10.1186/1471-2148-7-218.
48. Baiakhmetov E, Guyomar C, Shelest E, Nobis M, Gudkova PD. The first draft genome of feather grasses using SMRT sequencing and its implications in molecular studies of Stipa. Sci Rep. 2021;11:15345. https://doi.org/10.1038/s41598-021-94068-w.
49. Durvasula A, Fulgione A, Gutaker RM, Alacakaptan SI, Flood PJ, Neto C, et al. African genomes illuminate the early history and transition to selfing in Arabidopsis thaliana. P Natl Acad Sci USA. 2017;114(20):5213-5218. https://doi.org/10.1073/pnas.1616736114.
50. Cornejo-Romero A, Vargas-Mendoza CF, Aguilar-Martínez GF, Medina-Sánchez J, Rendón-Aguilar B, Valverde PL, et al. Alternative glacial-interglacial refugia demographic hypotheses tested on Cephalocereus columna-trajani (Cactaceae) in the intertropical Mexican drylands. PLoS One. 2017;12(4):e0175905. https://doi.org/10.1371/journal.pone.0175905.
51. Long C, Kubatko L. The effect of gene flow on coalescent-based species-tree inference. Syst Biol. 2018;67(5):770-785. https://doi.org/10.1093/sysbio/syy020.
52. Jiao X, Flouri T, Rannala B, Yang Z. The impact of cross-species gene flow on species tree estimation. Syst Biol. 2020;69(5):830-847. https://doi.org/10.1093/sysbio/syaa001.
53. Jiao X, Yang Z. Defining species when there is gene flow. Syst Biol. 2021;70(1):108-119. https://doi.org/10.1093/sysbio/syaa052.
54. Rieseberg L, Wendel J. Introgression and its consequences in plants. In: Harrison RG, editor. Hybrid Zones and the Evolutionary Process. Oxford: Oxford University Press; 1993. p. 70-109.
55. Fogelqvist J, Verkhozina AV, Katyshev AI, Pucholt P, Dixelius C, Rönnberg-Wästljung AC, et al. Genetic and morphological evidence for introgression between three species of willows. BMC Evol Biol. 2015;15:193. https://doi.org/10.1186/s12862-015-0461-7.
56. Federman S, Donoghue MJ, Daly DC, Eaton DAR. Reconciling species diversity in a tropical plant clade (Canarium, Burseraceae). PLoS One. 2018;13(6):e0198882. https://doi.org/10.1371/journal.pone.0198882.
57. Buck R, Hyasat S, Hossfeld A, Flores-Rentería L. Patterns of hybridization and cryptic introgression among one- and four-needled pinyon pines. Ann Bot. 2020;126(3):401-411. https://doi.org/10.1093/aob/mcaa045.
58. Civan P, Brown TA. Role of genetic introgression during the evolution of cultivated rice (Oryza sativa L.). BMC Evol Biol. 2018;18:57. https://doi.org/10.1186/s12862-018-1180-7.
59. Cheng H, Liu J, Wen J, Nie X, Xu L, Chen N, et al. Frequent intra- and inter-species introgression shapes the landscape of genetic variation in bread wheat. Genome Biol. 2019;20:136. https://doi.org/10.1186/s13059-019-1744-x.
60. Pachakkil B, Terajima Y, Ohmido N, Ebina M, Irei S, Hayashi H, et al. Cytogenetic and agronomic characterization of intergeneric hybrids between Saccharum spp. hybrid and Erianthus arundinaceus. Sci Rep. 2019;9:1748. https://doi.org/10.1038/s41598-018-38316-6.
61. Harrison RG, Larson EL. Hybridization, introgression, and the nature of species boundaries. J Hered. 2014;105(1):795-809. https://doi.org/10.1093/jhered/esu033.
62. Yang YQ, Li X, Kong X, Ma L, Hu X, Yang Y. Transcriptome analysis reveals diversified adaptation of Stipa purpurea along a drought gradient on the Tibetan plateau. Funct Integr Genomic. 2015;15:295-307. https://doi.org/10.1007/s10142-014-0419-7.
63. Wan D, Wan Y, Hou X, Ren W, Ding Y, Sa R. De novo assembly and transcriptomic profiling of the grazing response in Stipa grandis. PLoS One. 2015;10:e0122641. https://doi.org/10.1371/journal.pone.0122641.
64. Schubert M, Marcussen T, Meseguer AS, Fjellheim S. The grass subfamily Pooideae: cretaceous-Palaeocene origin and climate-driven Cenozoic diversification. Glob Ecol Biogeogr. 2019;28:1168-1182. https://doi.org/10.1111/geb.12923.
65. Yan D, Ren J, Liu J, Ding Y, Niu J. De novo assembly, annotation, marker discovery, and genetic diversity of the Stipa breviflora Griseb. (Poaceae) response to grazing. PLoS One. 2020;15(12):e0244222. https://doi.org/10.1371/journal.pone.0244222.
66. Kilian A, Wenzl P, Huttner E, Carling J, Xia L, Blois H, et al. Diversity arrays technology: a generic genome profiling technology on open platforms. Methods Mol Biol. 2012;888:67-89. https://doi.org/10.1007/978-1-61779-870-2_5.
67. Gruber B, Unmack PJ, Berry OF, Georges A. dartr: an R package to facilitate analysis of SNP data generated from reduced representation genome sequencing. Mol Ecol Resour. 2018;18:691-699. https://doi.org/10.1111/1755-0998.12745.
68. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2021. https://www.R-project.org. Accessed 23 Apr 2021.
69. Raj A, Stephens M, Pritchard JK. fastSTRUCTURE: variational inference of population structure in large SNP data sets. Genetics. 2014;197:573-589. https://doi.org/10.1534/genetics.114.164350.
70. Francis RM. POPHELPER: an R package and web app to analyse and visualize population structure. Mol Ecol Resour. 2017;17(1):27-32. https://doi.org/10.1111/1755-0998.12509.
71. Burgarella C, Lorenzo Z, Jabbour-Zahab R, Lumaret R, Guichoux E, Petit RJ, et al. Detection of hybrids in nature: application to oaks (Quercus suber and Q. ilex). Heredity. 2009;102:442-452. https://doi.org/10.1038/hdy.2009.8.
72. Winkler KA, Pamminger-Lahnsteiner B, Wanzenböck J, Weiss S. Hybridization and restricted gene flow between native and introduced stocks of Alpine whitefish (Coregonus sp.) across multiple environments. Mol Ecol. 2011;20:456-472. https://doi.org/10.1111/j.1365-294X.2010.04961.x.
73. Beugin MP, Gayet T, Pontier D, Devillard S, Jombart T. A fast likelihood solution to the genetic clustering problem. Methods Ecol Evol. 2018;9:1006-1016. https://doi.org/10.1111/2041-210X.12968.
74. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer; 2016.
75. Sievert C, Parmer C, Hocking T, Chamberlain S, Ram K, Corvellec M, et al. plotly: create interactive web graphics via 'plotly.js'. 2021. https://rdrr.io/cran/plotly. Accessed 23 Apr 2021.
76. Anderson EC, Thompson EA. A model-based method for identifying species hybrids using multilocus genetic data. Genetics. 2002;160(3):1217-29.
77. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, et al. Ancient admixture in human history. Genetics. 2012;192(3):1065-1093. https://doi.org/10.1534/genetics.112.145037.
78. Petr M, Vernot B, Kelso J. admixr - R package for reproducible analyses using ADMIXTOOLS. Bioinformatics. 2019;35(17):3194-3195. https://doi.org/10.1093/bioinformatics/btz030.
79. Reich D, Thangaraj K, Patterson N, Price AL, Singh L. Reconstructing Indian population history. Nature. 2009;461:489-494. https://doi.org/10.1038/nature08365.
80. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, et al. A draft sequence of the Neandertal genome. Science. 2010;328:710-722. https://doi.org/10.1126/science.1188021.
81. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient admixture between closely related populations. Mol Biol Evol. 2011;28:2239-2252. https://doi.org/10.1093/molbev/msr048.
82. Lipson M. Applying f4-statistics and admixture graphs: theory and examples. Mol Ecol Resour. 2020;20(6):1658-1667. https://doi.org/10.1111/1755-0998.13230.
83. Taylor RS, Manseau M, Horn RL, Keobouasone S, Golding GB, Wilson PJ. The role of introgression and ecotypic parallelism in delineating intraspecific conservation units. Mol Ecol. 2020;29(15):2793-2909. https://doi.org/10.1111/mec.15522.
84. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155(2):945-59.
85. Stift M, Kolář F, Meirmans PG. STRUCTURE is more robust than other clustering methods in simulated mixed-ploidy populations. Heredity. 2019;123:429-441. https://doi.org/10.1038/s41437-019-0247-6.
86. Chhatre VE, Emerson KJ. StrAuto: automation and parallelization of STRUCTURE analysis. BMC Bioinformatics. 2017;18:192. https://doi.org/10.1186/s12859-017-1593-0.
87. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 2005;14(8):2611-2620. https://doi.org/10.1111/j.1365-294X.2005.02553.x.
88. Earl DA, vonHoldt BM. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour. 2012;4(2):359-361. https://doi.org/10.1007/s12686-011-9548-7.
89. Hartl DL, Clark AG. Principles of population genetics. 3rd ed. Sunderland: Sinauer Associates; 1997.
90. Bryant D, Bouckaert R, Felsenstein J, Rosenberg NA, RoyChoudhury A. Inferring species trees directly from biallelic genetic markers: bypassing gene trees in a full coalescent analysis. Mol Biol Evol. 2012;29(8):1917-1932. https://doi.org/10.1093/molbev/mss086.
91. Stange M, Sánchez-Villagra MR, Salzburger W, Matschiner M. Bayesian divergence-time estimation with genome-wide single-nucleotide polymorphism data of sea catfishes (Ariidae) supports Miocene closure of the Panamanian isthmus. Syst Biol. 2018;67(4):681-699. https://doi.org/10.1093/sysbio/syy006.
92. Bouckaert R, Vaughan TG, Barido-Sottani J, Duchêne S, Fourment M, Gavryushkina A, et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2019;15(4):e1006650. https://doi.org/10.1371/journal.pcbi.1006650.
93. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst Biol. 2018;67(5):901-904. https://doi.org/10.1093/sysbio/syy032.
94. Bouckaert RR. DensiTree: making sense of sets of phylogenetic trees. Bioinformatics. 2010;26(10):1372-1373. https://doi.org/10.1093/bioinformatics/btq110.
95. Rambaut A. FigTree v1.4.4. 2021. http://tree.bio.ed.ac.uk/software/figtree. Accessed 23 Apr 2021.
96. Korkmaz S, Goksuluk D, Zararsiz G. MVN: an R package for assessing multivariate normality. R J. 2014;6:151-162. https://doi.org/10.32614/rj-2014-031.
97. Harrell FEJ. Hmisc: Harrell Miscellaneous. 2021. https://CRAN.R-project.org/package=Hmisc. Accessed 23 Apr 2021.
98. Wei T, Simko V. R package "corrplot": visualization of a correlation matrix. 2021. https://github.com/taiyun/corrplot. Accessed 23 Apr 2021.
99. Pagès J. Analyse factorielle de données mixtes. Rev Stat Appl. 2004;4:93-111.
100. Lê S, Josse J, Husson F. FactoMineR: an R package for multivariate analysis. J Stat Softw. 2008;25(1):1-18. https://doi.org/10.18637/jss.v025.i01.
101. Cattell RB. The scree test for the number of factors. Multivariate Behav Res. 1966;1:245-76.
102. Kassambara A. factoextra: extract and visualize the results of multivariate data analyses. 2021. https://rdrr.io/cran/factoextra. Accessed 23 Apr 2021.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Baiakhmetov et al. BMC Plant Biology (2022) 22:22
https://doi.org/10.1186/s12870-021-03357-z

CORRECTION Open Access

Correction to: Evidence for extensive hybridization and past introgression events in feather grasses using genome-wide SNP genotyping

Evgenii Baiakhmetov1,2*, Daria Ryzhakova2,3, Polina D. Gudkova2,3 and Marcin Nobis1,2*

Correction to: BMC Plant Biol 21, 505 (2021)
https://doi.org/10.1186/s12870-021-03287-w

Following publication of the original article [1], the author identified an error in the Supplementary Materials: Additional file 1 (Interactive box plots) is missing. Additionally, revised and high-resolution versions of Figs. 1, 2, 3, 4, 5, 6 and 7 should also be captured. The revised figures are given below. The original article has been corrected.

Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1186/s12870-021-03357-z.
Additional file 1.

Reference
1. Baiakhmetov E, Ryzhakova D, Gudkova PD, et al. Evidence for extensive hybridisation and past introgression events in feather grasses using genome-wide SNP genotyping. BMC Plant Biol. 2021;21:505. https://doi.org/10.1186/s12870-021-03287-w.

Author details
1 Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa 3, 30-387 Kraków, Poland. 2 Research laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave., Tomsk 634050, Russia. 3 Department of Biology, Altai State University, Lenin 61 Ave., Barnaul 656049, Russia.

Published online: 08 January 2022

The original article can be found online at https://doi.org/10.1186/s12870-021-03287-w.

*Correspondence: evgenii.baiakhmetov@doctoral.uj.edu.pl; m.nobis@uj.edu.pl

© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Fig. 1 The general distribution map of (a) S. baicalensis (yellow), S. capillata (red), S. grandis (green), S. krylovii (blue) and sampling locations (b) in East Kazakhstan and southwestern Siberia (Russia), (c) in southeastern Siberia and (d) in Eastern Kyrgyzstan. The dashed lines indicate hypothetical borders. The coloured circles depict species found in the numbered locations. The exact coordinates of the locations are presented in the Supplementary Table S1.

Fig. 2 The UPGMA dendrogram (at the top) aligned with the best supported fastSTRUCTURE model K = 5 (on the bottom). The genetic distance was calculated using the Jaccard Similarity Coefficient (y-axis, top). Individuals are represented by coloured bars according to the proportion of membership (y-axis, bottom) of a genotype to the respective cluster.

Fig. 3 The PCoA plot based on genetic distances between samples. (a) The plot of the two principal axes. (b) The plot of the three principal axes. The pie charts represent the proportions of membership established by fastSTRUCTURE for the best K = 5.

Fig. 4 The assignment of Stipa taxa into four hybrid classes according to the posterior probabilities (y-axis) inferred in NewHybrids. (a) S. baicalensis × S. krylovii. (b) S. capillata × S. krylovii. (c) S. capillata × S. baicalensis. (d) S. grandis × S. krylovii. (e) S. grandis × S. baicalensis. Hybrid classes are coloured by black (F1 hybrid), grey (F2), cyan (backcross to the first parental species, BC to parent 1) and pink (backcross to the second parental species, BC to parent 2).

Fig. 5 PCoA plots, best supported STRUCTURE models and localities of the studied populations across four species. (a) S. baicalensis. (b) S. capillata. (c) S. grandis. (d) S. krylovii.

Fig. 6 Phylogeny and divergence date estimates inferred by SNAPP. Blue coloured trees represent the most probable topology. Numbers at each node represent mean ages of divergence time estimates and the 95% HPD intervals (in the brackets). The black rectangles on the nodes indicate the 95% HPD intervals of the estimated posterior distributions of the divergence times. The red circle indicates the presumed divergence time split set as a reference. The Bayesian posterior probabilities were 1.00 for the nodes with the shown 95% HPD intervals. The scale shows divergence time in Mya.

Fig. 7 The factor analysis of mixed data performed on 17 quantitative and six qualitative characters of the five examined species of Stipa. (a) Plot of the two principal axes. (b) Plot of the three principal axes. The pie charts represent the proportions of membership established by fastSTRUCTURE for the best K = 5.

Supplementary material

Evidence for extensive hybridisation and past introgression events in feather grasses using genome-wide SNP genotyping

Evgenii Baiakhmetov1,2*, Daria Ryzhakova2,3, Polina D. Gudkova2,3, Marcin Nobis1,2*

1 Institute of Botany, Faculty of Biology, Jagiellonian University, Gronostajowa 3, 30-387 Kraków, Poland
2 Research laboratory 'Herbarium', National Research Tomsk State University, Lenin 36 Ave., 634050 Tomsk, Russia
3 Department of Biology, Altai State University, Lenin 61 Ave., 656049 Barnaul, Russia

*Corresponding Authors:
Evgenii Baiakhmetov1,2
Gronostajowa 3, Kraków, 30-387 Kraków, Poland
Email address: evgenii.baiakhmetov@doctoral.uj.edu.pl
Marcin Nobis1,2
Gronostajowa 3, Kraków, 30-387 Kraków, Poland
Email address: m.nobis@uj.edu.pl
5 capillata * £ knlosii 002284 NW of Khrebet Sarviusaktv S capillata * £ knlosv 002285 S capillata “ £ knlosii 002287 S. capillata ■ £ knlovii 002288 S capillata * £ knlosii 002289 S. capillata 000329 Locality 3. N 50°l3-58.3- E 87°55"48.7" 1495 m 23 08 2018 L. Sokolova S capillata 000330 Russia, S capillata 000331 Altai Republic. £ capillata 000332 Kuray £ capillata 000342 S capillata 000343 S. capillata 000344 S capillata 000827 Locality 4. N 54°15*32.1" E 81 “39*29 1" 95m 27 09 2018 E Knuchkova S. capillata 000831 Russia. S capillata 000838 Novosibirskaya oblast. £ capillata 000841 Kirza S capillata 000842 S capillata 000843 £ capillata 000849 S. capillata 001304 Locality 5 N 53°27'43 8" E 90°24'03 8" 844m 22.07.2008 M.OIonova S. capillata 001305 Russia. P. Gudkova S. capillata 001306 Republic of Khakassia. S. baicalemis 001308 Aski/sky Distnct. S baicalensu 001324 NW shore of Lake Bulankul' S capillata 001310 Locality 5. N 53°25'37.7" E 90°34*39.3H 452 m 2407.2008 P Gudkova Russia. Republic of Khakassia. Askizsky Distnct. ca 11 km SE of Lake Bulankul' 5 capillata 001393 Locally 4, N 53021'32.9" E 90°33’27,3" 421 m 2307 2008 P Gudkova S. capillata 001395 Russia. S capillata » £ baicalmsis 001399 Republic of Kliakassia. S capillata " £ knlovii 001394 Askizsky District. £ grandis 001396 ca 14 km NW of Kamyshta S grandis 001397 S grandis 001398 S knlosii 001392 5 bai calms is 0453558 Locality 6. N 54 “02 05 0" E90°l6(Mr 742 m 21 07 2008 P Gudkova S bai calms is 0454700 Russia. (approx ) (approx) M(Monova S capillata » £ baicalmsis 0453565 Republic of Kliakassia. £ capillata * £ baicalmsis 0453566 Sturinskiy Distnct. S capillata « £ baicalmsis 0453567 6 km N of litbmskaya S capillata ■ £ baicalmsis 0453568 S baicalmsis » £ knlosii 0454628 Loathly.! N 54°28'18.0" E 89‘>27'42.(T 543 m 08 08 2013 AEbel Russia. Republic of Khakassia. Shinnskiy Distnct. Yefteiukuio 5 capillata « £ baicalmsis 0454630 Lpcaiyy.7. 
N 54*2738 0" E89°28'150.15 indicating high levels of differentiation are in bold type. Species and population No Popí Pop2 Popí Pop4 Pop5 Popó Pop7 Pop8 S. baicalensis, Pop2 0.4135 S. baicalensis, Pop3 0.6369 0.5759 S. baicalensis, Pop4 0.3181 0.3981 0.6338 S. capillala, Pop2 0.4829 S. capillata, Pop3 0.5094 0.4180 S. capillala, Pop4 0.4101 0.2803 0.3586 S. capillata, Pop5 0.2995 0.2361 0.3001 0.1826 S. capillata. Popó 0.4086 0.2793 0.3562 0.2023 0.1280 5. capillata, Pop7 0.5108 0.3941 0.4243 0.3045 0.2633 0.3191 S. capillata, Pop8 0.5037 0.3750 0.4228 0.3141 0.2505 0.3051 0.3858 S. capillata, Pop9 0.4806 0.3379 0.3700 0.2554 0.2031 0.2569 0.3542 0.3483 S. grandis, Pop2 0.4212 S. grandis, Pop3 0.5169 0.4544 S. grandis, Pop4 0.6817 0.5633 0.6067 S. grandis, Pop5 0.3033 0.2530 0.3185 0.4092 S. grandis. Popó 0.2674 0.2477 0.2669 0.3916 0.1037 S. grandis, Pop7 0.3308 0.3094 0.4260 0.5139 0.2487 0.23586 S. grandis, Pop8 0.3709 0.3425 0.4484 0.5488 0.2760 0.2536 0.0826 S. krylovii, Pop2 0.1245 S. krylovii, Pop3 0.1476 0.0853 S. krylovii, Pop4 0.1039 0.0565 0.0917 S. krylovii, Pop5 0.1608 0.1215 0.1539 0.1039 S. krylovii. Popó 0.1176 0.0753 0.1071 0.0583 0.1172 S. krylovii, Pop7 0.1753 0.1440 0.1632 0.1291 0.1867 0.1333 S. kry’lovii, Pop8 0.3998 0.4065 0.4405 0.3772 0.4330 0.3717 0.4589 All Fst /»-values estimated usirrg 1,000 bootstrap replicates were significant (<0.05). 100 Table S4. Contribution (%) by dimension of each character (abbreviations according to Table 3) in FAMD. The first five characters contributing the most are in bold type. Abbreviations of the qualitative variables and their contributions to the principal axes are underlined. 
Character | Dimension 1 | Dimension 2 | Dimension 3 | Dimension 4
Col1L | 10.58619639 | 0.07367645 | 0.202549383 | 0.18640246
DDL | 10.32673209 | 0.13019599 | 2.560111433 | 0.183466
AL | 9.811198066 | 1.5076406 | 0.124465739 | 1.07464193
Col2L | 8.702219729 | 3.15270516 | 2.323643064 | 0.15100557
CL | 8.42450895 | 1.28938092 | 0.932046305 | 0.95617566
CN | 7.728393001 | 2.09548241 | 3.400721764 | 3.20582721
LG | 6.48681185 | 0.64031939 | 5.19796294 | 1.30962683
DVL | 5.893750308 | 0.71262269 | 0.000433553 | 6.41863223
AdSVL | 5.728157162 | 3.66953586 | 0.693651975 | 6.27220701
SL | 4.91977364 | 6.13121788 | 0.071357255 | 5.81027345
LHTA | 3.273728594 | 8.60372028 | 3.864392941 | 0.09398638
PHBN | 3.260074501 | 3.42809654 | 9.401227363 | 6.22797984
WVS | 3.189179462 | 4.8203688 | 0.178227035 | 0.02115708
HLColl | 3.169875893 | 0.12185596 | 23.728834 | 0.71345245
AG | 3.087430743 | 0.0110623 | 26.33751993 | 0.08913552
LigC | 1.868036208 | 11.29939534 | 0.28022867 | 0.01064054
HTTA | 1.814224832 | 11.95190258 | 0.152983336 | 2.23887609
LigTV | 0.695227504 | 13.68239344 | 0.024167264 | 2.21552371
CBL | 0.55201756 | 2.85797371 | 6.619372557 | 4.77825171
AbSVL | 0.308017226 | 13.32355635 | 3.670205699 | 0.75242778
LHV | 0.146868843 | 5.7926466 | 0.354540743 | 19.53995161
HLColl | 0.020173036 | 0.95448297 | 9.402285559 | 19.97205261
LHD | 0.007404409 | 3.74976776 | 0.479071495 | 17.77830633

Table S5. The assigned species names based on morphological and molecular data. Mismatches are shown in bold type.

Herbarium ID number | Species name assigned by morphology | Species name assigned by molecular data
001304 | S. baicalensis | S. capillata
001305 | S. baicalensis | S. capillata
001101 | S. baicalensis | S. capillata
001113 | S. baicalensis | S. capillata
001121 | S. baicalensis | S. capillata
001061 | S. baicalensis | S. capillata
001073 | S. baicalensis | S. capillata
002225 | S. baicalensis | S. capillata
002226 | S. baicalensis | S. capillata
0453565 | S. baicalensis | S. capillata × S. baicalensis
0453566 | S. baicalensis | S. capillata × S. baicalensis
0453567 | S. baicalensis | S. capillata × S. baicalensis
0453568 | S. baicalensis | S. capillata × S. baicalensis
0454630 | S. baicalensis | S. capillata × S. baicalensis
0454631 | S. baicalensis | S. capillata × S. baicalensis
001062 | S. baicalensis | S. capillata × S. baicalensis
001063 | S. baicalensis | S. capillata × S. baicalensis
001064 | S. baicalensis | S. capillata × S. baicalensis
001069 | S. baicalensis | S. capillata × S. baicalensis
000447 | S. baicalensis | S. baicalensis × S. krylovii
0454717 | S. baicalensis | S. baicalensis × S. krylovii
0454814 | S. baicalensis | S. baicalensis × S. krylovii
0454816 | S. baicalensis | S. baicalensis × S. krylovii
0477009 | S. baicalensis | S. baicalensis × S. krylovii
0477063 | S. baicalensis | S. baicalensis × S. krylovii
0477064 | S. baicalensis | S. baicalensis × S. krylovii
0477065 | S. baicalensis | S. baicalensis × S. krylovii
002222 | S. baicalensis | S. capillata × S. krylovii
002224 | S. baicalensis | S. capillata × S. krylovii
002280 | S. baicalensis | S. capillata × S. krylovii
002281 | S. baicalensis | S. capillata × S. krylovii
002282 | S. baicalensis | S. capillata × S. krylovii
002284 | S. baicalensis | S. capillata × S. krylovii
002285 | S. baicalensis | S. capillata × S. krylovii
002287 | S. baicalensis | S. capillata × S. krylovii
002288 | S. baicalensis | S. capillata × S. krylovii
002289 | S. baicalensis | S. capillata × S. krylovii
0453558 | S. baicalensis | S. baicalensis
0454700 | S. baicalensis | S. baicalensis
002291 | S. baicalensis | S. baicalensis
0477904 | S. baicalensis | S. baicalensis
0477350 | S. baicalensis | S. baicalensis
0426154 | S. baicalensis | S. baicalensis
0476911 | S. baicalensis | S. baicalensis
0477203 | S. baicalensis | S. baicalensis
0477204 | S. baicalensis | S. baicalensis
0478619 | S. baicalensis | S. baicalensis
0478620 | S. baicalensis | S. baicalensis
0477175 | S. baicalensis | S. baicalensis
0477176 | S. baicalensis | S. baicalensis
0477177 | S. baicalensis | S. baicalensis
0477178 | S. baicalensis | S. baicalensis
0477179 | S. baicalensis | S. baicalensis
0477180 | S. baicalensis | S. baicalensis
0477181 | S. baicalensis | S. baicalensis
000446 | S. baicalensis | S. baicalensis
000453 | S. baicalensis | S. baicalensis
000462 | S. baicalensis | S. baicalensis
000469 | S. baicalensis | S. baicalensis
000471 | S. baicalensis | S. baicalensis
0454721 | S. baicalensis | S. baicalensis
0454713 | S. baicalensis | S. baicalensis
0454811 | S. baicalensis | S. baicalensis
0477006 | S. baicalensis | S. baicalensis
0477012 | S. baicalensis | S. baicalensis
0477026 | S. baicalensis | S. baicalensis
0477027 | S. baicalensis | S. baicalensis
0477058 | S. baicalensis | S. baicalensis
002184 | S. baicalensis × S. krylovii | S. baicalensis
0454628 | S. baicalensis × S. krylovii | S. baicalensis × S. krylovii
0454720 | S. baicalensis × S. krylovii | S. baicalensis × S. krylovii
002220 | S. baicalensis × S. krylovii | S. baicalensis × S. krylovii
0451256 | S. capillata | S. capillata × S. krylovii
0454736 | S. capillata | S. capillata × S. krylovii
001544 | S. capillata | S. capillata × S. krylovii
001394 | S. capillata | S. capillata × S. krylovii
001079 | S. capillata | S. capillata × S. baicalensis
001306 | S. capillata × S. krylovii | S. capillata
001099 | S. capillata × S. krylovii | S. capillata
001100 | S. capillata × S. krylovii | S. capillata
001102 | S. capillata × S. krylovii | S. capillata
001106 | S. capillata × S. krylovii | S. capillata
001107 | S. capillata × S. krylovii | S. capillata
001111 | S. capillata × S. krylovii | S. capillata
001112 | S. capillata × S. krylovii | S. capillata
000332 | S. capillata × S. krylovii | S. capillata
000329 | S. capillata × S. baicalensis | S. capillata
001399 | S. capillata × S. grandis | S. capillata × S. baicalensis
002316 | S. capillata | S. capillata
001310 | S. capillata | S. capillata
001393 | S. capillata | S. capillata
001395 | S. capillata | S. capillata
0454271 | S. capillata | S. capillata
0456693 | S. capillata | S. capillata
0475125 | S. capillata | S. capillata
0496240 | S. capillata | S. capillata
001093 | S. capillata | S. capillata
001094 | S. capillata | S. capillata
001095 | S. capillata | S. capillata
001097 | S. capillata | S. capillata
001098 | S. capillata | S. capillata
001108 | S. capillata | S. capillata
001110 | S. capillata | S. capillata
001117 | S. capillata | S. capillata
001068 | S. capillata | S. capillata
001071 | S. capillata | S. capillata
001072 | S. capillata | S. capillata
001074 | S. capillata | S. capillata
001077 | S. capillata | S. capillata
001078 | S. capillata | S. capillata
001080 | S. capillata | S. capillata
0451257 | S. capillata | S. capillata
0454728 | S. capillata | S. capillata
0454740 | S. capillata | S. capillata
0454747 | S. capillata | S. capillata
002505 | S. capillata | S. capillata
000531 | S. capillata | S. capillata
000539 | S. capillata | S. capillata
000543 | S. capillata | S. capillata
000547 | S. capillata | S. capillata
000549 | S. capillata | S. capillata
000555 | S. capillata | S. capillata
000330 | S. capillata | S. capillata
000331 | S. capillata | S. capillata
000342 | S. capillata | S. capillata
000343 | S. capillata | S. capillata
000344 | S. capillata | S. capillata
000827 | S. capillata | S. capillata
000831 | S. capillata | S. capillata
000838 | S. capillata | S. capillata
000841 | S. capillata | S. capillata
000842 | S. capillata | S. capillata
000843 | S. capillata | S. capillata
000849 | S. capillata | S. capillata
002223 | S. capillata | S. capillata
0457716 A | S. glareosa | S. glareosa
0457716B | S. glareosa | S. glareosa
0454146 | S. glareosa | S. glareosa
0454147 | S. glareosa | S. glareosa
0426149 3 | S. glareosa | S. glareosa
0426149 4 | S. glareosa | S. glareosa
0426149 5 | S. glareosa | S. glareosa
0426150 | S. glareosa | S. glareosa
0477208 1 | S. glareosa | S. glareosa
0477208 2 | S. glareosa | S. glareosa
0477209 3 | S. glareosa | S. glareosa
0477209 4 | S. glareosa | S. glareosa
0477252 | S. glareosa | S. glareosa
000438 | S. grandis | S. grandis × S. baicalensis
000948 | S. grandis | S. grandis × S. krylovii
000956 | S. grandis | S. grandis × S. krylovii
000440 | S. grandis × S. krylovii | S. baicalensis
000441 | S. grandis × S. krylovii | S. baicalensis
000430 | S. grandis | S. grandis
000432 | S. grandis | S. grandis
000439 | S. grandis | S. grandis
000444 | S. grandis | S. grandis
000445 | S. grandis | S. grandis
000460 | S. grandis | S. grandis
000467 | S. grandis | S. grandis
0454710 | S. grandis | S. grandis
0454712 | S. grandis | S. grandis
0454751 | S. grandis | S. grandis
0454802 | S. grandis | S. grandis
0454803 | S. grandis | S. grandis
0454815 | S. grandis | S. grandis
001396 | S. grandis | S. grandis
001397 | S. grandis | S. grandis
001398 | S. grandis | S. grandis
000402 | S. grandis | S. grandis
000405 | S. grandis | S. grandis
000420 | S. grandis | S. grandis
000421 | S. grandis | S. grandis
000426 | S. grandis | S. grandis
000427 | S. grandis | S. grandis
000428 | S. grandis | S. grandis
000360 | S. grandis | S. grandis
000361 | S. grandis | S. grandis
000363 | S. grandis | S. grandis
000364 | S. grandis | S. grandis
000365 | S. grandis | S. grandis
000367 | S. grandis | S. grandis
000368 | S. grandis | S. grandis
0426162 | S. grandis | S. grandis
0477312 | S. grandis | S. grandis
0477313 | S. grandis | S. grandis
0477314 | S. grandis | S. grandis
0477315 | S. grandis | S. grandis
0477316 | S. grandis | S. grandis
0454748 | S. grandis | S. grandis
0454754 | S. grandis | S. grandis
001419 | S. grandis | S. grandis
001420 | S. grandis | S. grandis
002221 | S. grandis | S. grandis
0477268 | S. grandis | S. grandis
0477271 | S. grandis | S. grandis
0477273 | S. grandis | S. grandis
0477274 | S. grandis | S. grandis
0477281 | S. grandis | S. grandis
000944 | S. grandis | S. grandis
000946 | S. grandis | S. grandis
000949 | S. grandis | S. grandis
000959 | S. grandis | S. grandis
000963 | S. grandis | S. grandis
000966 | S. grandis | S. grandis
000967 | S. grandis | S. grandis
0453556 | S. grandis | S. grandis
0477278 | S. grandis × S. krylovii | S. grandis × S. krylovii
0477279 | S. grandis × S. krylovii | S. grandis × S. krylovii
000442 | S. krylovii | S. baicalensis × S. krylovii
000443 | S. krylovii | S. baicalensis × S. krylovii
0477007 | S. krylovii | S. baicalensis × S. krylovii
0477008 | S. krylovii | S. baicalensis × S. krylovii
0477011 | S. krylovii | S. baicalensis × S. krylovii
0477060 | S. krylovii | S. baicalensis × S. krylovii
000950 | S. krylovii | S. grandis × S. krylovii
000449 | S. krylovii | S. grandis × S. krylovii
001065 | S. krylovii | S. capillata × S. baicalensis
001070 | S. krylovii | S. capillata × S. baicalensis
001308 | S. krylovii × S. baicalensis | S. baicalensis
001324 | S. krylovii × S. baicalensis | S. baicalensis
0477832 | S. krylovii | S. krylovii
000433 | S. krylovii | S. krylovii
000435 | S. krylovii | S. krylovii
000459 | S. krylovii | S. krylovii
000461 | S. krylovii | S. krylovii
000463 | S. krylovii | S. krylovii
001392 | S. krylovii | S. krylovii
001545 | S. krylovii | S. krylovii
0451252 | S. krylovii | S. krylovii
0451253 | S. krylovii | S. krylovii
0454729 | S. krylovii | S. krylovii
0454784 | S. krylovii | S. krylovii
0454818 | S. krylovii | S. krylovii
0478099 | S. krylovii | S. krylovii
0478100 | S. krylovii | S. krylovii
0478101 | S. krylovii | S. krylovii
0478103 | S. krylovii | S. krylovii
0477303 | S. krylovii | S. krylovii
0477306 | S. krylovii | S. krylovii
0477307 | S. krylovii | S. krylovii
0477308 | S. krylovii | S. krylovii
0477309 | S. krylovii | S. krylovii
0477311 | S. krylovii | S. krylovii
0454741 | S. krylovii | S. krylovii
0454746 | S. krylovii | S. krylovii
0454755 | S. krylovii | S. krylovii
0454756 | S. krylovii | S. krylovii
0454757 | S. krylovii | S. krylovii
0454804 | S. krylovii | S. krylovii
0454805 | S. krylovii | S. krylovii
001421 | S. krylovii | S. krylovii
0477265 | S. krylovii | S. krylovii
0477269 | S. krylovii | S. krylovii
0477270 | S. krylovii | S. krylovii
0477275 | S. krylovii | S. krylovii
0477280 | S. krylovii | S. krylovii
0477282 | S. krylovii | S. krylovii
0477285 | S. krylovii | S. krylovii
000941 | S. krylovii | S. krylovii
000942 | S. krylovii | S. krylovii
000957 | S. krylovii | S. krylovii
000969 | S. krylovii | S. krylovii
000970 | S. krylovii | S. krylovii
000972 | S. krylovii | S. krylovii
0454646 | S. krylovii | S. krylovii
0477207 | S. krylovii | S. krylovii
000538 | S. krylovii | S. krylovii
000582 | S. krylovii | S. krylovii
000585 | S. krylovii | S. krylovii
000586 | S. krylovii | S. krylovii
000587 | S. krylovii | S. krylovii
000588 | S. krylovii | S. krylovii
000590 | S. krylovii | S. krylovii
000591 | S. krylovii | S. krylovii
000372 | S. krylovii | S. krylovii
000373 | S. krylovii | S. krylovii
000384 | S. krylovii | S. krylovii
000385 | S. krylovii | S. krylovii
000388 | S. krylovii | S. krylovii
000391 | S. krylovii | S. krylovii
000392 | S. krylovii | S. krylovii
0495094 | S. krylovii | S. krylovii
0495095 | S. krylovii | S. krylovii
0495096 | S. krylovii | S. krylovii
0495097 | S. krylovii | S. krylovii
0495098 | S. krylovii | S. krylovii
0495099 | S. krylovii | S. krylovii
0495100 | S. krylovii | S. krylovii
0468522 | S. krylovii | S. krylovii
0495122 | S. krylovii | S. krylovii
0470570 | S. krylovii | S. krylovii
0470573 | S. krylovii | S. krylovii
0469167 | S. krylovii | S. krylovii
0469168 | S. krylovii | S. krylovii
0496246 | S. krylovii | S. krylovii
0469181 | S. krylovii | S. krylovii
0469188 | S. krylovii | S. krylovii
0469189 | S. krylovii | S. krylovii
0469194 | S. krylovii | S. krylovii
0469195 | S. krylovii | S. krylovii
0469202 | S. krylovii | S. krylovii

Supplementary Figure S1 Venn diagram representing polymorphic SNPs among four pure Stipa species. The admixed individuals and S. glareosa, which did not show patterns of hybridisation, were omitted in the metric's calculation.

Supplementary Figure S2 Delta K values calculated by Evanno's method across four species. (a) S. baicalensis. (b) S. capillata. (c) S. grandis. (d) S. krylovii.

Supplementary Figure S3 Phylogeny (at the top) and divergence date estimates at the species level (on the bottom) inferred by SNAPP. The scale shows divergence time in Mya. The red circles indicate nodes with the Bayesian posterior probabilities (BPP) > 0.80. The lower-case letters refer to the embedded table containing data regarding the exact estimates of the divergence times (in kya), BPPs and 95% HPD intervals.

Supplementary Figure S4 Correlation matrix of the studied morphological characters (abbreviations according to Table 3). Colour intensity and the size of the circle are proportional to the correlation coefficients (displayed in the circle). Positive correlations are blue while negative are red. All p-values of Pearson correlations were < 0.01.

Supplementary Figure S5 Factor analysis of mixed data performed on 17 quantitative and six qualitative characters of the five examined species of Stipa. (a) Plot of the principal axes one and two.
(b) Plot of the principal axes one and three. (c) Plot of the principal axes one and four. (d) Plot of the principal axes two and three. (e) Plot of the principal axes two and four. (f) Plot of the principal axes three and four. The pie charts represent the proportions of membership established by fastSTRUCTURE for the best K=5.

Supplementary Figure S6 Notched boxplot demonstrating the mean (white circle), the median (dark black line), 95% confidence interval around the median (notch), inter-quartile ranges (25% to 75%), whiskers (5% and 95%) and minimum and maximum measurements (crosses) of quantitative characters (a-q) for the studied species. Statistical significance was tested by Wilcoxon rank-sum test for post hoc group comparisons with Bonferroni correction; p < 0.001, p < 0.01, p < 0.05, p < 0.1 and p < 1 are noted as '***', '**', '*', '.' and no symbol, respectively. Due to the small sample size, p-values cannot be properly estimated for S. grandis × S. krylovii and S. grandis × S. baicalensis. Each dot represents an observation.

Supplementary Figure S7 Bar charts displaying frequencies of the qualitative characters. (a) AbSVL. (b) AdSVL. (c) HTTA. (d) AG. (e) CN. (f) PHBN.

Final conclusions and perspectives

Feather grasses have been studied for the last three centuries; however, the limited set of morphological traits available for their description has complicated establishing reliable boundaries between taxa. A large leap forward was made only recently, when molecular markers allowed botanists to redefine the tribe Stipeae and, in particular, to point out the monophyly of the genus Stipa. Nonetheless, due to the presence of hybridisation and introgression events between species from different sections, the infrageneric classification of feather grasses still demands a comprehensive reassessment. The current thesis shows that the delimitation of pure and admixed taxa using morphology alone is challenging.
On the other hand, integrative taxonomy that combines molecular and phenotypic data may be beneficial for studies in hybrid zones. Importantly, the dissertation has outlined the usefulness of the DArTseq technique for inferring hybridisation and introgression processes within feather grasses. The large-scale genomic data provided here have aided in establishing divergence times within the genus and have disclosed past introgression events. Moreover, the dissertation demonstrates that feather grasses may be a suitable genus for studying hybridisation and introgression events in nature.

The current provisional estimate of Stipa taxa with a hybrid origin is around 30% (Nobis et al., 2019; Nobis et al., 2020). Nevertheless, this claim must be carefully scrutinised. As shown in the thesis, some taxa can share intermediate characteristics between putative parental specimens; genetically, however, they can represent distinct species, thereby refuting the initial hypotheses of hybrid origin. On the other hand, the studies indicate that in some cases molecular markers have the potential to identify thus far unsuspected hybrids. For instance, a hybridisation event was suggested between the morphologically and genetically distant species S. bungeana and, probably, S. glareosa. The latter taxon was absent from the analyses, and this case merits further investigation. Moreover, it has recently been demonstrated that some feather grasses may represent cryptic species (Nobis et al., 2019; Nie, 2020), and it needs to be clarified whether this phenomenon is common within the genus. Additionally, taking into consideration that hybridisation is an important mechanism driving evolution in Stipa, new studies should incorporate as many taxa as possible, particularly in putative hybrid zones, possibly even grasses that were previously considered part of the genus Stipa.
Despite possible pre-zygotic and post-zygotic reproductive barriers (Vallejo-Marin & Hiscock, 2016), intergeneric hybrids are well reported in plants (Karpechenko, 1927; Knobloch, 1972; Clarkson, 1988; Couturon et al., 1998; Smith et al., 2013; Anghelescu et al., 2021; Hu et al., 2021). In grasses, natural hybrids are well documented between the genera Triticum L. and Aegilops L. For instance, the diploid species T. urartu Thumanjan ex Gandilyan (2n=2x=14, AA) and A. speltoides Tausch. (2n=2x=14, BB) formed T. turgidum L. (2n=4x=28, genome constitution AABB), while T. aestivum L. (2n=6x=42, AABBDD) combines the genomes of T. urartu, A. speltoides and A. tauschii Coss. (2n=2x=14, DD; Matsuoka et al., 2014). To date, the vast majority of Stipa species have been considered to be tetraploids (Romaschenko et al., 2012; Tkach et al., 2021). The draft genome project has highlighted that feather grasses may comprise two very distinct genomes. Thus, as intergeneric hybrids have not previously been reported in the genus, this topic merits separate research that may contribute to knowledge of natural hybridisation and speciation in Poaceae.

In addition, while numerous intriguing specimens of putative hybrid origin are collected in the field by experienced taxonomists, some introgressed individuals may still go unobserved. Consequently, this may decrease the predictive power of studies based on such collections regarding, e.g., levels of reproductive isolation in these species and patterns of speciation. Although the overall sequencing costs are still significant for large-scale works, herbarium collections (ideally with well-preserved leaves stored separately in silica gel) from hybrid zones should comprise as many samples as possible to make it feasible to address more research questions in the future.
Importantly, there are no studies so far showing whether hybridisation and introgression events affect chromosome numbers, karyotypes and genome size of admixed individuals in feather grasses. Do the newly formed hybrids and subsequent backcrosses possess the same ploidy as their parental species, which could potentially lead to homoploid speciation, or is there a change in ploidy level (allopolyploid speciation)? The question remains open. Generally, homoploid hybrid speciation is recognised to be rare in plants (Yakimowski & Rieseberg, 2014), while allopolyploid hybrid speciation is more common (Barker et al., 2016; Abbott et al., 2016). In Poaceae, allopolyploid species have been evidenced, e.g., for wheats, goatgrasses, oats, fescues and ryegrasses (Jenczewski & Alix, 2004). While feather grasses may have an allotetraploid origin (2n=4x=44; Romaschenko et al., 2012; Tkach et al., 2021), there is a study on S. baicalensis, S. grandis and S. krylovii from the Inner Mongolia steppe that reports a diploid nature of these species (2n=2x=44; Wu et al., 2009). Has cytological diploidisation been completed in the genomes of feather grasses? This is another topic to examine in the future. In addition, there are still few studies regarding genome sizes within the genus. Currently, only one work has provided such estimates for 10 Stipa taxa (Šmarda et al., 2019). According to that research, the expected monoploid genome size varies within the genus in the range of 547-669 Mb. Moreover, one work on 16 populations of 13 Stipa taxa showed that the studied specimens differed significantly in their meiotic characteristics (Sheidai et al., 2006). Therefore, further work on hybridisation in Stipa should be supplemented with cytogenetic studies aiming to count chromosomes and assess their characteristics, as well as with the determination of genome sizes, e.g., via flow cytometry.
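
The chromosome formulas quoted above (2n = ploidy × x) can be restated as a short piece of arithmetic. The following is only a minimal sketch: the function and dictionary names are illustrative assumptions and are not part of the thesis, and the code merely recomputes the base chromosome number x implied by each formula cited in the text without deciding between the competing interpretations for Stipa.

```python
def base_chromosome_number(two_n: int, ploidy: int) -> int:
    """Return the base chromosome number x for a formula 2n = ploidy * x."""
    if ploidy <= 0 or two_n % ploidy != 0:
        raise ValueError(f"2n={two_n} is not divisible by ploidy level {ploidy}")
    return two_n // ploidy


# (2n, ploidy level) pairs exactly as quoted in the text; labels are illustrative.
CYTOTYPES = {
    "T. urartu (AA)": (14, 2),
    "A. speltoides (BB)": (14, 2),
    "A. tauschii (DD)": (14, 2),
    "T. turgidum (AABB)": (28, 4),
    "T. aestivum (AABBDD)": (42, 6),
    "Stipa, allotetraploid reading": (44, 4),
    "Stipa, diploid reading (Wu et al., 2009)": (44, 2),
}


def summarise(cytotypes: dict) -> dict:
    """Map each label to the base chromosome number implied by its formula."""
    return {
        name: base_chromosome_number(two_n, ploidy)
        for name, (two_n, ploidy) in cytotypes.items()
    }


if __name__ == "__main__":
    for name, x in summarise(CYTOTYPES).items():
        two_n, ploidy = CYTOTYPES[name]
        print(f"{name}: 2n={ploidy}x={two_n}, implied x={x}")
```

Under the allotetraploid reading the same count of 44 chromosomes implies x=11, whereas the diploid reading implies x=22, which is why chromosome counts alone cannot settle the question and cytogenetic characterisation is needed.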
Lastly, tests of pollen viability and grain size may also contribute to knowledge regarding the status of newly formed hybrids and introgressed individuals (Nobis et al., 2017; 2019). If homoploid and allopolyploid hybrids exist in Stipa, do they possess the same level of fertility? Do introgressed individuals remain fertile?

One more point should be addressed. The presence of hybrids in feather grasses complicates phylogenetic reconstructions of the genus. To date, several molecular markers have been applied, e.g., nuclear ITS and the plastid trnK (Hamasha et al., 2012; Romaschenko et al., 2012), as well as the nuclear intergenic spacer (IGS) region (Krawczyk et al., 2017) and marker sets derived from whole chloroplast genomes (Krawczyk et al., 2018). Nonetheless, these markers are still unable to discriminate all taxa, causing unresolved branching in the reconstructed trees. I believe that, to properly resolve the phylogeny of Stipa, only verified pure species should be analysed. The DArTseq technique used in the thesis provides a 100-fold higher number of markers than the previous genomic studies and may greatly contribute to the detection of such pure taxa. On the other hand, DArTseq itself may not be fully suitable for phylogenetic estimates. The DArT markers were designed to target active regions of the genome (Kilian et al., 2012) and have been widely used in population studies of commercially important plant species (Simko et al., 2012; Alam et al., 2018; Bello et al., 2019). When it comes to analyses at infrageneric levels and, especially, for an entire genus, the number of markers decreases significantly, and such results should be treated with caution. A promising approach for phylogeny reconstruction has recently been proposed for flowering plants.
A universal probe set for targeted sequencing of 353 nuclear genes, also known as the Angiosperms353 probe set, aims to be useful for phylogenetic studies from the species level to higher-order groups (Johnson et al., 2019). Nonetheless, this approach currently faces some challenges, e.g., the analytical handling of gene duplications/losses to achieve proper identification and separation of paralogs or homeologs (Baker et al., 2021). Another method worth trying is genome skimming, which involves sequencing an entire genome at low coverage (Straub et al., 2012). Depending on the final coverage, such skimming may provide complete plastid and mitochondrial genomes, as well as nuclear ribosomal DNA and low-copy nuclear loci. Finally, as stated in the article on the draft genome of feather grasses, nearly complete genomes may nowadays be obtained using third-generation sequencing. Nevertheless, while single-molecule real-time (Eid et al., 2009) and nanopore sequencing (Clarke et al., 2009) technologies may provide an exceptional number of markers, they remain high-priced for studies on non-model species.

Further, I would like to express my thoughts on how the methodological framework can be improved in future studies. Firstly, in the era of Open Science, ideally all data collected for a particular non-commercial research project should be freely available online. This may largely facilitate access to the original data, which may subsequently be reused in various studies, including meta-analyses. Additionally, such access provides a great resource for scientists worldwide regardless of funding opportunities in their countries.
To date, several platforms with phenotypic data have been introduced, e.g., the worldwide database TRY hosted by the Max Planck Institute for Biogeochemistry (Kattge et al., 2019), a functional trait database for Mediterranean Basin plants (Tavşanoğlu & Pausas, 2018) and a curated plant trait database for the Australian flora (Falster et al., 2021). On the other hand, for genera where hybridisation and introgression events are frequently observed, morphological data should preferably be supported by molecular markers. Otherwise, the data may be misinterpreted. The same issue applies to molecular data stored in GenBank, maintained by the National Center for Biotechnology Information (Sayers et al., 2022), and in other databases, e.g., at the European Molecular Biology Laboratory's European Bioinformatics Institute (Amid et al., 2019) or in the DNA Data Bank of Japan (Ogasawara et al., 2020). In the absence of taxonomic expertise, molecular sequences stored in these repositories may also lead to data distortion, e.g., issues in phylogeny reconstruction or taxa identification via DNA barcodes. Thus, I suggest scrutinising plant specimens from hybrid zones by integrative approaches prior to sharing the final data via public databases.

Another aspect that needs to be considered in relation to reproducible science is the digitisation of morphological traits. In the past decade there has been a boost in studies aiming to capture herbarium images and provide digital biodiversity data to the public (Wieczorek et al., 2012; Tegelberg et al., 2014; Harris & Marsico, 2017; Borsch et al., 2020; Powell et al., 2021; GBIF, 2022). These data can be used in numerous works, e.g., to study shifts in plant phenology associated with climate change, to generate species distribution models or, together with other resources, to address novel questions in plant biology (Soltis, 2017).
Moreover, with the advent of computational approaches that facilitate the finding of predictive patterns in data, it has become possible to automate morphometric processes. Machine learning (ML) has already been used in various studies for data collection from herbarium specimens. For instance, research topics that can be addressed with ML include evaluating plant extinction risks/conservation status, herbarium metadata extraction, species identification and classification, and analyses of phenological features (Rocchetti et al., 2021). Regarding morphometric data, ML has been used, e.g., in such genera as Tilia L. (Corney et al., 2012), Riccardia Gray (Reeb et al., 2018), Coreopsis L. (Lorieul et al., 2019), Anemone L. and Trillium L. (Davis et al., 2020). Although grass traits are seemingly hard to digitise for subsequent measurements (Weaver et al., 2020), there is a study that utilised such an approach. Specifically, seven grass genera were included in the research, namely Andropogon L., Hyparrhenia Andersson ex Fourn., Schizachyrium Nees, Elymandra Stapf, Diheteropogon Stapf, Monocymbium Stapf and Exotheca Andersson (McAllister et al., 2019). Nonetheless, not all data were collected via fully automated approaches, as several qualitative characteristics, e.g., awns (geniculate or straight) or prickle hairs (hair-like, tooth-like or absent), were still assessed manually. In addition, some quantitative traits were measured with a digital approach. Thus, while phenological data still largely rely on botanical experts, I believe that in the near future machine learning algorithms may help to facilitate data collection (at least partially) in feather grasses.

To sum up, feather grasses may provide numerous research topics not only for agrostologists, but also for a wider audience of researchers. The plethora of themes may start from nature lovers who study ornamental plants such as S. capillata, S. pulcherrima and S.
pennata, and end up with works on soil remediation processes (Brunetti et al., 2009; Moameri et al., 2018) and morphokinematics (Yanez et al., 2018). I believe that the results, discussion and future perspectives presented here will contribute to the overall knowledge of the genus and, more specifically, of hybridisation and introgression processes.

Acknowledgements

I would like to dedicate this thesis to my mom and dad, who, back in the days of Perestroika, were PhD students but never had a chance to defend their dissertations because of me. Additionally, I wish to express my gratitude to the Department of Genetics of the Tomsk State University (TSU), the Institute of Medical Genetics (Tomsk) and the Research laboratory 'Herbarium' of the TSU for keeping me in academia; to prof. dr hab. Joanna Rutkowska, the Project PO WER WIN and all taxpayers in Poland for making my PhD position possible; to all staff at the Jagiellonian University, and especially in the Institute of Botany, for their hospitality and assistance. Last but not least, I have to thank my supervisor, prof. dr hab. Marcin Nobis, and co-supervisor, dr Polina D. Gudkova, for letting me work on this intriguing topic and for the remarkable fieldwork in Kazakhstan.

P.S. In case there are errors in the thesis (grammatical, logical or other sorts of blunders), please do not shoot the pianist, he was doing his best.

References

Abbott RJ, Barton NH & Good JM. Genomics of hybridization and its evolutionary consequences. Molecular Ecology. 2016;25(11):2325-32. https://doi.org/10.1111/mec.13685
Alam M, Neal J, O’Connor K, Kilian A & Topp B. Ultra-high-throughput DArTseq-based silicoDArT and SNP markers for genomic studies in macadamia. PLOS One. 2018;13:e0203465. https://doi.org/10.1371/journal.pone.0203465
Amid C, Alako BTF, Kadhirvelu V, Burdett T, Burgin J, Fan J et al. The European Nucleotide Archive in 2019. Nucleic Acids Research. 2020;48:70-76.
https://doi.org/10.1093/nar/gkz1063
Anghelescu NEDG, Kertesz H, Constantin N, Simon-Gruita A, Duță-Cornescu G, Pojoga MD et al. New intergeneric orchid hybrid found in Romania × Pseudorhiza nieschalkii (Senghas) PF Hunt nothosubsp. siculorum H Kertesz & N Anghelescu, 2020. PLOS One. 2021;16(5):e0241733. https://doi.org/10.1371/journal.pone.0241733
Arranz-Otaegui A, Gonzalez Carretero L, Ramsey MN, Fuller DQ & Richter T. Archaeobotanical evidence reveals the origins of bread 14,400 years ago in northeastern Jordan. Proceedings of the National Academy of Sciences of the United States of America. 2018;115:7925-7930. https://doi.org/10.1073/pnas.1801071115
Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLOS One. 2008;3:e3376. https://dx.doi.org/10.1371%2Fjournal.pone.0003376
Baker WJ, Dodsworth S, Forest F, Graham SW, Johnson MG, McDonnell A et al. Exploring Angiosperms353: An open, community toolkit for collaborative phylogenomic research on flowering plants. American Journal of Botany. 2021;108(7):1059-1065. https://doi.org/10.1002/ajb2.1703
Banwart SA, Noellemeyer E & Milne E. Soil Carbon: Science, Management and Policy for Multiple Benefits. SCOPE Series, 71. CABI, Wallingford, UK; 2015. http://dx.doi.org/10.1079/9781780645322.0000
Barker MS, Arigo N, Baniaga AE, Li Z & Levin DA. On the relative abundance of autopolyploids and allopolyploids. New Phytologist. 2016;210:391-398. https://doi.org/10.1111/nph.13698
Barkworth ME & Everett J. Evolution in the Stipeae: Identification and relationships of its monophyletic taxa. In: Soderstrom TR, Hilu KC & Barkworth M, editors. Grass Systematics and Evolution. Washington DC: Smithsonian Institution Press; 1987. p. 251-264.
Barkworth ME. Stipa L. In: Barkworth ME, Capels KM, Long S, Anderton LK & Piep MB, editors. Flora of North America North of Mexico. New York: Oxford University Press; 2007. p. 154-156.
Barkworth ME, Arriaga MO, Smith JF, Jacobs SW, Valdes-Reyna J & Bushman SB. Molecules and morphology in South American Stipeae (Poaceae). Systematic Botany. 2008;33:719-731. https://doi.org/10.1600/036364408786500235
Bello EB, Rasco JLS, Sendon PMD, Cueva FMD, Lalusin AG & Laurena AC. Genetic Diversity Analysis of Selected Sugarcane (Saccharum spp. Hybrids) Varieties Using DArT-Seq Technology. The Philippine Journal of Science. 2019;148:103-114.
Bengtsson J, Bullock JM, Egoh B, Everson C, Everson T, O’Connor T et al. Grasslands—more important for ecosystem services than you might think. Ecosphere. 2019;10:e02582. https://doi.org/10.1002/ecs2.2582
Bor NL. Grasses of Burma, Ceylon, India and Pakistan (Excluding Bambuseae). London: Pergamon Press; 1960.
Bor NL. Gramineae. In: Rechinger KH, editor. Flora Iranica. Graz: Akademische Druck- und Verlagsanstalt; 1970. p. 1-573.
Borsch T, Stevens AD, Häffner E, Güntsch A, Berendsohn WG, Appelhans MS et al. A complete digitization of German herbaria is possible, sensible and should be started now. Research Ideas and Outcomes. 2020;6:e50675. https://doi.org/10.3897/rio.6.e50675
Brunetti G, Soler-Rovira P, Farrag K & Senesi N. Tolerance and accumulation of heavy metals by wild plant species grown in contaminated soils in Apulia region Southern Italy. Plant Soil. 2009;318:285-298. https://doi.org/10.1007/s11104-008-9838-3
Brutnell T. Model grasses hold key to crop improvement. Nature Plants. 2015;1:15062. https://doi.org/10.1038/nplants.2015.62
Castellanos-Frías E, Garcia de Leon D, Bastida F & Gonzalez-Andujar JL. Predicting global geographical distribution of Lolium rigidum (rigid ryegrass) under climate change. The Journal of Agricultural Science. 2016;154(5):755-764. http://doi.org/10.1017/S0021859615000799
Clarke J, Wu HC, Jayasinghe L, Patel A, Reid S & Bayley H. Continuous base identification for single-molecule nanopore DNA sequencing. Nature Nanotechnology. 2009;4:265-270.
https://doi.org/10.1038/nnano.2009.12
Clarkson BD. A natural intergeneric hybrid, Celmisia gracilenta × Olearia arborescens (Compositae) from Mt Tarawera, New Zealand. New Zealand Journal of Botany. 1988;26(2):325-331. https://doi.org/10.1080/0028825X.1988.10410122
Corney DPA, Clark JY, Tang HL & Wilkin P. Automatic extraction of leaf characters from herbarium specimens. Taxon. 2012;61:231-244. http://doi.org/10.1002/tax.611016
Couturon E, Lashermes P & Charrier A. First intergeneric hybrids (Psilanthus ebracteolatus Hiern × Coffea arabica L.) in coffee trees. Canadian Journal of Botany. 1998;76(3):542-546. https://doi.org/10.1139/b98-017
Danzhalova EV, Bazha SN, Gunin PD, Drobyshev YI, Kazantseva TI, Prischepa AV et al. Indicators of pasture digression in steppe ecosystems of Mongolia. Exploration into the Biological Resources of Mongolia. 2012;12:297-306.
Davis CC, Champ J, Park DS, Breckheimer I, Lyra GM, Xie J et al. A new method for counting reproductive structures in digitized herbarium specimens using Mask R-CNN. Frontiers in Plant Science. 2020;11:1129. https://doi.org/10.3389/fpls.2020.01129
Dietrich L, Gotting-Martin E, Hertzog J, Schmitt-Kopplin P, McGovern PE, Hall GR et al. Investigating the function of Pre-Pottery Neolithic stone troughs from Gobekli Tepe - An integrated approach. Journal of Archaeological Science: Reports. 2020;34:102618. https://doi.org/10.1016/j.jasrep.2020.102618
Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323:133-138. https://doi.org/10.1126/science.1162986
Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLOS One. 2011;6:e19379. https://doi.org/10.1371/journal.pone.0019379
Everett J & Jacobs S. Studies in Australian Stipa (Poaceae). Telopea. 1983;2:391-400.
https://doi.org/10.7751/telopea19834405
Falster D, Gallagher R, Wenk EH, Wright IJ, Indiarto D, Andrew SC et al. AusTraits, a curated plant trait database for the Australian flora. Scientific Data. 2021;8:254. https://doi.org/10.1038/s41597-021-01006-6
Farashi A & Karimian Z. Assessing climate change risks to the geographical distribution of grass species. Plant Signaling and Behavior. 2021;16(7):1913311. https://doi.org/10.1080/15592324.2021.1913311
Freitag H. The genus Stipa (Gramineae) in southwest and South Asia. Notes from the Royal Botanic Garden, Edinburgh. 1985;42:355-489.
Gao Sb, Mo Ld, Zhang Lh, Zhang Jl, Wu Jb, Wang Jl et al. Phenotypic plasticity vs. local adaptation in quantitative traits differences of Stipa grandis in semi-arid steppe, China. Scientific Reports. 2018;8:3148. https://doi.org/10.1038/s41598-018-21557-w
GBIF.org. Accessed on 21 January 2022.
Gonzalo R, Aedo C & García MA. Taxonomic revision of the Eurasian Stipa subsections Stipa and Tirsae (Poaceae). Systematic Botany. 2013;38:344-378. https://doi.org/10.1600/036364413X666615
Govaerts R, Lughadha EN, Black N, Turner R & Paton A. The World Checklist of Vascular Plants, a continuously updated resource for exploring global plant diversity. Scientific Data. 2021;8:215. https://doi.org/10.1038/s41597-021-00997-6
Hamasha HR, von Hagen KB & Roser M. Stipa (Poaceae) and allies in the Old World: molecular phylogenetics realigns genus circumscription and gives evidence on the origin of American and Australian lineages. Plant Systematics and Evolution. 2012;298:351-367. https://doi.org/10.1007/s00606-011-0549-5
Harris KM & Marsico TD. Digitizing specimens in a small herbarium: A viable workflow for collections working with limited resources. Applications in Plant Sciences. 2017;5(4):1600125. https://doi.org/10.3732/apps.1600125
Henry AG, Brooks AS & Piperno DR.
Microfossils in calculus demonstrate consumption of plants and cooked foods in Neanderthal diets (Shanidar III, Iraq; Spy I and II, Belgium). Proceedings of the National Academy of Sciences of the United States of America. 2011;108:486-491. https://doi.org/10.1073/pnas.1016868108
Hitchcock AS. The North American species of Stipa. The Contributions from the United States National Herbarium. 1925(A);24(7):215-262.
Hitchcock AS. Synopsis of the South American species of Stipa. The Contributions from the United States National Herbarium. 1925(B);24(7):215-289.
Hitchcock AS. Manual of the grasses of the United States. Washington DC: United States Department of Agriculture; 1951.
Hu Y, Zuo X, Yue P, Zhao S, Guo X, Li X et al. Increased Precipitation Shapes Relationship between Biochemical and Functional Traits of Stipa glareosa in Grass-Dominated Rather than Shrub-Dominated Community in a Desert Steppe. Plants. 2020;9(11):1463. https://doi.org/10.3390/plants9111463
Hu L, Yang R, Wang YH & Gong X. The natural hybridization between species Ligularia nelumbifolia and Cremanthodium stenoglossum (Senecioneae, Asteraceae) suggests underdeveloped reproductive isolation and ambiguous intergeneric boundary. AoB PLANTS. 2021;13(2):plab012. https://doi.org/10.1093/aobpla/plab012
Hughes DK. A revision of the Australian species of Stipa. Kew Bulletin. 1921;1921:1-30. https://doi.org/10.2307/4118194
Hughes DK. Further notes on the Australian species of Stipa. Kew Bulletin. 1922;1922:15-22. https://doi.org/10.2307/4118618
Jacobs BF, Kingston JD & Jacobs LL. The origin of grass-dominated ecosystems. Annals of the Missouri Botanical Garden. 1999;86:590-643. https://doi.org/10.2307/2666186
Jenczewski E & Alix K. From Diploids to Allopolyploids: The Emergence of Efficient Pairing Control Genes in Plants. Critical Reviews in Plant Sciences. 2004;23(1):21-45. https://doi.org/10.1080/07352680490273239
Johnson MG, Pokorny L, Dodsworth S, Botigue LR, Cowan RS, Devault A et al.
A Universal Probe Set for Targeted Sequencing of 353 Nuclear Genes from Any Flowering Plant Designed Using k-Medoids Clustering. Systematic Biology. 2019;68(4):594-606. https://doi.org/10.1093/sysbio/syy086
Karpechenko GD. Polyploid hybrids of Raphanus sativus × Brassica oleracea L. Bulletin of applied botany, of genetics and plant-breeding. 1927;17:305-408.
Kattge J, Bonisch G, Diaz S, Lavorel S, Prentice IC, Leadley P et al. TRY plant trait database - enhanced coverage and open access. Global Change Biology. 2020;26(1):119-188. https://doi.org/10.1111/gcb.14904
Kellogg EA. Subfamily Pooideae. In: Kubitzki K, editor. The families and genera of vascular plants. Berlin: Springer; 2015. p. 199-229. https://doi.org/10.1007/978-3-319-15332-2
Kilian A, Wenzl P, Huttner E, Carling J, Xia L, Blois H et al. Diversity arrays technology: a generic genome profiling technology on open platforms. Methods in Molecular Biology. 2012;888:67-89. https://doi.org/10.1007/978-1-61779-870-2_5
Klokov M & Osychnyuk V. Stipae Ucrainicae. Novosti Sistematiki Vysshik i nizshikh rastenii. 1976;1975:7-91.
Knobloch IW. Intergeneric hybridization in flowering plants. Taxon. 1972;21:97-103. https://doi.org/10.2307/1219229
Kotukhov YA. Konspekt kovylei (Stipa L.) i kovylechkov (Ptilagrostis Griseb.) vostochnogo Kazakhstana (Kazakhstanskii Altai, Zaisanskaya kotlovina i Prialtaiskie khrebty). Botaniceskie issledovanija Sibiri i Kazahstana. 2002;8:3-16.
Krawczyk K, Nobis M, Nowak A, Szczecińska M & Sawicki J. Phylogenetic implications of nuclear rRNA IGS variation in Stipa L. (Poaceae). Scientific Reports. 2017;7:11506. https://doi.org/10.1038/s41598-017-11804-x
Krawczyk K, Nobis M, Myszczyński K, Klichowska E & Sawicki J. Plastid superbarcodes as a tool for species discrimination in feather grasses (Poaceae: Stipa). Scientific Reports. 2018;8:1924. https://doi.org/10.1038/s41598-018-20399-w
Kuo PC & Sun YH. Stipa Linn. In: Kuo PC, editor. Flora Reipublicae Popularis Sinicae.
Beijing: Science Press; 1987. p. 268-287.
Lee-Thorp J, Likius A, Mackaye HT, Vignaud P, Sponheimer M & Brunet M. Isotopic evidence for an early shift to C4 resources by Pliocene hominins in Chad. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(50):20369-20372. https://doi.org/10.1073/pnas.1204209109
Linnaeus C. Species plantarum: exhibentes plantas rite cognitas, ad genera relatas, cum differentiis specificis, nominibus trivialibus, synonymis selectis, locis natalibus, secundum systema sexuale digestas. Stockholm: Laurentius Salvius; 1753. https://doi.org/10.5281/zenodo.3931989
Lorieul T, Pearson KD, Ellwood ER, Goeau H, Molino J-F & Sweeney PW et al. Toward a large-scale and deep phenological stage annotation of herbarium specimens: Case studies from temperate, tropical, and equatorial floras. Applications in Plant Sciences. 2019;7:e01233. https://doi.org/10.1002/aps3.1233
Lv X, Zhou G, Wang Y & Song X. Sensitive Indicators of Zonal Stipa Species to Changing Temperature and Precipitation in Inner Mongolia Grassland, China. Frontiers in Plant Science. 2016;7:73. https://doi.org/10.3389/fpls.2016.00073
Lv X, He Q & Zhou G. Contrasting responses of steppe Stipa ssp to warming and precipitation variability. Ecology and Evolution. 2019;9:9061-9075. https://doi.org/10.1002/ece3.5452
Maire R. Stipa. In: Maire R, editor. Flore de l'Afrique du Nord. Paris: Le Chevalier; 1953. p. 61-81.
Martinovsky JO. Zwei neue südeuropäische Federgrassippen IX. Beitrag zur Kenntnis der europäischen Stipa-Sippen. Feddes repertorium: Zeitschrift für botanische Taxonomie und Geobotanik. 1966;73:141-152.
Martinovsky JO. Neue submediterrane Stipa-Arten und die taxonomische Einteilung der Federgrassippen der Serie Pulcherrimae Martinovsky. Preslia. 1967;39:260-275.
Martinovsky JO. Über drei neue Stipa Sippen aus dem Verwandtschaftskreis Stipa joannis s. l. XXII. Beitrag zur Kenntnis der Stipa-Sippen.
Oesterreichische Botanische Zeitschrift: Gemeinnütziges Organ für Botanik und Botaniker, Gärtner, Oekonomen, Forstmänner, Aerzte, Apotheker und Techniker. 1970;118:171-181.
Martinovsky JO. Neue Stipa-Sippen und einige Ergänzungen der früher beschriebenen Stipa-taxa. Preslia. 1976;48:186-188.
Martinovsky JO. Stipa. In: Tutin TG, editor. Flora Europaea. Cambridge: Cambridge University Press; 1980. p. 247-252.
Matsuoka Y, Takumi S & Nasuda S. Genetic mechanisms of allopolyploid speciation through hybrid genome doubling: novel insights from wheat (Triticum and Aegilops) studies. International Review of Cell and Molecular Biology. 2014;309:199-258. https://doi.org/10.1016/b978-0-12-800255-1.00004-1
McAllister CA, McKain MR, Li M, Bookout B & Kellogg EA. Specimen-based analysis of morphology and the environment in ecologically dominant grasses: the power of the herbarium. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences. 2019;374:20170403. https://doi.org/10.1098/rstb.2017.0403
Moameri M, Jafari M, Tavili A, Motasharezadeh B, Zare Chahouki MA & Diaz FM. Investigating lead and zinc uptake and accumulation by Stipa hohenackeriana Trin and Rupr in field and pot experiments. Bioscience Journal. 2018;34:138-150. https://doi.org/10.14393/BJ-v34n1a2018-37238
Moraldo B. Il genere Stipa L. (Gramineae) in Italia. Webbia. 1986;40:203-278. https://doi.org/10.1080/00837792.1986.10670388
Nie B, Jiao BH, Ren LF, Gudkova PD, Chen WL & Zhang WH. Integrative taxonomy recognized a new cryptic species within Stipa grandis from loess plateau of China. Journal of Systematics and Evolution. 2020. https://doi.org/10.1111/jse.12714
Nobis M. Feather grasses (Stipa L.) of the Pamir Alai Mts. (Middle Asia): An outline for further studies. In: Frey L, editor. Grass Research. Krakow: Polish Academy of Sciences; 2009. p. 7-15.
Nobis M. Taxonomic revision of the Stipa lipskyi group (Poaceae: Stipa section Smirnovia) in the Pamir Alai and Tian-Shan Mountains.
Plant Systematics and Evolution. 2013;299:1307-1354. https://doi.org/10.1007/s00606-013-0799-5
Nobis M. Taxonomic revision of the central Asiatic Stipa tianschanica complex (Poaceae) with particular reference to the epidermal micromorphology of the lemma. Folia Geobotanica. 2014;49:283-308. https://doi.org/10.1007/s12224-013-9164-2
Nobis M & Gudkova PD. Taxonomic notes on feather grasses (Poaceae: Stipa) from eastern Kazakhstan with typification of seven names and one new combination. Phytotaxa. 2016;245:31-42. https://doi.org/10.11646/phytotaxa.245.1.3
Nobis M, Klichowska E, Nowak A, Gudkova PD & Rola K. Multivariate morphometric analysis of the Stipa turkestanica group (Poaceae). Plant Systematics and Evolution. 2016;302:137-153. https://doi.org/10.1007/s00606-015-1243-9
Nobis M, Nowak A, Nobis A, Nowak S, Zabicka J & Zabicki P. Stipa ×fallax (Poaceae: Pooideae: Stipeae), a new natural hybrid from Tajikistan, and a new combination in Stipa drobovii. Phytotaxa. 2017;303:141-154. https://doi.org/10.11646/phytotaxa.303.2.4
Nobis M, Gudkova PD, Baiakhmetov E, Zabicka J, Krawczyk K & Sawicki J. Hybridisation, introgression events and cryptic speciation in Stipa (Poaceae): a case study of the Stipa heptapotamica hybrid-complex. Perspectives in Plant Ecology, Evolution and Systematics. 2019;39:125457. https://doi.org/10.1016/j.ppees.2019.05.001
Nobis M, Gudkova P, Nowak A, Sawicki J & Nobis A. A revision of the genus Stipa (Poaceae) in middle Asia, including a key to species identification, an annotated checklist and phytogeographic analysis. Annals of the Missouri Botanical Garden. 2020;105:1-63. https://doi.org/10.3417/2019378
OECD/FAO. OECD-FAO Agricultural Outlook 2021-2030. Paris: OECD Publishing; 2021. https://doi.org/10.1787/19428846-en
Ogasawara O, Kodama Y, Mashima J, Kosuge T & Fujisawa T. DDBJ Database updates and computational infrastructure enhancement. Nucleic Acids Research. 2020;48:45-50.
https://doi.org/10.1093/nar/gkz982
Peterson PM, Romaschenko K, Soreng RJ & Valdes Reyna J. A key to the North American genera of Stipeae (Poaceae, Pooideae) with descriptions and taxonomic names for species of Eriocoma, Neotrinia, Oloptum, and five new genera: Barkworthia, ×Eriosella, Pseudoeriocoma, Ptilagrostiella, and Thorneochloa. PhytoKeys. 2019;126:89-125. https://doi.org/10.3897/phytokeys.126.34096
Powell C, Krakowiak A, Fuller R, Rylander E, Gillespie E, Krosnick S et al. Estimating herbarium specimen digitization rates: Accounting for human experience. Applications in Plant Sciences. 2021;9(4):e11415. https://doi.org/10.1002/aps3.11415
Reeb C, Kaandorp J, Jansson F, Puillandre N, Dubuisson J-Y, Cornette R et al. Quantification of complex modular architecture in plants. New Phytologist. 2018;218:859-872. https://doi.org/10.1111/nph.15045
Rocchetti GA, Armstrong CG, Abeli T, Orsenigo S, Jasper C, Joly S et al. Reversing extinction trends: New uses of (old) herbarium specimens to accelerate conservation action on threatened species. New Phytologist. 2021;230:433-450. https://doi.org/10.1111/nph.17133
Rodman PS & McHenry HM. Bioenergetics and the origin of hominid bipedalism. American Journal of Physical Anthropology. 1980;52:103-106. https://doi.org/10.1002/ajpa.1330520113
Romaschenko K, Peterson PM, Soreng RJ, Garcia-Jacas N, Futorna O & Susanna A. Systematics and evolution of the needle grasses (Poaceae: Pooideae: Stipeae) based on analysis of multiple chloroplast loci, ITS, and lemma micromorphology. Taxon. 2012;61:18-44. https://doi.org/10.1002/tax.611002
Roshevitz RY. Zlaki (Gramineae). In: Fedtschenko BA, editor. Flora Aziatskoi Rossii. Petrograd: Pereselencheskoe Upravlenie Ministerstva Zemledeliya; 1916. p. 107-191.
Roshevitz RY. Stipa novae Asiae centralis. Botanicheskie Materialy Gerbariya Glavnogo Botanicheskogo Sada RSFSR. 1920;1:1-4.
Roshevitz RY. Novye zlaki Zabaikalya. Izvestiya Glavnogo Botaniceskago Sada SSSR. 1929;28:379-380.
Roshevitz RY. Stipa L. In: Komarov VL, editor. Flora SSSR. Leningrad: Academy of Sciences of the Soviet Union; 1934.
Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC et al. Database resources of the national center for biotechnology information. Nucleic Acids Research. 2022;50(1):20-26. https://doi.org/10.1093/nar/gkab1112
Scholthof KBG, Irigoyen S, Catalan P & Mandadi KK. Brachypodium: A Monocot Grass Model Genus for Plant Biology. The Plant Cell. 2018;30(8):1673-1694. https://doi.org/10.1105/tpc.18.00083
Scholz H. Stipa L. In: Davis PH, Mill RR & Tan K, editors. Flora of Turkey and the East Aegean Islands. Edinburgh: University Press; 1985. p. 541-553.
Scholz H. Neue Taxa der Gattung Stipa sect. Stipa (Gramineae) aus dem Mittelmeergebiet. Willdenowia. 1989;19:127-132.
Scholz H. Stipa tunetana, eine neue Art aus Tunesien und das Stipa lagascae-Aggregat. Willdenowia. 1991;20:77-80.
Schubert M, Marcussen T, Meseguer AS & Fjellheim S. The grass subfamily Pooideae: Cretaceous-Palaeocene origin and climate-driven Cenozoic diversification. Global Ecology and Biogeography. 2019;28:1168-1182. https://doi.org/10.1111/geb.12923
Scrivanti LR & Anton AM. Spatial distribution of Poa scaberula (Poaceae) along the Andes. Heliyon. 2020;6(10):e05220. https://doi.org/10.1016/j.heliyon.2020.e05220
Sheidai M, Attaei S & Khosravi-Reineh M. Cytology of some Iranian Stipa (Poaceae) species and populations. Acta Botanica Croatica. 2006;65(1):1-11.
Simko I, Eujayl I & van Hintum TJ. Empirical evaluation of DArT, SNP, and SSR marker-systems for genotyping, clustering, and assigning sugar beet hybrid varieties into populations. Plant Science. 2012;184:54-62. https://doi.org/10.1016/j.plantsci.2011.12.009
Smarda P, Knápek O, Brezinová A, Horová L, Grulich V, Danihelka J et al. Genome sizes and genomic guanine + cytosine (GC) contents of the Czech vascular flora with new estimates for 1700 species. Preslia. 2019;91:117-142.
https://doi.org/10.23855/preslia.2019.117
Smirnov PA. Die neuen russischen Stipa-Pennata-Arten. Repertorium specierum novarum regni vegetabilis. 1925;21:223-233. https://doi.org/10.1002/fedr.19250210806
Smirnov PA. Cem XVII. Gramineae Juss. Trudy Glavnogo Botanicheskogo Sada. Acta Horti Petropolitani. 1928;40:115.
Smirnov PA. Kluch k opredeleniu kovylei SSSR. Uchenye zapiski. Moskovskii gosudarstvennyi universitet. 1934;2:331-338.
Smirnov PA. Stiparum Armeniae minus cognitarum descriptiones. Byulleten' Moskovskogo Obshchestva Ispytatelei Prirody. Otdel Biologicheskii. 1970;75:113-135.
Smith MW, Gultzow DL & Newman TK. First Fruiting Intergeneric Hybrids between Citrus and Citropsis. Journal of the American Society for Horticultural Science. 2013;138(1):57-63. https://doi.org/10.21273/JASHS.138.1.57
Soltis PS. Digitization of herbaria enables novel research. American Journal of Botany. 2017;104(9):1281-1284. https://doi.org/10.3732/ajb.1700281
Song X, Zhou G, Xu Z, Lv X & Wang Y. A self-photoprotection mechanism helps Stipa baicalensis adapt to future climate change. Scientific Reports. 2016;6:25839. https://doi.org/10.1038/srep25839
Spegazzini C. Stipeae platenses. Anales del Museo Nacional de Montevideo. 1901;19:1-56. https://doi.org/10.5962/bhl.title.15447
Spegazzini C. Stipeae platenses novae v. criticae. Revista Argentina de Botánica. 1925;1:9-51.
Sponheimer M & Lee-Thorp JA. Isotopic Evidence for the Diet of an Early Hominid, Australopithecus africanus. Science. 1999;283(5400):368-370. https://doi.org/10.1126/science.283.5400.368
Straub SCK, Parks M, Weitemier K, Fishbein M, Cronn RC & Liston A. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics. American Journal of Botany. 2012;99:349-364. https://doi.org/10.3732/ajb.1100335
Strid A. Stipa L. In: Strid A & Tan K, editors. Mountain flora of Greece vol. 2. Edinburgh: University Press; 1991. p. 825-830.
Tavşanoğlu C & Pausas J.
A functional trait database for Mediterranean Basin plants. Scientific Data. 2018;5:180135. https://doi.org/10.1038/sdata.2018.135
Tegelberg R, Mononen T & Saarenmaa H. High-performance digitization of natural history collections: Automated imaging lines for herbarium and insect specimens. Taxon. 2014;63:1307-1313. https://doi.org/10.12705/636.13
Tkach N, Nobis M, Schneider J, Becher H, Winterfeld G, Jacobs SWL et al. Molecular Phylogenetics and Micromorphology of Australasian Stipeae (Poaceae, Subfamily Pooideae), and the Interrelation of Whole-Genome Duplication and Evolutionary Radiations in This Grass Tribe. Frontiers in Plant Science. 2021;11:630788. https://doi.org/10.3389/fpls.2020.630788
Trinius CB. Stipa L. In: von Ledebur KF, editor. Flora Altaica vol. 1. Berlin: Typis et impensis G. Reimeri; 1829. p. 80-84.
Trinius CB. Graminum genera quaedam species que complures descriptionibus illustrata. Memoirs of the Imperial Academy of Sciences in St. Petersburg. 1831;1:74.
Trinius CB & Ruprecht FJ. Species Graminum Stipaceorum. St. Petersburg: Typis Academiae Imperialis Scientiarum; 1842.
Tzvelev NN. Zlaki (Gramineae). In: Grubov VI, editor. Rastieniya Centralnoi Azii. Po Materialam Botanicheskogo Instituta im. VL Komarova (Plantae Asiae Centralis, Secus Materies Instituti Botanici Nomine VL Komarovii). Leningrad: Nauka; 1968.
Tzvelev NN. Notulae de tribu Stipeae Dum. (fam. Poaceae) in URSS. Novosti Sistematiki Vyssih Rastenij. 1974;11:4-20.
Tzvelev NN. Grasses of the Soviet Union. Leningrad: Nauka; 1976.
Tzvelev NN. Notes on the tribe Stipeae Dumort. (Poaceae). Novosti Sistematiki Vysshikh Rastenii. 2012;43:20-9.
Tzvelev NN. Some notes on the grasses (Poaceae) of the Caucasus. Botanicheskii Zhurnal. 1993;78:83-95.
Vallejo-Marín M & Hiscock SJ. Hybridization and hybrid speciation under global change. New Phytologist. 2016;211(4):1170-87. https://doi.org/10.1111/nph.14004
Vázquez FM & Devesa JA. Revisión del género Stipa L. y Nassella Desv.
(Poaceae) en la Península Ibérica e Islas Baleares. Acta Botánica Malacitana. 1996;21:125-189. https://doi.org/10.24310/abm.v21i0.8674
Vázquez FM & Devesa JA. Two new species and combinations of Stipa L. (Gramineae) from northwest Africa. Botanical Journal of the Linnean Society. 1997;124(2):201-209. https://doi.org/10.1111/j.1095-8339.1997.tb01790.x
Vázquez FM & Gutiérrez M. Classification of species of Stipa with awns having plumose distal segments. Telopea. 2011;13:155-176. https://doi.org/10.7751/telopea20116012
Vickery JW. Contributions to the taxonomy of Australian grasses. Contributions from the New South Wales National Herbarium. 1951;1(6):322-345.
Vickery JW, Jacobs SWL & Everett J. Taxonomic studies in Stipa (Poaceae) in Australia. Telopea. 1986;3:1-132. https://doi.org/10.7751/TELOPEA19864701
Weaver WN, Ng J & Laport RG. LeafMachine: Using machine learning to automate leaf trait extraction from digitized herbarium specimens. Applications in Plant Sciences. 2020;8(6):e11367. https://doi.org/10.1002/aps3.11367
Wieczorek J, Bloom D, Guralnick R, Blum S, Doering M, Giovanni R et al. Darwin Core: An evolving community developed biodiversity data standard. PLOS One. 2012;7:e29715. https://doi.org/10.1371/journal.pone.0029715
Wu ZL & Phillips SM. Tribe Stipeae. In: Wu ZY, Raven PH & Hong DY, editors. Flora of China vol. 22 (Poaceae). Beijing: Science Press; 2006. p. 188-212.
Wu JB, Chen CB, Bao XY, Song WQ, Zhao NX & Gao YB. Chromosome Numbers and Karyotypes of Stipa baicalensis, Stipa grandis and Stipa krylovii in Inner-mongolia Steppe. Bulletin of Botanical Research. 2009;29(5):534-538.
Yakimowski SB & Rieseberg LH. The role of homoploid hybridization in evolution: a century of studies synthesizing genetics and ecology. American Journal of Botany. 2014;101:1247-1259. https://doi.org/10.3732/ajb.1400201
Yanez A, Desta I, Commins P, Magzoub M & Naumov P. Morphokinematics of the Hygroactuation of Feather Grass Awns. Advanced Biosystems. 2018;2:1800007.
https://doi.org/10.1002/adbi.201800007
Yang Y, Li X, Kong X, Ma L, Hu X & Yang Y. Transcriptome analysis reveals diversified adaptation of Stipa purpurea along a drought gradient on the Tibetan Plateau. Functional & Integrative Genomics. 2015;15:295-307. https://doi.org/10.1007/s10142-014-0419-7
Zhang X, Fan B, Yu Z, Nie L, Zhao Y, Yu X et al. Functional Analysis of Three miRNAs in Agropyron mongolicum Keng under Drought Stress. Agronomy. 2019;9:661. https://doi.org/10.3390/agronomy9100661
Zhao H, Guo K, Yang Y, Liu C, Zhao L, Qiao X et al. Stipa steppes in scantily explored regions of the Tibetan Plateau: classification, community characteristics and climatic distribution patterns. Journal of Plant Ecology. 2018;11(4):585-594. https://doi.org/10.1093/jpe/rtx029
Zhu YJ, Qiao XG, Guo K, Xu R & Zhao LQ. Distribution, community characteristics, and classification of Stipa tianschanica var. gobica steppe in China. Chinese Journal of Plant Ecology. 2018;7:785-792. https://doi.org/10.17521/cjpe.2017.0314
Zietkiewicz E, Rafalski A & Labuda D. Genome fingerprinting by simple sequence repeat (SSR)-anchored polymerase chain reaction amplification. Genomics. 1994;20:176-183. https://doi.org/10.1006/geno.1994.1151