Integrated phylogenomic analyses unveil reticulate evolution in Parthenocissus (Vitaceae), highlighting speciation dynamics in the Himalayan – Hengduan Mountains

lineage in the Himalayan – Heng-duan Mountains (HHM) region showing perplexing phylogenetic relationships, provides an opportunity for investigating speciation dynamics based on integrated evidence. (cid:1) We investigated phylogenetic discordance and reticulate evolution in Parthenocissus based on rigorous analyses of plastome and transcriptome data. We focused on reticulations in the trifoliolate lineage in the HHM region using a population-level genome resequencing dataset, incorporating evidence from morphology, distribution, and elevation. (cid:1) Comprehensive analyses conﬁrmed multiple introgressions within Parthenocissus in a robust temporal – spatial framework. Around the HHM region, at least three hybridization hot spots were identiﬁed, one of which showed evidence of ongoing speciation reversal. (cid:1) We present a solid case study using an integrative methodological approach to investigate reticulate evolutionary history and its underlying mechanisms in plants. It demonstrates an example of speciation reversal through frequent hybridizations in the HHM region, which provides new perspectives on speciation dynamics in mountainous areas with strong topographic and environmental heterogeneity.


Introduction
Hybridization has been recognized as an important process during biological evolution and is particularly common in plants, creating reticulate structures in the Tree of Life (Rieseberg & Wendel, 1993;Soltis & Soltis, 2009;Abbott et al., 2013;Payseur & Rieseberg, 2016).Interspecific hybridization can lead to either speciation or speciation reversal (Abbott et al., 2013;Wu et al., 2022).If reproductive isolation is newly established, speciation may occur through allopolyploidization or homoploid hybrid speciation and thus promoting biodiversity (Abbott et al., 2013;Wu et al., 2022).However, when reproductive isolation is weak or eroded by secondary contact, parental lineages can be replaced by proliferated hybrids, leading to genetic swamping and eventually, speciation reversal (Seehausen, 2006;Seehausen et al., 2008;Todesco et al., 2016;Zhang et al., 2019).Differing from most introgressions that typically form narrow hybrid zones, speciation reversal usually results in hybrid lineages with mosaic genomes replacing their distinct parental lineages and is recognized as a cause of biodiversity decline (Seehausen, 2006;Seehausen et al., 2008;Todesco et al., 2016).Previous studies suggested speciation reversal can be induced by fluctuating environments (Gilman & Behm, 2011;Zhang et al., 2019) and is more likely to occur in regions with complex topography where the effects of climatic and elevational shifts are accentuated.
The Himalayan-Hengduan Mountains (HHM) region is wellknown as a natural laboratory for investigating the processes of rapid allopatric and ecological speciation (Wen et al., 2014;Favre et al., 2015;Meng et al., 2017;Manish & Pandit, 2018;Niu et al., 2018).Continuous geological activities and climatic oscillations in this region have facilitated secondary contact and increased the chances of hybridization (Liu et al., 2013;Wen et al., 2014;Jiang et al., 2018;Dong et al., 2021).While numerous examples have shown evidence for speciation and even rapid diversification through hybridization in the HHM (Gao et al., 2012;Ru et al., 2018;J. L. Li et al., 2020), few cases of speciation reversal have been reported (Wu et al., 2022).Lineages in the HHM region showing complex diversification processes driven by antagonistic forces are ideal to investigate the role of geographic and climatic changes in reticulate evolution and can enhance our understanding of speciation dynamics in areas with strong topographic heterogeneity.
Parthenocissus Planch. is a monophyletic genus in the grape family (Vitaceae) with nine recognized species from Asia (Chen et al., 2007) and three from North America (Moore & Wen, 2016), reflecting the well-known Asian-North American disjunct distribution pattern (Fig. S1) (Chen & Manchester, 2007;Wen et al., 2007;Nie et al., 2010;Lu et al., 2012Lu et al., , 2018;;Hu et al., 2022).Three lineages within the Asian clade are recognized, namely the henryana, dalzielii, and trifoliolate lineages (Fig. S1a).However, relationships among and/or within the three lineages and the North American clade have confounded researchers due to cytonuclear discordance, especially in the trifoliolate lineage (Lu et al., 2012;Ma et al., 2021).The four species of the trifoliolate lineage (P.chinensis, P. feddei, P. heterophylla, and P. semicordata) co-occur within the HHM region, occupying an altitudinal range from 500 to 3800 m (Chen et al., 2007), with P. heterophylla extending to India and Indonesia (Fig. S1b).Species in the trifoliolate lineage are morphologically similar, particularly P. semicordata and P. chinensis.Parthenocissus chinensis was once recognized as a narrow endemic to a dry habitat area in the upper Yangtze River valley, but molecular evidence suggests more widely distributed individuals formerly identified as P. semicordata form a strongly supported clade with P. chinensis (Nie et al., 2010).No consistent morphological characters can distinguish individuals of the P. chinensis clade from those of the P. semicordata clade, with only minor size differences in leaflets and inflorescences (Chen et al., 2007;Lu et al., 2012).Despite their morphological similarity, phylogenetic analyses have revealed that P. semicordata and P. chinensis are well-diverged lineages, resolving P. semicordata as sister to P. heterophylla, rather than to P. chinensis (Nie et al., 2010).However, P. heterophylla is a distinct species based on morphological, geographical, and molecular evidence, differing from P. chinensis and P. semicordata in possessing larger inflorescences and more obvious appendages on the inner side of petals (Lu et al., 2012).We therefore suggest that gene flow, and potentially ongoing speciation reversal, may exist between P. semicordata and P. chinensis.The plastid capture hypothesis should also be considered; that is, the similarity between P. chinensis and P. semicordata may result from their recent divergence, and multiple plastid capture events might have caused cytonuclear discordance in the trifoliolate lineage.These alternative scenarios can be tested by investigating the reticulate evolutionary history of the trifoliolate lineage.
Based on phylogenetic signals left by past hybridization, it is possible to restore complex reticulate evolutionary histories involving speciation and speciation reversal.However, it remains challenging to reconstruct a robust reticulate evolutionary history of lineages enduring frequent hybridizations and rapid diversification.Besides hybridization, gene tree estimation error and incomplete lineage sorting (ILS) also can cause phylogenetic discordance (Morales-Briones et al., 2018;Cai et al., 2021).Gene tree estimation error can be quantified by multiple approaches (Budenhagen et al., 2016;Arcila et al., 2017;Cai et al., 2021).However, species network inference methods that tease apart the interplay between ILS and hybridization (e.g.Mirarab et al., 2014;Yu et al., 2014;Yu & Nakhleh, 2015;Sol ıs-Lemus et al., 2017) sometimes generate inaccurate results due to limited numbers of taxa and/or complex reticulation events (Hejase & Liu, 2016;Blair & An e, 2020).In such cases, rigorous statistical analyses, such as the ABBA-BABA test (Green et al., 2010), can help determine gene flow (e.g. Rose et al., 2021).Recently, multiple empirical studies of animals have illustrated that comprehensive approaches integrating population-level analyses, morphology, and distribution are effective in reliably elucidating reticulations (e.g.Andersen et al., 2021;Ferreira et al., 2021;Pav on-V azquez et al., 2021).However, such integrative studies of reticulate evolution in plants are rare, even though plants tend to show more complex evolutionary histories than animals due to frequent hybridizations (Barraclough & Nee, 2001;Linder & Rieseberg, 2004;Soltis & Soltis, 2009).Therefore, additional studies of plant lineages, applying a comprehensive analytical approach, are required to build robust phylogenetic networks revealing complex reticulate evolution.
In this study, we use Parthenocissus as an exemplar taxon to test how geographic and climatic changes facilitated reticulate evolution with a comprehensive set of methodological approaches combining phylogenetic reconstruction, phylogenetic conflict assessment, gene flow inference, biogeographic reconstruction, and population structure analysis.We focus on demonstrating a reticulate evolutionary history of Parthenocissus within a robust temporal-spatial framework by employing transcriptome orthologs and complete plastomes of all 12 Parthenocissus species.We then investigate speciation dynamics and underlying mechanisms in the HHM region using a population-level genome resequencing dataset including 212 individuals from 37 populations of the trifoliolate lineage of Parthenocissus.We specifically test the following two conditions to support potential speciation reversal in P. chinensis and P. semicordata: (1) P. chinensis and P. semicordata are deeply diverged lineages and (2) hybrids with admixed genomes have replaced or are replacing their parents.We also demonstrate that plastid capture cannot fully explain cytonuclear discordance in the trifoliolate lineage.

Taxon sampling
Taxon sampling was designed to resolve phylogenetic relationships among major lineages of Parthenocissus with plastome and transcriptome data, and investigate the evolutionary history of the trifoliolate lineage with genome resequencing data.For plastome sequencing, we sampled 14 individuals representing 12 species of Parthenocissus and one sister taxon (Yua austroorientalis (F.P. Metcalf) C.L. Li) as outgroup (Table S1).The same individuals of Parthenocissus were used for transcriptome sequencing but with Y. thomsonii (M.A. Lawson) C.L. Li as outgroup (Table S1) as the attempts at transcriptome sequencing of Y. austro-orientalis failed.To make use of reliable calibration points in Vitaceae, we included 27 plastomes representing other clades of Vitaceae to generate a family-level dataset including 40 plastomes (hereinafter referred to as the 40-plastome dataset) and downloaded transcriptomes of two Vitis species (Table S2).
We collected a total of 212 individuals from 37 populations of the trifoliolate lineage throughout the geographic distribution ranges of P. chinensis C.L. Li, P. feddei (H.L ev.) C.L. Li, P. heterophylla (Blume) Merr., and P. semicordata (Wall.)Planch.(Table S3; populations of the four species were marked as CH, FE, HE, and SE, respectively).Voucher specimens were deposited at the Herbarium of Institute of Botany, Chinese Academy of Sciences (PE; herbaria codes follow Thiers, 2017).

Plastome sequencing, assembly, and annotation
Library construction and sequencing were based on DNA extracted from silica gel-dried leaves.Paired-end reads of 150 bp for all samples were generated in a single lane on an Illumina HiSeq2500 sequencer.For each newly sequenced sample (including samples representing species of Parthenocissus and Yua and the population samples), we filtered raw bases using TRIMMO-MATIC 0.3.2(Bolger et al., 2014) with the following settings: SLI-DINGWINDOW:4:20 LEADING:10 TRAILING:10 MINLEN:70.We assembled plastomes with filtered data using both de novo assembly method and a reference genome of Vitis vinifera L. (NC007957; Jansen et al., 2006).We mapped quality-controlled reads against the reference genome in GEN-EIOUS 9.1.4(Kearse et al., 2012) with medium-low sensitivity using five iterations.Reads were exported and plastomes were de novo assembled using VELVET 7.0.4(Zerbino & Birney, 2008) with auto-adjusted coverage cutoffs and k-mer sizes from 97 to 145.Final plastome scaffolds were manually adjusted to eliminate errors and ambiguities.We annotated plastomes of Parthenocissus and Yua using DOGMA (Wyman et al., 2004) and GENEIOUS, with the plastome of V. vinifera as a reference.

Transcriptome sequencing, assembly, and ortholog inference
Fresh leaves of 13 Parthenocissus individuals whose plastomes were sequenced and one individual of Yua thomsonii were used for RNA extraction, library construction, and sequencing (individuals used are shown in Table S1).For all samples, 150 bp paired-end reads were generated in a single lane on an Illumina HiSeq2500 sequencer.The raw bases were filtered using TRIMMO-MATIC with the same settings above.Transcriptome for each sample was de novo assembled using TRINITY 2.5.1 (Grabherr et al., 2011;Haas et al., 2013).We used TRANSDECODER 5.0.0 (Haas et al., 2013) to predict peptide sequences longer than 100 amino acids with an open read frame.Nonredundant, representative sequences were retained by CD-HIT 4.6.5 (Li & Godzik, 2006) with a threshold value of 0.98.Finally, 2587 singlecopy orthologs of the 13 species (the individual of P. heterophylla from India was excluded to keep one individual per species) were identified with ORTHOFINDER 2.2.6 (Emms & Kelly, 2015) based on the output of CD-HIT (hereinafter referred to as 13taxa-2587nu dataset).
A phylogenetic tree based on the 40-plastome dataset was constructed using IQ-TREE 2.1.2(Nguyen et al., 2015) with 5000 ultrafast bootstrap replicates (Hoang et al., 2018) under edge-unlinked partitioned model (Chernomor et al., 2016), where the best model scheme was selected by MODELFINDER (Kalyaanamoorthy et al., 2017).Both concatenation and coalescent-based methods were used to reconstruct phylogenetic relationships of Parthenocissus for the nuclear data.We used IQ-TREE to construct the phylogeny of Parthenocissus based on the concatenated 13taxa-2587nu dataset with 5000 ultrafast bootstrap replicates, and the best model was selected by MODELFIN-DER.We constructed gene trees of each ortholog with maximum likelihood (ML), Bayesian inference (BI), and maximum parsimony (MP) approaches.Maximum likelihood gene trees were constructed using IQ-TREE with the best model selected by MODELFINDER and 100 bootstrap replicates.Bayesian inference gene trees were constructed by MRBAYES 3.2.7a(Ronquist et al., 2012), and the best model for each gene was selected by MRMODELTEST2 (Nylander, 2004) using the AIC metric.Maximum parsimony gene trees were constructed by PAUP* v.4.0a163 (Swofford, 2002) with 1000 bootstrap replicates using heuristic search.We then generated multispecies coalescent (MSC) trees in ASTRAL 5.6.3(Mirarab et al., 2014) with the ML, BI, and MP gene trees, respectively, as input.As gene trees with poorly Ó 2022 The Authors New Phytologist Ó 2022 New Phytologist Foundation.This article has been partially contributed to by US Government employees and their work is in the public domain in the USA.
New Phytologist (2022) www.newphytologist.comsupported clades can mislead MSC tree-based methods (Zhang et al., 2018), we collapsed nodes of gene tree with < 10%, 50%, or 75% bootstrap support (BS) to determine the best cutoff for ASTRAL species tree inference.We also used SVDQUARTETS (Chifman & Kubatko, 2014) implemented in PAUP* to reconstruct phylogeny of Parthenocissus with the concatenated 13taxa-2587nu alignment.All possible quartets were evaluated, and trees were selected using the QUARTET FM (QFM) method with 1000 bootstrap replicates.

Phylogenetic conflict assessment
We used PHYPARTS (Smith et al., 2015) to calculate the number of conflicting and concordant bipartitions and the 'internode certainty all' (ICA) values representing the degree of certainty for internodes considering the frequency of the bipartition (Salichos et al., 2014).We also calculated gene concordance factors (gCF) using IQ-TREE, with the ASTRAL species tree as the mapping tree for ML, BI, and MP gene trees.As gene trees constructed with different methods showed similar phylogenetic conflicts, we only used the ML gene trees in subsequent analyses.We compared the topologies of 2587 rooted ML gene trees, and the 10 most frequently occurring topologies were documented.To investigate conflicts within Parthenocissus, we also compared the topologies of 2587 unrooted single gene trees.We assessed all three possible topologies for the four species in the trifoliolate lineage.SPLIT-STREE 4.13.1 (Huson & Bryant, 2006) was used to visualize conflicts within Parthenocissus among 2587 rooted single gene trees.We collapsed all nodes below the best BS cutoff to avoid overestimating conflicts among single gene trees in SPLITSTREE.

Tests for the effects of ILS and gene tree estimation error on gene tree conflicts
We investigated whether the observed gene tree discordance could be explained by ILS or gene tree estimation error as described by Cai et al. (2021).We carried out coalescent simulations to test whether ILS alone can explain gene tree discordance.The population mutation parameter 'theta' of each internal branch was used to estimate the level of ILS, which was calculated by dividing branch length in mutation units by length in coalescent units.We also simulated 2587 gene trees under the MSC model with the R package PHYBASE 1.4 (Liu & Yu, 2010).Thereafter, the distributions of Robinson-Foulds distances (Robinson & Foulds, 1981) between the species tree and the simulated gene trees, and those between the species tree and the empirical gene trees, were compared.The effect of gene tree estimation error was quantified by bipartition information of the species tree based on simulated gene trees generated by RAXML v.8.2.12 (Stamatakis, 2014) with the '-f b' option, which shows how often each node of the species tree is recovered by the simulated gene trees.

Gene flow inference
To construct a framework of reticulate evolution in Parthenocissus, and to test whether plastid capture events left traces in the genome, we identified gene flow using a phylogenetic network inference and the ABBA-BABA test.We first used the 13taxa-2587nu dataset (Table S4) to infer the optimal network in Parthenocissus with Species Networks applying Quartets (SNAQ; Sol ıs-Lemus et al., 2017) implemented in PHYLONETWORKS (https://github.com/crsl4/PhyloNetworks.j1), by calculating the maximum pseudolikelihood of a network from four-taxon concordance factors (CFs).ASTRAL MSC species tree was selected as the starting tree.Concordance factor tables were generated with ML gene trees (An e et al., 2007;Larget et al., 2010) following the Tree Incongruence Checking in R (TICR) pipeline (Stenz et al., 2015).We ran SNAQ with possible maximum hybrid node number (hmax) from 0 to 10 for 100 searches to determine the optimal hmax based on the slope of a plot of Àlogplik against maximum hybridization number (Sol ıs-Lemus & An e, 2016).The network with the optimal hmax was selected as the starting network for bootstrap analysis with 100 replicates and 100 SNAQ searches.Phylogenetic networks for different subsets were estimated as described above, but with the hmax varying from 0 to 5. To test whether outgroup selection affected the network inference, each subset was analyzed with different outgroups (Table S4).
ABBA-BABA tests were conducted to identify gene flow between and within clades using the 'CalcD' function in the R package EVOBIR 1.3 (Blackmon & Adams, 2015).We tested all 102 possible combinations of three species from each of the trifoliolate, dalzielii, henryana lineages, and the North American clade, with Y. thomsonii as the outgroup.In addition, we used 17 quartets representing all possible combinations of three species within each lineage-subset with different outgroups, to test introgression within each lineage.The taxon compositions of all the quartets are shown in Table S5.We conducted the ABBA-BABA test using gene alignments obtained by concatenating and pruning the alignments of the subset datasets and the 13taxa-2587nu dataset.To obtain a Z-score for the null hypothesis of no introgression, 1000 bootstrap replicates were conducted in each test.Z-scores statistically deviating from the null expectation (Z > 3) indicate introgression events within the quartet.P-values were corrected with the Benjamini-Hochberg method (Benjamini & Hochberg, 1995) to control multiple comparisons.

Divergence time estimation
To investigate the divergence history of Parthenocissus, especially of the trifoliolate lineages, divergence time estimation was conducted with the 40-plastome dataset and 15taxa-1101nu dataset using MCMCTREE in the PAML 4.9j package (Yang, 2007).For the plastome dataset, Leea indica (Burm.f.) Merr.was used as the outgroup.A secondary calibration point was applied to constrain the stem age of Vitaceae, and two fossils were assigned to internal nodes (see stars in Fig. S2).Calibration strategies are detailed in Methods S1.MCMCTREE was run with the following settings: birth-death model, correlated rates, and HKY85 substitution model with alpha = 0.5.We combined the results of two independent MCMC chains, where samples were drawn every 10 000 generations with the first 20% iterations discarded as burnin, New Phytologist (2022) www.newphytologist.comÓ 2022 The Authors New Phytologist Ó 2022 New Phytologist Foundation.This article has been partially contributed to by US Government employees and their work is in the public domain in the USA.
until effective sample size (ESS) > 200 for all parameters.We also estimated divergence time using 15taxa-1101nu dataset partitioned by codon position (first/second/third), with calibration points in Methods S1 (see stars in Fig. S3), and the same settings of MCMCTREE.We also ran MCMCTREE without sequence data and compared marginal prior distributions (e.g.distribution results from prior interactions) to posterior distributions to check prior interactions (Fig. S4).

Ancestral area reconstruction
We used the Bayesian Binary MCMC (BBM) method in RASP 4.1 (Yu et al., 2015) to estimate the ancestral areas of Parthenocissus with the topologies based on 40-plastome dataset and 15taxa-1101nu dataset (Table S4).Four geographic regions representing the current distribution of Parthenocissus were defined: A, Qinghai-Tibet Plateau (QTP); B, East Asia (including eastern China, Japan, and the Korean Peninsula); C, Southeast Asia (including Indochina, Java, and the Indian subcontinent); and D, North America.We set the root distribution to null, applied 10 MCMC chains with the F81 + G model, and sampled the posterior distribution every 1000 generations discarding the first 20% as burnin, until ESS for all parameters > 200.
Distribution data for Parthenocissus were mainly assembled based on specimens from the Chinese Virtual Herbarium (http:// www.cvh.org.cn),Global Biodiversity Information Facility (GBIF.org., 2016), and other herbarium specimens we examined.We confirmed the identification of individual specimens and assessed the altitudinal range of species in the trifoliolate lineage to evaluate whether topography was involved in species diversification in the HHM region.The strategy for retaining natural distribution records and ensuring each species' distribution range is further described in Methods S2.

Genetic structure and phylogenetic analyses of the trifoliolate lineage
We used a population-level dataset to investigate the hybridization zones of the trifoliolate lineage species.Coding sequences of 212 individuals from 37 populations were extracted by HYBPIPER 1.3.1 (Johnson et al., 2016) with the 2587 transcriptome orthologs as references and were aligned by MAFFT.For each gene alignment, sites with missing data > 50% were excluded.Then, we extracted SNPs from the remained 1091 ortholog alignments with SNP-SITES (Page et al., 2016) and conducted linkage disequilibrium (LD) pruning using PLINK 1.9 (Purcell et al., 2007) to filter out highly correlated SNPs for genetic structure analysis, with option '--indep-pairwise' and parameters '200 100 0.2'.We identified genetic subgroups with a Bayesian analysis in STRUC-TURE 2.3.4 (Pritchard et al., 2000) using the admixture model with K = 2-8 and chose the optimal K value using STRUCTURE HARVESTER (Earl & VonHoldt, 2012) based on the maximum Delta K value (Evanno et al., 2005).To statistically evaluate the spatial concentration of gene flow, we performed Pearson correlation analyses between introgression intensity and geographic distance for three species pairs using the R package HMIST 4.6-0 (Harrell, 2021).Introgression intensity was denoted by the mean f d statistic between population pairs calculated by 'ABBABABAwindows.py'script (Martin et al., 2015).We obtained the population MSC tree with ASTRAL using 1021 gene trees constructed by IQ-TREE.We also constructed an ML population tree in IQ-TREE with a plastome dataset including 212 individuals.

Phylogenetic relationships based on the plastome and transcriptome data
To resolve phylogenetic relationships among species of Parthenocissus, 13 whole plastomes of Parthenocissus were assembled (Table S1).Phylogenetic analyses based on plastomes showed four well-supported clades within Parthenocissus (all with BS = 100%; Fig. 1a).The North American clade of three species formed a sister group to the Asian clade (Fig. 1a).Within the Asian clade, three well-supported lineages were identified (Fig. 1a): the henryana lineage with 5-foliolate leaves including P. henryana and P. laetevirens; the dalzielii lineage with both simple and trifoliolate leaves including P. dalzielii, P. tricuspidata, and P. suberosa; and the trifoliolate lineage with trifoliolate leaves including P. chinensis, P. feddei, P. heterophylla, and P. semicordata.Moreover, relationships among these three Asian lineages were well-resolved, with the henryana lineage diverging first and the dalzielii and trifoliolate lineages forming sister-groups.
The ML and SVDQUARTETS methods conducted on the 13taxa-2587nu dataset yielded well-supported and identical topologies.We also obtained identical ASTRAL MSC species trees with high supporting values regardless of the BS cutoff, but we selected the species tree constructed from gene trees with node BS values ≥ 10% for subsequent analysis to reduce potential noises, as suggested by Zhang et al. (2018).Well-supported cytonuclear phylogenetic incongruences were detected within the North American clade and the trifoliolate lineage (Fig. 1).Within the North American clade, the plastome data supported a closer relationship between P. vitacea and P. quinquefolia, while the transcriptome data placed P. vitacea together with P. heptaphylla (Fig. 1).Within the trifoliolate lineage, the plastome data supported P. heterophylla as sister to P. chinensis, while the transcriptome data identified a closer relationship between P. chinensis and P. semicordata (Fig. 1).

Topological conflicts among transcriptome orthologs
The gene tree topologies of 2587 single-copy orthologs were highly variable (Figs S5, S6; Table S6).In the trifoliolate lineage, most gene trees were congruent with the transcriptome phylogeny.Topology supporting P. chinensis and P. heterophylla as sister groups is secondarily dominant (Fig. S7).
SPLITSTREE detected conflicts from deep to shallow nodes of Parthenocissus, and the trifoliolate lineage showed particularly significant conflicts (Fig. S8).Gene trees constructed with different methods showed similar patterns of low ICA value and gene Ó 2022 The Authors New Phytologist Ó 2022 New Phytologist Foundation.This article has been partially contributed to by US Government employees and their work is in the public domain in the USA.

Effects of ILS and gene tree estimation on conflicts and identification of reticulation events
The difference between distribution of Robinson-Foulds distances calculated with simulated and empirical gene trees indicates that, in addition to ILS, hybridization can also explain gene tree discordance (Fig. S10).Theta, calculated from the ML tree and the MSC species tree, showed weak signals of ILS for most branches, except for the stem of the Asian clade with a short internal branch (Fig. 1b).The stem of the Asian clade was also affected by gene tree estimation error, as suggested by the low recovery of simulated gene trees with this lineage (Fig. 1b).
SNAQ runs performed on the whole genus suggested the optimal value of hmax = 4, but only three hybridization events have BS > 10% (Figs 2, S11a).The highest supported introgression was detected within the North American clade between P. vitacea and P. quinquefolia (BS = 46%; Fig. 2a).Introgression between P. suberosa and the common ancestor of the dazielii lineage and minor gene flow from P. chinensis to P. feddei were also detected.SNAQ searches conducted on the subsets obtained bettersupported results 3).Two introgression events between (or within) the dalzielii and trifoliolate lineages were identified (Fig. 2b): (1) gene flow from P. suberosa to the common ancestor of P. heterophylla-P.semicordata-P.chinensis (BS = 95%) and (2) gene flow from P. heterophylla to P. semicordata (BS = 100%).When the henryana lineage was used as an outgroup, no gene flow with BS > 10% was identified (Fig. 2c).The introgression between P. heptaphylla and the ancestor of the North American clade was identified despite using different outgroups (Fig. 2d-f).In the trifoliolate lineage, gene flow from P. semicordata to P. heterophylla (Fig. 3a,d-f) was identified with the henryana lineage as outgroup, but the direction of gene flow reversed when P. dalzielii was used as outgroup and two P. heterophylla individuals were separately included (Fig. 3b-c).
ABBA-BABA tests supported introgression events between the North American clade and the trifoliolate lineage or the henryana lineage with significant Z-scores (Fig. 4; Table S5).Introgressions were identified within all three subsets.In the trifoliolate lineage, introgressions between P. semicordata and P. heterophylla were strongly supported, whereas introgressions between P. chinensis-P.feddei or P. semicordata-P.feddei were not consistently supported.In the dalzielii lineage and the North American clade, introgressions were detected irrespective of the selected outgroup (Fig. 4; Table S5).

Divergence times and biogeographic reconstruction
The genus Parthenocissus was estimated to have split with Yua in East Asia (Node 1 in Figs 5, S12) during the early Eocene (Node the North American lineage (Node 3 in Fig. S2), the Asian clade (Node 4 in Fig. S2), and the henryana lineage (Node 5 in Fig. S2).As concerted evolution between genes can result in slower nucleotide substitution rates in plastid genes (Okuyama et al., 2005;Dong et al., 2021), we primarily used the divergence times estimated from the 15taxa-1101nu dataset for discussion.
At K = 4, individuals of P. chinensis and P. semicordata were identified as two distinct clusters, though obvious admixture still exists between the two clusters (Fig. 6a).The Pearson r correlation showed that the mean f d statistic was negatively correlated with geographic distance for population pairs of P. semicordata-P.heterophylla, P. chinensis-P.heterophylla, and P. semicordata-P.chinensis (Fig. S13).Phylogenetic analyses based on the 212individuals-1021nu and 212-plastome datasets resulted in conflicting relationships, with a few individuals clustered in unexpected positions compared with topologies obtained from the 13-taxa-2587nu and 40-plastome datasets (Fig. S14).In the nuclear phylogeny, six individuals from SE populations distributed along the Himalaya formed a sister clade to the clade of P. chinensis and the remaining individuals of P. semicordata (Fig. S14a).The plastid phylogeny showed that individuals of CH1, HE6, and some SE populations at the southern edge of the HHM formed a clade sister to the P. chinensis clade (Fig. S14b).In addition, the individuals of CH13 and one individual of CH14 in the southern border New Phytologist (2022) www.newphytologist.com of the Hengduan Mountains were nested within the P. heterophylla clade (Fig. S14b).

Reticulate evolution in Parthenocissus in a temporal-spatial framework
We establish a well-supported temporal-spatial evolutionary history of Parthenocissus with reticulation events between and within lineages.Parthenocissus is estimated to have originated from East Asia during the Early Eocene Climatic Optimum (Node 1 in Figs 5, S2, S3).The Asian and North American clades of Parthenocissus split in the Eocene (Node 2 in Figs 5, S2, S3), but ABBA-BABA tests and extensive gene tree conflicts elucidate gene flow between the North American clade and the henryana or trifoliolate lineages (Fig. 4).This suggests that the ancestors of lineages in the North American clade and the Asian clade might have come into contact after their early divergence at the beginning of the first major glaciation of the Oligocene (Oi-1 Glaciation; Figs 5, S12).Both the North Atlantic land bridges (Tiffney, 1985) and the Bering land bridge (Wen et al., 2016) might have played important roles in intercontinental dispersal and range expansion of Parthenocissus between Asia and North America.Fossils similar to extant Parthenocissus species have been recorded from both Europe and western North America (Fig. 5; Tiffney & Manchester, 2001;Morley, 2003), suggesting climatedriven extinctions in these regions might have led to the intercontinental disjunct pattern between Asia and primarily eastern to southwestern North America in Parthenocissus.However, it is worth noting that the extinctions in North America and Europe may bias the ancestral area reconstruction of Parthenocissus.
In the North American clade, the plastid phylogeny suggests a sister relationship between P. vitacea and P. quinquefolia (Fig. 1a) with overlapping distributions throughout North America, whereas P. heptaphylla, the sister group of P. vitacea according to the transcriptome dataset (Fig. 1b), is narrowly endemic to central Texas (Brizicky, 1965;Moore & Wen, 2016).Our results are concordant with the phenomenon of plastid capture, where plastid phylogeny is usually strongly correlated with geographic distribution (Rieseberg & Soltis, 1991;Whittemore & Schaal, 1991;Cristina Acosta & Premoli, 2010).The large divergence time gap in the North American clade based on plastid and nuclear data, with the plastid estimate much younger than that of the nuclear, also implies plastid capture as suggested by previous studies (Drew & Sytsma, 2013;Huang et al., 2014).Plastid capture in the North American clade might have occurred in the Pliocene after the diversification of the clade during the Miocene (Figs 5, S12).Furthermore, our SNAQ searches detect gene flow between P. heptaphylla and the ancestor of the North American clade (Fig. 2d-f).Although gene flow between the ancestor of a clade and its descendants might be an artifact due to limitations of SNAQ on detecting multiple introgressions (Karimi et al., 2020), it could also be explained by a 'ghost' lineage (Norell, 1993), diverged from the stem group that contributed genetic material to P. heptaphylla, which also leads to a 'BABA' pattern in the ABBA-BABA test (Durand et al., 2011;Zheng & Janke, 2018).The 'ghost' lineage may have gone extinct during the aridification of the west and central North America that has occurred since the Eocene (Sheldon & Retallack, 2004), as evidenced by fossil seeds of Parthenocissus from these regions dating to the Eocene and Miocene (Tiffney & Manchester, 2001;Morley, 2003;Chen, 2009).
In the Asian clade, especially in the trifoliolate lineage, abundant gene flow across species/lineages has been detected by SNAQ searches and ABBA-BABA tests (Figs 2-4).As frequent hybridizations may obscure signals of network inference and rigorous statistical analyses (Karimi et al., 2020), the evolutionary history of the trifoliolate lineage in the HHM region will be discussed below incorporating population-level evidence.

Hybridization hot spots and ongoing speciation reversal in the HHM region
The trifoliolate lineage is inferred to have split from the dalzielii lineage in East Asia during the Oi-1 Glaciation (Node 6 in Figs 5, S12).Diversification and habitat expansion within the trifoliolate lineage began during the Oligocene, with the warm-adapted P. heterophylla now distributed southward to Java through Indochina and the cold-adapted P. semicordata westward to the HHM (Figs 5,S12).Before the middle Miocene, most parts of the Hengduan Mountains have achieved their near-modern elevation (S.H. Li et al., 2020;Spicer et al., 2020).Erosion (e.g.river incision) in uplifted regions may have been enhanced by the intensification of the East Asian monsoon and Indian monsoon in the warming Miocene (Nie et al., 2018), which caused greater heterogeneity in geology, elevation, and soil (Antonelli et al., 2018).The emergence of abundant new ecological niches might have facilitated diversification of the trifoliolate lineage in the HHM.
The STRUCTURE results and the population-level phylogenies identify at least three potential hybridization hot spots around the HHM where multiple species co-occur (Figs 6,S14).The negative correlation between introgression intensity and geographic distance also reveals a spatial concentration of gene flow and therefore further supports the hybridization hot spots we propose (Fig. S13).
One hybridization hot spot involves the Himalaya, with great differences in elevation, where P. semicordata and P. heterophylla co-occur (Fig. 6b).SNAQ searches and ABBA-BABA tests strongly support gene flow between P. semicordata and P. heterophylla.However, it is difficult to determine the direction of gene flow, as subsets with different outgroups suggest opposite directions (Fig. 3) probably due to sensitivity of SNAQ to taxon sampling (Karimi et al., 2020).We therefore recommend being cautious when selecting outgroup taxa for network inference.However, the admixed genomes of both species in the STRUCTURE analysis (Fig. 6a) tend to support bidirectional gene flow with stronger signals from P. semicordata to P. heterophylla.The phylogeny based on the 212-plastome dataset also indicates introgressions between the two species (Fig. S14).Generally, P. semicordata occurs in the HHM at an average elevation of c. 2500 m, but the lower altitude in the southern edge of the Himalaya allows some populations of P. semicordata to hybridize with P. heterophylla, which occupies a warmer zone with an average elevation of c. 1600 m (Fig. 6b).
The second hybridization hot spot is located at the southern border of the Hengduan Mountains where P. chinensis and P. heterophylla co-occur (Fig. 6b).Two populations of P. chinensis from Yingjiang and Malipo, Yunnan, China (CH13 and CH14, near the distribution range of P. heterophylla in Vietnam and Myanmar), were nested in the P. heterophylla clade in the plastid phylogeny, indicating hybridizations at low frequency (Fig. S14).This introgression was not detected by SNAQ or ABBA-BABA tests, possibly due to plastid capture or sampling bias.Notably, our population-level analyses do not support any introgressions involving P. feddei (Figs 6a, S14) as the SNAQ searches and ABBA-BABA tests suggest (Figs 2, 4), although some populations of P. feddei co-occur with P. chinensis and P. heterophylla in this area.The introgressions involving P. feddei might be an artifact caused by gene flow forming intersect loops on one quartet (Karimi et al., 2020).The unique limestone habitat of P. feddei may have played a role in minimizing gene flow with other species.The lower elevation and lesscomplex topographic heterogeneity in this area may explain a stronger correlation between introgression intensity and geographic distance in P. chinensis-P.heterophylla than other species pairs (Fig. S13b).
The third hybridization hot spot is located at the southern edge of the HHM with a complex topography where P. semicordata and P. chinensis co-occur (Fig. 6b).Although SNAQ analysis and ABBA-BABA tests struggle to detect introgressions between sister groups, the population structure analyses and the population-level phylogenies show admixture of the two species (Figs 6, S14).Our population structure analysis clusters P. semicordata and P. chinensis together when employing the optimal K = 3, suggesting highly admixed genomes of the two species (Fig. 6).When K = 4, genetic differences between the two species, involving populations from a wide range, increase along with geographic distances (Figs 6, S13), resulting in a classical clinal pattern commonly caused by frequent secondary contact between diverged lineages (Gompert & Buerkle, 2016).Notably, most populations of P. chinensis have received genetic contributions from P. semicordata and vice versa (Fig. 6), leading to the ongoing replacement of the two pure genomes with hybridized ones.This geographic structure might be a consequence of random interbreeding for generations without strong selection against hybrids (Todesco et al., 2016).The conflicting gene trees further suggest mosaic genomes of the two species, with two dominant topologies, respectively, supported by plastomes and reflecting introgressions (Fig. S7).Recent speciation with gene flow may also cause admixed genomes and conflicting topologies, but is less likely here, as our plastome data show that P. semicordata and P. chinensis are distinct lineages persisting through geological time (Fig. S2).If plastid capture caused such patterns, we would expect historical gene flow between P. semicordata and species outside the trifoliolate lineage, or between P. chinensis and P. heterophylla, and P. chinensis and P. feddei.However, these introgressions are not consistently supported, and evidence for this hypothesis remains lacking.To summarize, the wide hybridization zone, tendency to replace parental lineages, and mosaic genomes of P. semicordata and P. chinensis all support possible ongoing speciation reversal between P. semicordata and P. chinensis (Jacobsen & Omland, 2011;Kearns et al., 2018).
The ongoing speciation reversal unveiled by this study is most likely driven by geographic and climatic changes via natural processes, that is, the active paleoenvironmental changes in the HHM since the late Miocene (S. H. Li et al., 2020;Spicer et al., 2020), although human factors cannot be excluded.While speciation reversal can cause biodiversity decline (Seehausen, 2006;Zhang et al., 2019), the hybrid derivative lineages can adapt to a changing heterogeneous environment as new taxa with recombinant genomes and even exhibit an advantage over their parental lineages when facing environmental change (von-Holdt et al., 2016;Edelman et al., 2019).With the surviving genomic fractions in the gene pool of extant species, the fused lineages can potentially diverge into multiple species under certain circumstances.In the HHM region, besides species diversification that has been intensively studied, the largely underestimated speciation reversal processes might also have played an important role in the formation of biodiversity cradles.Therefore, further consideration of protecting hybrid populations and their living habitats is required to sustain the potential of generating future biodiversity in response to accelerated climatic changes.Table S1 Detailed information for transcriptome and plastome sequencing species.
Table S2 Species and sources of the plastome and transcriptome data for divergence time estimation.
Table S3 Details of location and size of each population for the four species, P. chinensis (CH), P. feddei (FE), P. heterophylla (HE), and P. semicordata (SE), in the trifoliolate lineage.
Table S4 Datasets used and analyses conducted for this study.
Table S5 ABBA-BABA test results on quartets of the combinations of species from each Asian lineage or the North American clade, and combinations of three species within each lineage-subset.
Table S6 Maximum likelihood bootstrap values for single gene trees supporting the 10 most frequently occurring topologies.
Please note: Wiley is not responsible for the content or functionality of any Supporting Information supplied by the authors.Any queries (other than missing material) should be directed to the New Phytologist Central Office.
New Phytologist (2022) www.newphytologist.comÓ 2022 The Authors New Phytologist Ó 2022 New Phytologist Foundation.This article has been partially contributed to by US Government employees and their work is in the public domain in the USA.

Fig. 1
Fig. 1 Topological discordance between (a) the phylogenetic relationships of Parthenocissus based on the 40-plastome dataset with all nodes supported by 100% maximum likelihood bootstrap support (BS) value and (b) the phylogenetic relationships of Parthenocissus based on 13taxa-2587nu dataset withinternode certainty all values/gene concordance factors/bipartition information restored from simulated gene trees shown above the branches.ASTRAL and SVDQUARTETS generated identical topologies with all the branches of 100% BS value.The multispecies coalescent species tree generated by ASTRAL is presented to show branch lengths.The pie charts at each node represent the proportion of genes supporting congruent relationships (blue), the dominant alternative (green), the remaining conflicting alternatives (yellow), and genes with < 75% BS value for that node (gray).Internal branches are colored to show population mutation parameter theta.

Fig. 2
Fig. 2 Reticulations in Parthenocissus estimated with SNAQ on (a) 13taxa-2587nu dataset of Parthenocissus, (b, c) the dalzielii subset with different outgroups, and (d-f) the North American (NA) subset with different outgroups.Hybrid edges with bootstrap support (BS) values ≥ 10% are denoted by blue dashed lines.Minor inheritance probabilities and BS values are shown in blue and black numbers along the edges, respectively.Edge lengths are shown in coalescent units, with the length of each terminal branch set as 1. 'ID' represents the individual from Indonesia.

Fig. 3
Fig. 3 Reticulations in Parthenocissus estimated with SNAQ on the trifoliolate subset, respectively with (a-c) P. dalzielii and (d-f) the henryana lineage as outgroup.Individuals from India (IN) and Indonesia (ID) were both or separately included to test sampling sensitivity of the method.Hybrid edges with bootstrap support (BS) values are denoted by blue dashed lines.Minor inheritance probabilities and BS values are shown in blue and black numbers along the edges, respectively.Edge lengths are shown in coalescent units, with the length of each terminal branch set as 1.

Fig. 4
Fig.4Z-scores of gene flow (denoted on the x-axis) between or within three lineages (trifoliolate, dalzielii, and henryana) and the North American clade of Parthenocissus that were recognized as statistically significant (Z > 3) in at least one quartet.Z-scores were obtained from 1000 bootstrap replicates.Combinations of species or the lineages/clade they belong to are listed on the right, as indicated by different colors.Names of lineages were simplified; 'NA' represents the North American clade.

Fig. 5
Fig. 5 Ancestral area reconstructions for Parthenocissus in RASP using the MCMCTREE-derived chronogram based on nuclear genes (15taxa-1101nu dataset).Node numbers and mean node ages are displayed at the corresponding nodes.The red line represents the global temperature changes (from Westerhold et al., 2020).The black symbols represent locations of fossils for Parthenocissus summarized by Chen (2009).The pie charts indicate the relative possibilities of ancestral areas estimated.A, Qinghai-Tibet Plateau; B, East Asia (including eastern China, Japan, and the Korean Peninsula); C, Southeast Asia (including Indochina, Java, and the Indian subcontinent); and D, North America.Ma, million years ago.

Fig. 6
Fig. 6 Population structure and biogeographic distribution of the four species, Parthenocissus chinensis (CH), P. feddei (FE), P. heterophylla (HE), and P. semicordata (SE), in the trifoliolate lineage.(a) Bayesian clustering of population structure analysis in STRUCTURE based on a dataset of 1448 SNPs for 37 populations of the trifoliolate lineage with K = 2-5 (optimal K = 3 shown in bold).Each color represents one ancestral population, and each bar represents a single accession.The length of each colored segment in the bar represents the admixture proportion contributed by that ancestral population.(b) The distributions of the four species in the trifoliolate lineage based on 809 occurrence records.Population codes are provided in TableS3.Black stars indicate the localities of individuals with sequenced transcriptomes.Three potential hybridization hot spots are indicated with red dashed lines.Boxplots on the bottom left show the elevation range of each species, denoting the median (vertical line), 25 th and 75 th percentiles (left and right bounds of the box), and limits of the 95% minimum or maximum values (lower and upper whiskers), with outliers shown as dots.

Fig. S1
Fig. S1 Phylogeny and distribution of Parthenocissus according to previous study.

Fig. S2
Fig. S2 Chronogram of Parthenocissus inferred from MCMCTREE in the PAML package based on a dataset including 40 plastomes of Vitaceae.

Fig. S4
Fig. S4 Posterior distributions and marginal prior distributions of the calibrated nodes in divergence time estimation based on plastid and nuclear gene datasets.

Fig. S5
Fig. S5 Ten most frequently occurring topologies of 2587 rooted maximum likelihood gene trees.

Fig. S6
Fig. S6 Ten most frequently occurring unrooted topologies of 2587 unrooted maximum likelihood gene trees.

Fig. S7
Fig. S7 Topology types of 2587 maximum likelihood gene trees when focusing on the trifoliolate lineage.

Fig. S8 A
Fig. S8 A supernetwork constructed with SPLITSTREE based on 2587 rooted maximum likelihood gene trees.

Fig. S9
Fig.S9Phylogenies generated by ASTRAL with gene trees constructed using maximum parsimony and Bayesian inference approach.

Fig. S10
Fig. S10 Distribution of Robinson-Foulds distances calculated with simulated and empirical gene trees.

Fig. S12
Fig. S12 Ancestral area reconstructions for Parthenocissus in RASP using the MCMCTREE-derived chronogram based on plastome sequences.

Fig. S13
Fig. S13 Pearson correlation analyses showing negative correlation between introgression intensity and geographic distance.

Fig. S14
Fig. S14 Phylogenetic relationships within the trifoliolate lineage with two individuals of Parthenocissus dalzielii as the outgroup.Methods S1 Detailed information of calibration strategies.Methods S2 Strategies for retaining only natural distribution records and ensuring each species' distribution range.