Resequencing of 414 cultivated and wild watermelon accessions identifies selection for fruit quality traits

An improved watermelon reference genome



The genome of watermelon cultivar ‘97103’ was previously assembled using Illumina short reads [6]. To improve its quality, we assembled the ‘97103’ genome de novo using PacBio long reads, combined with BioNano optical and Hi-C chromatin interaction maps. In total, 20.3 Gb of PacBio sequences were generated with an N50 read length of 10.8 kb, representing approximately 47.2× coverage of the watermelon genome. The PacBio assembly had a total size of 359.8 Mb and contained 367 contigs with an N50 size of 2.3 Mb. In total, 410.7 Gb of cleaned BioNano optical map data were generated and assembled de novo into BioNano genome maps, which were used to connect the PacBio contigs, resulting in 149 scaffolds with an N50 size of 21.9 Mb and a cumulative length of 365.1 Mb.

Furthermore, approximately 135.2 million cleaned Hi-C reads were generated, of which roughly 92.1 million (68.1%) were uniquely mapped to the assembly, resulting in a final set of approximately 69.5 million valid read pairs that were used to generate contact information (Supplementary Table 1). The Hi-C contact information, combined with previously published genetic maps [7–9], was used to order and orient the scaffolds into chromosome-scale pseudomolecules. Finally, 31 scaffolds with a total size of 362.7 Mb (99.3% of the assembly) were clustered into 11 chromosomes ranging from 27.1 to 37.9 Mb in length (Extended Data Figs. 1–3).
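
For readers unfamiliar with the assembly statistics quoted above, here is a minimal sketch of how an N50 value is usually computed, together with a check of the two percentages reported in the text. The function names `n50` and `percent` and the toy contig lengths are illustrative assumptions; they are not taken from the study or from any assembly tool.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest; the N50 is the length at
    which the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)

# Figures quoted in the text: uniquely mapped Hi-C reads out of all cleaned
# reads (millions), and anchored sequence out of the scaffold total (Mb).
hic_unique_pct = percent(92.1, 135.2)
anchored_pct = percent(362.7, 365.1)

print(toy_n50, hic_unique_pct, anchored_pct)
```

A larger N50 means that more of the assembly sits in long pieces, which is why the jump from a 2.3 Mb contig N50 to a 21.9 Mb scaffold N50 is reported as an improvement in contiguity.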

Comparison with the previous assembly of ‘97103’ showed high collinearity between the two assemblies (Extended Data Fig. 4). Comprehensive assessment indicated that the quality of this new assembly was high and substantially improved compared to the previous assembly (Supplementary Note, Supplementary Tables 2 and 3, and Extended Data Figs. 5–7). Approximately 55.55% of the assembly was annotated as repeat sequences, a substantially higher percentage than in the previous assembly (46.60%) (Supplementary Table 4), and 22,596 high-confidence genes were predicted in the assembly (Supplementary Note). This much-improved ‘97103’ genome provides a robust reference for watermelon research and genetic improvement.


Genome variation map and phylogeny of Citrullus species

In total, 414 Citrullus accessions collected in various geographic regions (Fig. 1a) were selected for genome resequencing, including 15 C. colocynthis, 31 C. amarus, 19 C. mucosospermus, 345 C. lanatus (258 cultivars and 87 landraces), two C. rehmii, one C. ecirrhosus and one C. naudinianus accessions (Supplementary Table 5). These accessions were sequenced to an average depth of 14.5× and coverage of 92.2% of the ‘97103’ genome. In total, 19,725,853 SNPs were identified, of which 1,100,803 were located in coding regions, causing 502,028 nonsynonymous mutations, 589,735 synonymous mutations, 1,031 start codon changes and 6,808 stop codon changes. Furthermore, 6,675,290 small indels were identified, of which 56,115 were located in coding regions.
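
The counts in this paragraph can be tallied with a few lines of arithmetic. The sketch below copies the numbers from the text; the variable names are illustrative. It shows that the per-species counts sum to 414, and that the four listed effect classes account for most, but not all, of the coding-region SNPs. The text does not say how the remainder is classified.

```python
# Accession counts per species, copied from the text above.
accessions = {
    "C. colocynthis": 15,
    "C. amarus": 31,
    "C. mucosospermus": 19,
    "C. lanatus": 258 + 87,  # cultivars + landraces
    "C. rehmii": 2,
    "C. ecirrhosus": 1,
    "C. naudinianus": 1,
}

# SNP totals and the effect classes listed in the text.
total_snps = 19_725_853
coding_snps = 1_100_803
listed_effects = {
    "nonsynonymous": 502_028,
    "synonymous": 589_735,
    "start codon change": 1_031,
    "stop codon change": 6_808,
}

total_accessions = sum(accessions.values())
listed_total = sum(listed_effects.values())
unlisted = coding_snps - listed_total          # coding SNPs in no listed class
coding_fraction = coding_snps / total_snps     # share of SNPs in coding regions

print(total_accessions, listed_total, unlisted)
print(f"{coding_fraction:.2%} of SNPs fall in coding regions")
```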

Fig. 1: Phylogenetic relationships and population structure of resequenced accessions from the seven Citrullus species.
a, Geographic distribution of resequenced Citrullus accessions. The diameter of the circle is proportional to the number of accessions, maximized at the size for 100 accessions. The world map was generated using the R package ‘rworldmap’ (v1.3-6, https://cran.r-project.org/web/packages/rworldmap/index.html). b, Neighbor-joining phylogenetic tree of Citrullus accessions and model-based clustering with K from 2 to 4. Colors of branches in the tree indicate different species (matching the colors shown in a). Two C. lanatus accessions from Sudan located in the deepest branch of the C. lanatus clade are indicated by the arrow. c, Principal component analysis of Citrullus accessions excluding C. naudinianus (left), and of C. mucosospermus and C. lanatus accessions (right). PC1, first principal component; PC2, second principal component. d, Schematic representation of Patterson’s D test of gene flow between Citrullus species. Red arrows represent gene flow between lineages. P values for significant deviations of D from zero are shown near the dots representing D values. The bars represent standard errors. CA, C. amarus; CC, C. colocynthis; CE, C. ecirrhosus; CL_CUL, C. lanatus cultivar; CL_LR, C. lanatus landrace; CM, C. mucosospermus; CN, C. naudinianus.
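
Panel d of the figure reports Patterson's D, also known as the ABBA-BABA test. As a rough illustration of the statistic itself, the sketch below counts the two discordant site patterns across four lineages (P1, P2, P3 and an outgroup) and returns D = (ABBA - BABA) / (ABBA + BABA). This is a simplified, single-sequence-per-lineage version written for illustration; the function name `patterson_d` and the toy sites are assumptions, and this section does not describe the software or settings the authors used. In practice, standard errors such as those shown in the figure are typically obtained with a block jackknife, which is omitted here.

```python
def patterson_d(sites):
    """Compute Patterson's D from (p1, p2, p3, outgroup) allele tuples.

    The outgroup allele is treated as ancestral ("A") and the other allele
    as derived ("B"). Only biallelic sites where P3 carries the derived
    allele and P1 and P2 differ are informative.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if len({p1, p2, p3, out}) != 2:
            continue  # skip invariant or multiallelic sites
        if p3 == out:
            continue  # P3 must carry the derived allele
        if p1 == out and p2 == p3:
            abba += 1  # pattern A B B A: P2 shares the derived allele with P3
        elif p2 == out and p1 == p3:
            baba += 1  # pattern B A B A: P1 shares the derived allele with P3
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Hypothetical alignment columns, for illustration only.
toy_sites = (
    [("A", "G", "G", "A")] * 3    # ABBA
    + [("G", "A", "G", "A")] * 1  # BABA
    + [("A", "A", "G", "A")] * 2  # uninformative (AABA)
)
toy_d = patterson_d(toy_sites)
print(toy_d)
```

Under incomplete lineage sorting alone, the two patterns are expected to be equally frequent, so D is close to zero. An excess of one pattern is the signal interpreted as gene flow between P3 and either P1 or P2.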

Phylogenetic relationships between the Citrullus accessions were inferred using 89,914 SNPs at fourfold degenerate sites. The placement of the seven species in the phylogenetic tree (Fig. 1b) was largely consistent with the previously reported phylogeny [4,5], with the most morphologically distinct Citrullus species, C. naudinianus, sister to the other six species, followed by C. colocynthis and C. rehmii. However, C. ecirrhosus was sister to C. amarus, C. mucosospermus and C. lanatus, instead of being most closely related to C. amarus, as proposed previously [4].

Two C. lanatus accessions collected in Sudan (PI 481871 and PI 254622) were placed in the deepest branch of the C. lanatus clade (Fig. 1b), supporting the idea that the primitive watermelons from Sudan and neighboring countries of northeastern Africa may be the closest to the progenitor of the sweet watermelon [2,5,10]. Twelve accessions were clustered into unexpected species groups and were therefore excluded from downstream analyses (Supplementary Table 5).
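
The phylogeny above was inferred from SNPs at fourfold degenerate sites, that is, third codon positions where any of the four bases encodes the same amino acid, so substitutions there do not change the protein. The following sketch shows one way to locate such positions in a coding sequence using the standard genetic code. The helper name `fourfold_sites` and the example sequence are illustrative assumptions, and the authors' actual pipeline is not described in this section.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)

# Two-base prefixes for which all four third bases give the same amino acid.
FOURFOLD_PREFIXES = {
    a + b
    for a in BASES
    for b in BASES
    if len({CODON_TABLE[a + b + c] for c in BASES}) == 1
}


def fourfold_sites(cds):
    """Return zero-based positions of fourfold degenerate sites in a CDS.

    The sequence is read in frame from its first base; a trailing partial
    codon and codons containing characters other than A, C, G, T are skipped.
    """
    cds = cds.upper()
    positions = []
    for start in range(0, len(cds) - 2, 3):
        codon = cds[start:start + 3]
        if codon in CODON_TABLE and codon[:2] in FOURFOLD_PREFIXES:
            positions.append(start + 2)  # third position of this codon
    return positions


# Hypothetical coding sequence: ATG GCT CTG TAA (Met, Ala, Leu, stop).
example_sites = fourfold_sites("ATGGCTCTGTAA")
print(sorted(FOURFOLD_PREFIXES))
print(example_sites)
```

Restricting the tree to these sites is a common way to reduce the influence of selection on the inferred relationships, since changes at them are synonymous.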

Reviewed by Earth Edition on November 02, 2019