2011 publications

Progress towards a reference genome for sunflower

Botany (89:429-437)

Added on : 19 June 2012

Authors

Kane, N.C., Gill, N., King, M.G., Bowers, J.E., Berges, H., Gouzy, J., Bachlava, E., Langlade, N.B., Lai, Z., Stewart, M., Burke, J.M., Vincourt, P., Knapp, S.J., and Rieseberg, L.H.

Abstract

The Compositae is one of the largest and most economically important families of flowering plants and includes a diverse array of food crops, horticultural crops, medicinals, and noxious weeds. Despite its size and economic importance, there is no reference genome sequence for the Compositae, which impedes research and improvement efforts. We report on progress toward sequencing the 3.5 Gb genome of cultivated sunflower (Helianthus annuus), the most important crop in the family. Our sequencing strategy combines whole-genome shotgun sequencing using the Solexa and 454 platforms with the generation of high-density genetic and physical maps that serve as scaffolds for the linear assembly of whole-genome shotgun sequences. The performance of this approach is enhanced by the construction of a sequence-based physical map, which provides unique sequence-based tags every 5–6 kb across the genome. Thus far, our physical map covers ~ 85% of the sunflower genome, and we have generated ~ 80× genome coverage with Solexa reads and 15.5× with 454 reads. Preliminary analyses indicated that ~ 78% of the sunflower genome consists of repetitive sequences. Nonetheless, ~ 76% of contigs >5 kb in size can be assigned to either the physical or genetic map or to both, suggesting that our approach is likely to deliver a highly accurate and contiguous reference genome for sunflower.
 

The Sugarcane Genome Challenge: Strategies for Sequencing a Highly Complex Genome.

(2011) Tropical Plant Biology 4: 145-156, DOI: 10.1007/s12042-011-9079-0

Added on : 19 June 2012

Authors

Glaucia Mendes Souza, Helene Berges, Stephanie Bocs, Rosanne Casu, Angelique D’Hont, João Eduardo Ferreira, Robert Henry, Ray Ming, Bernard Potier, Marie-Anne Van Sluys, Michel Vincentz and Andrew H. Paterson

Abstract

Sugarcane cultivars derive from interspecific hybrids obtained by crossing Saccharum officinarum and Saccharum spontaneum and provide feedstock used worldwide for sugar and biofuel production. The importance of sugarcane as a bioenergy feedstock has increased interest in the generation of new cultivars optimised for energy production. Cultivar improvement has relied largely on traditional breeding methods, which may be limited by the complexity of inheritance in interspecific polyploid hybrids, and the time-consuming process of selection of plants with desired agronomic traits. In this sense, molecular genetics can assist in the process of developing improved cultivars by generating molecular markers that can be used in the breeding process or by introducing new genes into the sugarcane genome. To meet each of these and additional goals, biotechnologists would benefit from a reference genome sequence of a sugarcane cultivar. The sugarcane genome poses challenges that have not been addressed in any prior sequencing project, due to its highly polyploid and aneuploid genome structure with a complete set of homeologous genes predicted to range from 10 to 12 copies (alleles) and to include representatives from each of two different species. Although sugarcane’s monoploid genome is about 1 Gb, its highly polymorphic nature represents another significant challenge for obtaining a genuine assembled monoploid genome. With a rich resource of expressed-sequence tag (EST) data in the public domain, the present article describes tools and strategies that may aid in the generation of a reference genome sequence.

Keywords: Sugarcane, Genome, Sequencing, Sorghum


Authors

Young ND, Debellé F, Oldroyd GE, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KF, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC, Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KA, Tang H, Rombauts S, Zhao PX, Zhou P, Barbe V, Bardou P, Bechner M, Bellec A, Berger A, Bergès H, Bidwell S, Bisseling T, Choisne N, Couloux A, Denny R, Deshpande S, Dai X, Doyle JJ, Dudez AM, Farmer AD, Fouteau S, Franken C, Gibelin C, Gish J, Goldstein S, González AJ, Green PJ, Hallab A, Hartog M, Hua A, Humphray SJ, Jeong DH, Jing Y, Jöcker A, Kenton SM, Kim DJ, Klee K, Lai H, Lang C, Lin S, Macmil SL, Magdelenat G, Matthews L, McCorrison J, Monaghan EL, Mun JH, Najar FZ, Nicholson C, Noirot C, O'Bleness M, Paule CR, Poulain J, Prion F, Qin B, Qu C, Retzel EF, Riddle C, Sallet E, Samain S, Samson N, Sanders I, Saurat O, Scarpelli C, Schiex T, Segurens B, Severin AJ, Sherrier DJ, Shi R, Sims S, Singer SR, Sinharoy S, Sterck L, Viollet A, Wang BB, Wang K, Wang M, Wang X, Warfsmann J, Weissenbach J, White DD, White JD, Wiley GB, Wincker P, Xing Y, Yang L, Yao Z, Ying F, Zhai J, Zhou L, Zuber A, Dénarié J, Dixon RA, May GD, Schwartz DC, Rogers J, Quétier F, Town CD, Roe BA.

Nature. 2011 Nov 16.

Abstract

Legumes (Fabaceae or Leguminosae) are unique among cultivated plants for their ability to carry out endosymbiotic nitrogen fixation with rhizobial bacteria, a process that takes place in a specialized structure known as the nodule. Legumes belong to one of the two main groups of eurosids, the Fabidae, which includes most species capable of endosymbiotic nitrogen fixation. Legumes comprise several evolutionary lineages derived from a common ancestor 60 million years ago (Myr ago). Papilionoids are the largest clade, dating nearly to the origin of legumes and containing most cultivated species. Medicago truncatula is a long-established model for the study of legume biology. Here we describe the draft sequence of the M. truncatula euchromatin based on a recently completed BAC assembly supplemented with Illumina shotgun sequence, together capturing ∼94% of all M. truncatula genes. A whole-genome duplication (WGD) approximately 58 Myr ago had a major role in shaping the M. truncatula genome and thereby contributed to the evolution of endosymbiotic nitrogen fixation. Subsequent to the WGD, the M. truncatula genome experienced higher levels of rearrangement than two other sequenced legumes, Glycine max and Lotus japonicus. M. truncatula is a close relative of alfalfa (Medicago sativa), a widely cultivated crop with limited genomics tools and complex autotetraploid genetics. As such, the M. truncatula genome sequence provides significant opportunities to expand alfalfa's genomic toolbox.


Authors

Rustenholz C, Choulet F, Laugier C, Safár J, Simková H, Dolezel J, Magni F, Scalabrin S, Cattonaro F, Vautrin S, Bellec A, Bergès H, Feuillet C, Paux E.

Plant Physiol. 2011 Oct 27.

Abstract

To improve our understanding of the organization and regulation of the wheat (Triticum aestivum L.) gene space, we established the first transcription map of a wheat chromosome (3B) by hybridizing a newly developed wheat expression microarray with BAC pools from a new version of the 3B physical map as well as with cDNA probes derived from 15 RNA samples. Mapping data for almost 3000 genes showed that the gene space spans the whole chromosome 3B with a twofold increase of gene density towards the telomeres due to an increase in the number of genes in islands. Comparative analyses with rice and Brachypodium revealed that these gene islands are composed mainly of genes likely originating from interchromosomal gene duplications. Gene ontology and expression profile analyses for the 3000 genes located along the chromosome revealed that the gene islands are enriched significantly in genes sharing the same function or expression profile, thereby suggesting that genes in islands acquired shared regulation during evolution. Only a small fraction of these clusters of cofunctional and coexpressed genes was conserved with rice and Brachypodium indicating a recent origin. Finally, genes with the same expression profiles in remote islands (coregulation islands) were identified suggesting long-distance regulation of gene expression along the chromosomes in wheat.

Functional features of a single chromosome arm in wheat (1AL) determined from its structure.

Funct Integr Genomics. 2011 Sep 3.

Added on : 06 October 2011

Authors :

Lucas SJ, Simková H, Safár J, Jurman I, Cattonaro F, Vautrin S, Bellec A, Berges H, Doležel J, Budak H.

Abstract :

Bread wheat (Triticum aestivum L.) is one of the most important crops globally and a high priority for genetic improvement, but its large and complex genome has been seen as intractable to whole genome sequencing. Isolation of individual wheat chromosome arms has facilitated large-scale sequence analyses. However, so far there is no such survey of sequences from the A genome of wheat. Greater understanding of an A chromosome could facilitate wheat improvement and future sequencing of the entire genome. We have constructed a BAC library from the long arm of T. aestivum chromosome 1A (1AL) and obtained BAC end sequences from 7,470 clones encompassing the arm. We obtained 13,445 (89.99%) useful sequences with a cumulative length of 7.57 Mb, representing 1.43% of 1AL and about 0.14% of the entire A genome. The GC content of the sequences was 44.7%, and 90% of the chromosome was estimated to comprise repeat sequences, while just over 1% encoded expressed genes. From the sequence data, we identified a large number of sites suitable for development of molecular markers (362 SSR and 6,948 ISBP) which will have utility for mapping this chromosome and for marker-assisted breeding. Of 44 putative ISBP markers tested, 23 (52.3%) were found to be useful. The BAC end sequence data also enabled the identification of genes and syntenic blocks specific to chromosome 1AL, suggesting regions of particular functional interest and targets for future research.
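The percentages quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes that two end reads were attempted per BAC clone (the abstract does not say so explicitly), and the arm length it derives is a back-calculation from the quoted 1.43%, not a figure reported in the abstract. The function names are hypothetical.

```python
# Reproduce the headline percentages from the 1AL BAC end sequence survey.
# Assumption (not stated in the abstract): two end reads attempted per clone.

def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

def bes_success_rate(useful_reads, clones, reads_per_clone=2):
    """Share of attempted end reads that yielded a useful sequence."""
    return percent(useful_reads, clones * reads_per_clone)

def implied_length_mb(sampled_mb, sampled_percent):
    """Back-calculate a total length from a sample and the share it represents."""
    return sampled_mb / (sampled_percent / 100.0)

success = bes_success_rate(useful_reads=13_445, clones=7_470)
marker_rate = percent(23, 44)
arm_mb = implied_length_mb(sampled_mb=7.57, sampled_percent=1.43)

print(f"Useful end sequences: {success:.2f}%")
print(f"Useful ISBP markers:  {marker_rate:.1f}%")
print(f"Implied 1AL length:   {arm_mb:.0f} Mb (back-calculated)")
```

Under the two-reads-per-clone assumption, the first two values agree with the 89.99% and 52.3% given in the abstract.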


Authors :

Faivre Rampant P, Lesur I, Boussardon C, Bitton F, Martin-Magniette ML, Bodenes C, Le Provost G, Berges H, Fluch S, Kremer A, Plomion C.

BMC Genomics. 2011 Jun 6;12(1):292

 

Abstract :

 

BACKGROUND:
One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences.

RESULTS:
The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera.
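The link between library depth and the chance of recovering a given sequence is commonly estimated with the Clarke-Carbon relation, P = 1 - (1 - I/G)^N, where N is the number of insert-bearing clones, I the mean insert size and G the haploid genome size. For small I/G this is close to 1 - e^(-c), with c the coverage in genome equivalents. The sketch below is a minimal illustration under that model and is not necessarily the calculation the authors used. The function names are hypothetical, and the genome size is left as a parameter because the abstract does not state it.

```python
import math

def clarke_carbon(n_clones, mean_insert_kb, genome_size_mb):
    """Probability that a given locus is present in at least one clone.

    Assumes clones are sampled uniformly at random from the genome.
    """
    fraction_per_clone = mean_insert_kb / (genome_size_mb * 1000.0)
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones

def probability_from_coverage(coverage):
    """Approximation of the same probability from coverage alone."""
    return 1.0 - math.exp(-coverage)

# Coverage figure quoted in the abstract: 12 haploid genome equivalents.
p_oak = probability_from_coverage(12)
print(f"P(locus present) at 12x coverage: {p_oak:.6f}")
```

At 12 genome equivalents the approximation gives a probability well above 0.99, in line with the "greater than 99%" quoted in the abstract.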

CONCLUSIONS:
This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. These sequences will be used in the assembly of a future genome sequence for oak.

http://www.ncbi.nlm.nih.gov/pubmed/21645357


Authors :

Paiva JA, Prat E, Vautrin S, Santos MD, San-Clemente H, Brommonschenkel S, Fonseca PG, Grattapaglia D, Song X, Ammiraju JS, Kudrna D, Wing RA, Freitas AT, Berges H, Grima-Pettenati J.

BMC Genomics. 2011 Mar 4;12(1):137.

 

Abstract :

BACKGROUND: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing.

RESULTS: We describe the construction and characterization of two deep-coverage BAC libraries, EG_Ba and EG_Bb, obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 kb (EG_Bb) to 157 kb (EG_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single-copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome and complete genomic sequences of important lignin biosynthesis genes.

CONCLUSIONS: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (>15x), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae, including genome sequencing, gene isolation, functional and comparative genomics. Because they have been constructed using the same tree (E. grandis BRASUZ1) whose full genome is being sequenced, they should prove instrumental for assembly and gap filling of the upcoming Eucalyptus reference genome sequence.

 

http://www.ncbi.nlm.nih.gov/pubmed/21375742