Assembling Complex Plant Genomes
New, rapid, and low-cost approach can be applied to many species.
The Science
Plant genomes are complex and challenging to study because of features such as polyploidy, polymorphisms, repeat content, and transposons, all of which play important roles in the mechanisms driving plant evolution. Contigs, the contiguous sequences that assembly algorithms build from overlapping DNA fragments, are difficult to assemble into a complete plant genome because they are not easily linked together or placed in their proper order. To address this problem, researchers have used next-generation sequencing to develop a new, low-cost, rapid, and effective method for assembling plant genome contigs.
The Impact
The new method, called population sequencing (POPSEQ), was applied to a large, complex, and highly repetitive plant genome. The results were comparable to a previously assembled sequence that had been produced with a more compute-intensive assembler. This serves as a proof of principle and demonstrates that POPSEQ can be effectively applied to many species.
Summary
Scientists from the Department of Energy’s (DOE) Joint Genome Institute (JGI) teamed with other researchers to develop and test the POPSEQ approach on the barley genome. The plant was selected for DOE JGI’s 2011 Community Sequencing Program portfolio in part for its potential as a bioenergy feedstock crop. Barley is grown on 4 million acres in the United States, and its straw could be used to produce cellulosic ethanol. More than 80% of the 5.1 billion-base barley genome is composed of repeats, adding to its complexity.
Using POPSEQ, researchers assembled the barley genome while testing a number of variables. In one test they used datasets obtained from different mapping populations, and in another they assembled the genome based solely on short reads. The team reported that results from these tests were comparable with the assembly previously produced by the International Barley Sequencing Consortium. “By comparison,” they wrote, “POPSEQ is inexpensive, rapid, and conceptually simple, the most time-consuming step being the construction of a mapping population. The method is independent of the need for any prior sequence resources and will enable the rapid and cost-efficient establishment of powerful genomic information for many species.”
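The general idea of using a mapping population to place contigs can be illustrated with a toy sketch. This is not the published POPSEQ pipeline: the marker names, contig names, genotype strings, and the simple nearest-pattern rule below are all hypothetical, chosen only to show how contigs that segregate like a marker of known map position could inherit that position and thereby receive an order.

```python
from typing import Dict, List, Tuple

# Hypothetical genotype calls across six progeny of a mapping population.
# "A" = allele inherited from parent 1, "B" = allele from parent 2.
# Each framework marker has an assumed (chromosome, position in cM, pattern).
FRAMEWORK: Dict[str, Tuple[str, float, str]] = {
    "m1": ("1H", 0.0, "AABBAB"),
    "m2": ("1H", 12.5, "AABBBB"),
    "m3": ("2H", 3.0, "BABABA"),
}

# Hypothetical contigs with the segregation pattern observed for each.
CONTIGS: Dict[str, str] = {
    "contig_7": "AABBAB",
    "contig_2": "AABBBB",
    "contig_9": "BABABB",
}


def mismatches(a: str, b: str) -> int:
    """Count progeny whose genotype calls differ between two patterns."""
    return sum(x != y for x, y in zip(a, b))


def anchor(
    contigs: Dict[str, str],
    framework: Dict[str, Tuple[str, float, str]],
) -> List[Tuple[str, float, str]]:
    """Give each contig the map position of its best-matching marker,
    then sort by chromosome and position to obtain a linear order."""
    placed = []
    for name, pattern in contigs.items():
        chrom, cm, _ = min(
            framework.values(), key=lambda m: mismatches(pattern, m[2])
        )
        placed.append((chrom, cm, name))
    return sorted(placed)


if __name__ == "__main__":
    for chrom, cm, name in anchor(CONTIGS, FRAMEWORK):
        print(f"{chrom}\t{cm:5.1f} cM\t{name}")
```

In this toy data, contig_9 matches no marker exactly and is placed at the position of the closest pattern. A real analysis would need many more progeny, a mismatch threshold, and handling of missing genotype calls.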
Contact
Nils Stein
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstr. 3, D-06466 Stadt Seeland (OT) Gatersleben, Germany
Robbie Waugh
The James Hutton Institute, Invergowrie, Dundee DD2 5DA and the University of Dundee, Division of Plant Sciences, Dundee DD1, UK
Funding
The work conducted by the U.S. Department of Energy (DOE) Joint Genome Institute is supported by the DOE Office of Science under contract no. DE-AC02-05CH11231. Additional funding support was provided by the Triticeae Coordinated Agricultural Project, U.S. Department of Agriculture’s National Institute of Food and Agriculture (grant no. 2011-68002-30029), the Scottish government Rural and Environment Science and Analytical Services Division Research Programme, and the German Ministry of Research and Education (BMBF TRITEX 0315954).
Publications
Mascher, M., et al., “Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ),” Plant J. 76 (4), 718–727 (2013). [DOI: 10.1111/tpj.12319].
Highlight Categories
Performer: University, SC User Facilities, BER User Facilities, JGI
Additional: Collaborations, Non-DOE Interagency Collaboration, International Collaboration