Assembling Complex Plant Genomes

New, rapid, and low-cost approach can be applied to many species.

Image courtesy of Ian Britton under a Creative Commons license.
Cultivated barley is the fourth most abundant crop in the world and a model for plant genetics research.

The Science

Plant genomes are complex and challenging to study because of features such as polyploidy, polymorphisms, repeat content, and transposons that play important roles in the mechanisms driving plant evolution. Assembly algorithms identify contigs (overlapping DNA fragments that make up a genome), but in plants these contigs are difficult to assemble into a complete genome because they are not easily linked together or placed in their proper order. To address this problem, researchers used next-generation sequencing to develop a new, low-cost, rapid, and effective method for anchoring and ordering plant genome contigs.

The Impact

The new method, called population sequencing (POPSEQ), was applied to a large, complex, and highly repetitive plant genome. The results were comparable to a sequence previously assembled with a more compute-intensive assembler, providing proof of principle and demonstrating that POPSEQ can be effectively applied to many species.

Summary

Scientists from the Department of Energy’s (DOE) Joint Genome Institute (JGI) teamed with other researchers to develop and test the POPSEQ approach on the barley genome. The plant was selected for DOE JGI’s 2011 Community Sequencing Program portfolio in part for its potential as a bioenergy feedstock crop. Barley is grown on 4 million acres in the United States, and its straw could be used to produce cellulosic ethanol. More than 80% of the 5.1 billion-base barley genome is composed of repeats, adding to its complexity.

Using POPSEQ, researchers assembled the barley genome while testing a number of variables. For example, they used datasets obtained from different mapping populations and, in another case, assembled the genome based solely on short reads. The team reported that results from these tests were comparable with the assembly previously produced by the International Barley Sequencing Consortium. “By comparison,” they wrote, “POPSEQ is inexpensive, rapid, and conceptually simple, the most time-consuming step being the construction of a mapping population. The method is independent of the need for any prior sequence resources and will enable the rapid and cost-efficient establishment of powerful genomic information for many species.”

Contact

Nils Stein
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstr. 3, D-06466 Stadt Seeland (OT) Gatersleben, Germany

Robbie Waugh
The James Hutton Institute, Invergowrie, Dundee DD2 5DA and the University of Dundee, Division of Plant Sciences, Dundee DD1, UK

Funding

The work conducted by the U.S. Department of Energy (DOE) Joint Genome Institute is supported by the DOE Office of Science under contract no. DE-AC02-05CH11231. Additional funding support was provided by the Triticeae Coordinated Agricultural Project, U.S. Department of Agriculture’s National Institute of Food and Agriculture (grant no. 2011-68002-30029), the Scottish government Rural and Environment Science and Analytical Services Division Research Programme, and the German Ministry of Research and Education (BMBF TRITEX 0315954).

Publications

Mascher, M., et al., “Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ),” Plant J. 76 (4), 718–727 (2013). [DOI: 10.1111/tpj.12319].

Highlight Categories

Program: BER, BSSD

Performer: University, SC User Facilities, BER User Facilities, JGI

Additional: Collaborations, Non-DOE Interagency Collaboration, International Collaboration