Text
E-book The Pangenome: Diversity, Dynamics and Evolution of Genomes
This open access book offers the first comprehensive account of the pan-genome concept and its manifold implications. The realization that the genetic repertoire of a biological species always encompasses more than the genome of each individual is one of the earliest examples of big data in biology that opened biology to the unbounded. The study of genetic variation observed within a species challenges existing views and has profound consequences for our understanding of the fundamental mechanisms underpinning bacterial biology and evolution. The underlying rationale extends well beyond the initial prokaryotic focus to all kingdoms of life and evolves into similar concepts for metagenomes, phenomes and epigenomes. The book’s respective chapters address a range of topics, from the serendipitous emergence of the pan-genome concept and its impacts on the fields of microbiology, vaccinology and antimicrobial resistance, to the study of microbial communities, bioinformatic applications and mathematical models that tie in with complex systems and economic theory. Given its scope, the book will appeal to a broad readership interested in population dynamics, evolutionary biology and genomics.

The concept was to use the Streptococcus agalactiae (group B Streptococcus, GBS) genome sequence information to predict proteins likely to be surface exposed and use these in experimental assays for antigenicity and antibody accessibility toward the development of a GBS vaccine via active maternal immunization [for details on GBS reverse vaccinology, see Maione et al. (2005)].

Unlike the case of Neisseria meningitidis, with which reverse vaccinology was pioneered right before the GBS project using a single genome, two GBS gap-free genomes were available when the project was initiated, and more genomes were generated early in the course of the project. Indeed, Tettelin et al. [TIGR (Tettelin et al. 2002)] and Glaser et al. [Pasteur Institute, France (Glaser et al. 2002)] independently reported the first two complete gap-free genome sequences of GBS in September of 2002. At that time, sequencing multiple strains or isolates of the same species was far from commonplace. Both strains, serotype V 2603V/R and serotype III NEM316, were clinical isolates. Glaser et al. compared their NEM316 genome to that of Streptococcus pyogenes (group A Streptococcus, GAS) and concluded that 50% of the GBS genes without an ortholog in GAS were located in 14 potential pathogenicity islands enriched in genes related to virulence and mobile elements. Tettelin et al. used a microarray-based comparative genomic hybridization (CGH) approach, whereby they hybridized the genomic DNA of each of 19 GBS isolates of various serotypes onto a microarray of spotted 2603V/R gene-specific amplicons, and identified several regions of genomic diversity among GBS isolates, including between isolates of the same serotype (see Fig. 2a).

These separate studies provided the first evidence that a significant amount of genomic information or gene content was variable among closely related streptococcal isolates, challenging the commonly accepted notion that the genome of a single isolate of a given species was sufficient to represent the genomic content of that species. Based on this understanding, the collaborative team decided to generate an additional 6 GBS genomes (Tettelin et al. 2005), selecting isolates from the five major disease-causing serotypes known at the time. The genome of the serotype Ia strain A909 was sequenced to completion in collaboration with the group of Craig Rubens at Children’s Hospital and Regional Medical Center, Seattle, WA, USA. The other five strains, 515 (serotype Ia), H36B (serotype Ib), 18RS21 (serotype II), COH1 (serotype III), and CJB111 (serotype V), were sequenced as draft genomes, i.e., no attempt was made to manually close the gaps existing between contigs of the genome assemblies. Comparison of the eight GBS whole-genome sequences confirmed the presence of the regions of genomic diversity previously identified by CGH (see Fig. 2b).

Surprisingly for the time, the shared backbone, or core set of genes present in each of the eight genomes, amounted to only about 80% of any individual genome’s gene coding potential. Within these eight genomes, there was no pair that was nearly identical. Instead, each genome contributed a significant number of new strain-specific genes not present in any of the other genomes sequenced. Other sets of genes were shared by some but not all of the genomes.

This large amount of genomic diversity, which was not correlated to GBS serotypes, did not fail to stun members of the investigative team, including the experts in GBS biology. It also prompted an important question that formed the foundation of the pangenome concept: “How many genomes from isolates of the GBS species do we need to sequence to be confident that we identified all of the genes that can be harbored by GBS as a whole?”
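
The three gene categories described above (the core set present in every genome, strain-specific genes, and genes shared by some but not all genomes) can be illustrated with plain set operations. The following is a minimal sketch, not the method used in the GBS study: it assumes each genome has already been reduced to a set of gene-family identifiers, and the strain names and gene IDs are hypothetical toy values. The function names `partition_genes` and `pan_core_sizes` are likewise invented for illustration.

```python
def partition_genes(genomes):
    """Split gene families into pan, core, strain-specific and partially shared sets.

    genomes: dict mapping a strain name to a set of gene-family IDs
             (assumed to come from some upstream ortholog clustering step).
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)          # every gene seen in any genome
    core = set.intersection(*gene_sets)    # genes present in all genomes
    # Count in how many genomes each gene family occurs.
    counts = {gene: sum(gene in s for s in gene_sets) for gene in pan}
    strain_specific = {gene for gene, n in counts.items() if n == 1}
    partially_shared = pan - core - strain_specific
    return pan, core, strain_specific, partially_shared


def pan_core_sizes(genomes, order):
    """Track pan-genome and core-genome sizes as genomes are added in `order`."""
    pan, core, sizes = set(), None, []
    for name in order:
        genes = genomes[name]
        pan |= genes                                      # pan-genome can only grow
        core = set(genes) if core is None else core & genes  # core can only shrink
        sizes.append((len(pan), len(core)))
    return sizes


# Hypothetical toy data: three strains, six gene families.
toy = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g4", "g6"},
}

pan, core, specific, partial = partition_genes(toy)
growth = pan_core_sizes(toy, ["strain_A", "strain_B", "strain_C"])
```

On the toy data, the core shrinks and the pan-genome grows with each added strain. That trend is what the closing question asks about: how many genomes must be added before new ones stop contributing new genes.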