First published online April 20, 2007
Journal of Experimental Biology 210, 1518-1525 (2007)
Published by The Company of Biologists 2007
doi: 10.1242/jeb.001370
Review Article |
Advanced sequencing technologies and their wider impact in microbiology
School of Biological Sciences, Biosciences Building, Crown Street, University of Liverpool, Liverpool L69 7ZB, UK
e-mail: neilhall{at}liv.ac.uk
Accepted 13 March 2007
| Summary |
|---|
|
|
|---|
Key words: comparative genomics, microbe, mutation screening, metagenomics, genome sequencing, bacteria, microorganism
| Introduction |
|---|
|
|
|---|
Genome sequencing has spearheaded a revolution in the biological sciences by allowing the study of molecular processes in the context of complete cellular systems, thus leading to the concept of `systems biology'. Genome sequence is also the foundation of `omics' technologies such as proteomics and transcriptomics (for example, microarray analysis). Despite this success, a casual observer of the genomics field might easily believe that there is no requirement for more genome sequencing, as almost all of the major model organisms and important human and animal pathogens have been sequenced. I will argue in this review that sequencing has yet to reach its full potential as a tool for discovery and hypothesis testing. I will draw upon three examples where the potential of new technologies has been, or soon will be, demonstrated: comparative genomics, mutation screening and metagenomics. I will start by briefly describing the technologies.
| Old and new sequencing technologies |
|---|
|
|
|---|
Recent developments in enzymology, imaging and microfluidics may offer a new approach to sequencing that could yield a massive increase in capacity while removing the need for the huge infrastructure required today. In this review, I will not give an exhaustive list of new technologies but I will describe a few of the published techniques that appear most promising. These can be separated into two approaches: sequencing with amplification and single-molecule sequencing. Fig. 1 gives an overview of some of the different sequencing strategies.
New technologies for sequencing with amplification
The first step in most sequencing processes is to amplify the DNA. This is
necessary because measuring biochemical processes at a single-molecule
resolution is so technically challenging. In the Sanger method, this is
usually done by cloning the DNA into a plasmid and growing clones; however,
this has its pitfalls as DNA is a biologically active molecule, hence there
are inherent biases against certain stretches of DNA that have physical
properties that do not replicate well in E. coli or that code for
toxic compounds. The two methods I will discuss here are the Margulies et al.
method (Margulies et al., 2005), also known as 454 sequencing after 454 Life Sciences
(Branford, CT, USA), which has commercialized it, and the Shendure et al.
method (Shendure et al., 2005), also known as polony sequencing
(Fig. 2). Both have developed
high-throughput strategies for in vitro amplification that are very
cheap and also get around the inherent biases of in vivo methods.
454 sequencing is, at the time of writing, the only new sequencing
technology that has been widely deployed. The 454 method is similar to the
polony method in that it involves massively parallel sequencing by synthesis
on a solid support. The method allows reads as long as 250 bp (and the maximum
read length is expected to increase further in the coming year) and is
therefore at least approaching the read lengths obtainable through traditional
methods. Margulies et al. have devised a scalable, highly parallel two-step
sequencing approach (Margulies et al., 2005). The first step involves shearing the genome and attachment
of oligonucleotides, a process that circumvents the need for generating a
clone library. Adapters are ligated to the fragments and these are bound to
beads and captured in the droplets of an oil-emulsion PCR reaction mixture.
PCR amplification in each droplet results in each bead carrying 10 million
copies of a unique DNA template. In the second step, a modified pyrosequencing
(Ronaghi et al., 1996)
protocol is carried out, in which nucleotide incorporation is detected by the
release of inorganic pyrophosphate and the generation of photons.
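The readout principle of pyrosequencing can be illustrated with a minimal, noise-free sketch. It assumes that nucleotides are flowed one at a time in a fixed cyclic order (the order `TACG` used here is an assumption for illustration) and that the light signal in each flow is proportional to the number of bases incorporated, so a homopolymer run produces a proportionally larger signal. The function names are hypothetical and the sequence is given as the strand being synthesized.

```python
FLOW_ORDER = "TACG"  # assumed cyclic order of nucleotide flows (illustrative)


def flowgram(synthesized, n_cycles):
    """Return the ideal signal for each nucleotide flow.

    `synthesized` is the strand being built on the template. In each flow,
    the signal equals the number of consecutive bases incorporated, so a
    homopolymer run of length 3 gives a signal of 3.
    """
    signals = []
    pos = 0
    for _ in range(n_cycles):
        for base in FLOW_ORDER:
            run = 0
            while pos < len(synthesized) and synthesized[pos] == base:
                run += 1
                pos += 1
            signals.append(run)
    return signals


def decode(signals):
    """Convert (possibly noisy) flow signals back into a base sequence."""
    read = []
    for i, signal in enumerate(signals):
        base = FLOW_ORDER[i % len(FLOW_ORDER)]
        read.append(base * round(signal))  # round to nearest homopolymer length
    return "".join(read)


if __name__ == "__main__":
    signals = flowgram("TTACGGGA", n_cycles=2)
    print(signals)
    print(decode(signals))
```

Because base calls are made by rounding a signal intensity, long homopolymer runs are the natural place for errors in this kind of readout: the relative difference between a signal of 7 and 8 is much smaller than between 1 and 2.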
Polony sequencing involves an in vitro library construction step that generates two paired genomic tags in a linear molecule separated by a universal linker and a universal tag on either end. Millions of these molecules are circularized using the linker ends and amplified in parallel in a single reaction tube by a process of emulsion PCR using beads containing primers to the universal tags (very similar to the 454 method). The beads are then immobilized on a flow cell for sequencing. An unusual aspect of the polony technique is that it does not use primer extension replication for the sequencing stage but instead relies on the hybridization and ligation of oligonucleotides. First, an anchor primer is hybridized to one of the universal sequences, and then degenerate nonamers, which are labeled using fluorescent dyes, are hybridized to the template and then ligated to the anchor primer. The pools of nonamers are structured so that the base in the degenerate position corresponds to the color of the fluorescent dye labeling it. The nonamers will only ligate if the sequence is complementary to the bases adjacent to the anchor primer; the sequence of the template can therefore be derived from the colors observed. The sequence generated by this technique is very accurate and also benefits from having paired reads. A single run can generate around 30 Mb of sequence, with an estimated cost per kilobase of raw sequence that is 10-fold less than conventional sequencing. The disadvantage of this technique is the short read length, which is currently 26 bp per amplicon (13 bp per tag). The polony method has now been taken on by Applied Biosystems (Foster City, CA, USA). They have adapted the method so it is capable of 50 bp reads and generating >1 Gb of sequence in a single run. The technology (now named SOLiD) is expected to be brought to market in 2007.
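The color-to-base logic of sequencing by ligation can be sketched as follows. This is a deliberately simplified model, not the published protocol: it assumes one interrogation per template position, four nonamer pools each carrying one fixed query base and one dye, and perfect ligation specificity. The dye names and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical dye assignment: each nonamer pool has one fixed base at the
# query position and is labeled with one colour.
DYE_OF_QUERY_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_OF_DYE = {dye: base for base, dye in DYE_OF_QUERY_BASE.items()}


def ligation_cycle(template, position):
    """Return the dye observed when one template position is interrogated.

    Only the pool whose query base pairs with template[position] ligates
    to the anchor primer, so only its colour is seen on the bead.
    """
    for query_base, dye in DYE_OF_QUERY_BASE.items():
        if COMPLEMENT[query_base] == template[position]:
            return dye
    raise ValueError(f"unexpected base {template[position]!r}")


def read_tag(template, positions):
    """Interrogate several positions and convert colours to template bases."""
    colours = [ligation_cycle(template, p) for p in positions]
    called = "".join(COMPLEMENT[BASE_OF_DYE[c]] for c in colours)
    return colours, called


if __name__ == "__main__":
    colours, called = read_tag("GATTACA", range(7))
    print(colours)
    print(called)
```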
Another method for massively parallel sequencing by synthesis from
amplified fragments has been recently developed by a company called Solexa
(Bennett, 2004; Bennett et al., 2005). Solexa
sequencing differs from polony or 454 sequencing as it amplifies the DNA on a
solid surface followed by synthesis by incorporation of modified nucleotides
linked to colored dyes. Solexa sequencing will not be covered in depth here as
(at the time of writing) the methodology has not been published in detail.
However, as this review goes to press, Solexa have released their first
instrument that is capable of sequencing over 1 Gb in a single run and is
likely to have a major impact on the genomics field.
Single-molecule sequencing
Many of the problems, and inherent errors, of DNA sequencing result from
the fact that thousands or millions of amplified templates are assessed in a
single reaction. It would be far better to read DNA in the same way as cells
do: as single molecules. The first published report of single-molecule
sequencing was by the lab of Stephen Quake (Braslavsky et al., 2003). This
method involves hybridizing target DNA to complementary primers that are
streptavidin-biotin bound to a silica surface. The primers are then
extended by the addition of Cy3- and Cy5-labeled nucleotides; as each base is
added, the incorporation is captured using a camera mounted on a microscope. A
limitation of this technology is that it generates short reads, which at the
time of publication was 5 bp; however, this technology has been taken up by a
company (Helicos Biosciences Corporation, Cambridge, MA, USA) who are
reporting much longer reads. This method is highly parallel, and on a 25 mm
square it would be possible to sequence 12 million templates simultaneously,
so, even with 5 bp reads, each `run' would generate 60 million bases of
information.
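The throughput arithmetic above is simple enough to check directly. The sketch below reproduces the 60 million base figure and, as an added illustration, converts it into fold coverage of a genome. The genome size of 4.6 Mb (roughly that of E. coli K-12) is an assumption chosen for the example, and the function names are hypothetical.

```python
def bases_per_run(n_templates, read_length_bp):
    """Total bases generated when every template yields one read."""
    return n_templates * read_length_bp


def fold_coverage(total_bases, genome_size_bp):
    """Average number of times each genome position is sequenced."""
    return total_bases / genome_size_bp


if __name__ == "__main__":
    total = bases_per_run(12_000_000, 5)   # figures quoted in the text
    print(total)                           # 60000000
    # Assumed 4.6 Mb bacterial genome, for illustration only.
    print(round(fold_coverage(total, 4_600_000), 1))
```

Raw coverage is only part of the picture: reads as short as 5 bp cannot be placed uniquely on a genome, which is why read length matters as much as total yield.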
One other method of single-molecule sequencing that is in the very early
stages of development involves `reading' DNA as it is passed through a
nanopore (Kasianowicz et al., 1996; Storm et al., 2005a; Storm et al., 2005b). This would not involve an enzymatic extension reaction of
any kind but instead the physical properties of the molecule would be read as
the bases wind through a tiny pore. In theory, this method would have no limit
on read length and, hence, if the technical hurdles are overcome it could
revolutionize how genome sequencing is achieved.
| Read length, read quality and read pairs |
|---|
|
|
|---|
Currently, Sanger sequencing outperforms all of the new technologies in
these metrics of quality. Hence, efforts are underway to incorporate Sanger
sequencing data into 454 sequence assemblies to improve the consensus quality.
Because the reads and error distribution for new technologies are very
different from Sanger methods, the tools needed to process them and assemble
them are different. This means, frustratingly, that it is very difficult to
mix Sanger sequencing reads with other types of reads and assemble them
together, although some progress has been made in this direction
(Goldberg et al., 2006; Wicker et al., 2006).
| Comparative genomics: the need for more de novo genome sequencing |
|---|
|
|
|---|
For a few pathogenic microbes, multiple strains have been sequenced, and
the data from these studies have revealed that a single reference genome,
while useful, may only give a snapshot of the genetic makeup of a species. A
recent study of group B Streptococcus strains
(Tettelin et al., 2005)
revealed that, as each new strain was sequenced, new genes were discovered
such that, after sequencing eight genomes, approximately 33 novel genes were
discovered from each additional genome. This has led to the concept of the
`Pan-genome', which refers to the full gene repertoire contained within a
species. The Pan-genome theory predicts that any bacterial species will be
made up of a core set of genes that is found in all individuals and a
dispensable set of genes that may or may not be present in any particular
individual (Medini et al., 2005; Tettelin et al., 2005). This phenomenon seems to be applicable to most other
microorganisms examined, and subtractive hybridization studies of E.
coli suggest that up to 25% of the genome is specific to individual
strains (Fukiya et al., 2004).
By sequencing more and more individuals, the scale of the Pan-genome can be
estimated. So, for Bacillus anthracis, no more new genes were
identified after four strains were sequenced whereas for group B
Streptococcus and E. coli it is estimated that the number of
strains needed to survey the Pan-genome is at least in the hundreds and
effectively may be infinite. An important finding from this work is that for
many species, the dispensable gene set may be significantly larger than the
core genome. Therefore, a single genome may give a very poor representation of
the genetic potential of the species. When predicting the chance of emergence
of drug resistance or new virulent forms of pathogens, knowledge of the
complete genetic complement of the species is far more important than the
genetic complement of an individual.
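The core, dispensable and Pan-genome sets described above are plain set operations once each genome has been reduced to a set of gene (or orthologous group) identifiers. The following is a minimal sketch under that assumption; deciding which genes count as "the same" across strains is the hard part in practice and is not addressed here. The function name and the toy gene identifiers are hypothetical.

```python
def pan_genome_summary(strains):
    """Summarize gene content across strains.

    `strains` is a list of (name, set_of_gene_ids) in the order in which
    the genomes were sequenced.
    """
    core = set(strains[0][1])
    pan = set()
    new_per_genome = []
    for name, genes in strains:
        # Genes never seen in any previously sequenced strain.
        new_per_genome.append((name, len(genes - pan)))
        pan |= genes
        core &= genes
    return {
        "core": core,                   # present in every strain
        "dispensable": pan - core,      # present in some strains only
        "pan": pan,                     # full repertoire seen so far
        "new_per_genome": new_per_genome,
    }


if __name__ == "__main__":
    toy = [
        ("strain1", {"a", "b", "c"}),
        ("strain2", {"a", "b", "d"}),
        ("strain3", {"a", "c", "e", "f"}),
    ]
    summary = pan_genome_summary(toy)
    print(sorted(summary["core"]))
    print(summary["new_per_genome"])
```

If the number of new genes per added genome falls to zero, the Pan-genome has been fully sampled, as reported for Bacillus anthracis; if it levels off at a positive value, as for group B Streptococcus, the repertoire keeps growing with every genome.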
Not only do more genomes allow for the discovery of more genes but they
also help us to understand how genes and genomes are evolving, as this can
provide clues to gene function. Pathogen genes that are interacting with the
host are often subject to positive selection (and therefore appear to be
evolving rapidly). Genome-wide molecular evolution studies have been applied
to various pathogens such as Plasmodium (Hall et al., 2005),
Trypanosoma (El-Sayed et al., 2005), Borrelia (Qiu et al., 2004)
and many other species. These studies depend on
tracing the pattern of mutations that occur in synonymous and non-synonymous
sites by aligning orthologous genes in closely related species. The more
genomes that can be aligned, the more accurate this analysis is. The studies
to date have used up to four genomes at a time but as sequencing becomes more
affordable it will be possible to scale this analysis up to look at tens or
hundreds of genomes at a time.
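The distinction between synonymous and non-synonymous changes can be made concrete with a crude counting sketch. It assumes two already aligned, gap-free coding sequences in the same reading frame and uses the standard genetic code. It only classifies each differing codon by whether the encoded amino acid changes; real analyses also normalize by the number of synonymous and non-synonymous sites and correct for multiple substitutions, which this sketch does not attempt. The function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(cds_a, cds_b):
    """Count codons that differ synonymously and non-synonymously.

    Both sequences must be aligned, gap-free, in frame and of equal length.
    Returns (synonymous, non_synonymous).
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    synonymous = non_synonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a, codon_b = cds_a[i:i + 3], cds_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


if __name__ == "__main__":
    # GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
    print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))
```

A gene with an excess of non-synonymous over synonymous change, after proper normalization, is a candidate for positive selection, which is the signal the studies above look for in host-interacting pathogen genes.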
| Mutation screening |
|---|
|
|
|---|
One of the most obvious applications of cheaper, higher-throughput genome sequencing of microbes is mutation screening. This may be carried out at the population level, to identify associations between phenotypes and genotypes, or in lab-generated strains, to identify SNPs or larger mutations that have given rise to selected phenotypes. Currently, there are a number of platforms that allow SNP screening using microarrays but these require the array to be pre-designed and they will not resolve large genomic changes such as insertions or inversions relative to the reference sequence. Recent work on experimentally evolved species has demonstrated how new sequencing methods can be used to track mutations that have been acquired in the laboratory.
Shendure et al. used polony sequencing to screen an evolved strain of an
E. coli auxotroph (Shendure et al., 2005). The sequencing was able to identify a number of SNPs
as well as larger deletions and inversions. This work demonstrated that,
despite the small amount of data obtained per clone (26 bp), it was possible
to identify large-scale rearrangements in the genome and align fragments to
identify SNPs. In a similar study of the cooperative bacterium Myxococcus
xanthus (Velicer et al., 2006), a laboratory-evolved strain that had been selected for a
cheating phenotype and reselected for a cooperative phenotype was shotgun
sequenced using 454 sequencing technology. The 454 sequence was able to
identify point mutations in the evolved strain compared with the reference
strain, which could then be associated with the changes in phenotype (as well
as identifying errors in the reference).
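The idea of finding point mutations by aligning short reads to a reference can be sketched with a naive approach: slide each read along the reference, accept a placement only if it is unique and has at most one mismatch, then call a SNP where several reads agree on the same non-reference base. This is an illustrative toy, not the pipeline used in either study; it ignores base qualities, reverse-complement reads, indels and rearrangements, and its exhaustive scan would be far too slow for a real genome. All names and thresholds are hypothetical.

```python
from collections import Counter, defaultdict


def place_read(reference, read, max_mismatches=1):
    """Return (offset, mismatches) if the read has exactly one acceptable
    placement on the reference, otherwise None.

    `mismatches` is a list of (reference_position, read_base) pairs.
    """
    hits = []
    for offset in range(len(reference) - len(read) + 1):
        mismatches = [
            (offset + i, base)
            for i, base in enumerate(read)
            if reference[offset + i] != base
        ]
        if len(mismatches) <= max_mismatches:
            hits.append((offset, mismatches))
    return hits[0] if len(hits) == 1 else None


def call_snps(reference, reads, min_support=2):
    """Call a SNP where at least `min_support` reads agree on a new base."""
    votes = defaultdict(Counter)
    for read in reads:
        hit = place_read(reference, read)
        if hit is None:
            continue  # unplaceable or ambiguous read
        for position, base in hit[1]:
            votes[position][base] += 1
    snps = {}
    for position, counts in votes.items():
        base, support = counts.most_common(1)[0]
        if support >= min_support:
            snps[position] = base
    return snps


if __name__ == "__main__":
    reference = "ACGTTGCAAGCTTAGGCATCGA"
    # Reads drawn from an evolved strain carrying C -> T at position 10.
    reads = ["GCAAGTTT", "AGTTTAGG", "ACGTTGCA"]
    print(call_snps(reference, reads))
```

Requiring agreement between independent reads is what separates a true mutation from a sequencing error, which is why depth of coverage matters even when individual reads are short.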
While whole-genome sequencing may still be prohibitively expensive for detection of point mutations, we may expect prices to fall for these new technologies, as they have in the past for Sanger sequencing. Due to their small genome size, microbes will be in the first wave of organisms to be studied this way and we can expect direct whole-genome sequencing to replace many other forward genetic techniques for the study of very specific traits.
| Metagenomics |
|---|
|
|
|---|
Metagenomic studies have been applied already to human-associated environments such as
the gut (Breitbart et al., 2003; Gill et al., 2006; Manichanh et al., 2006;
Zhang et al., 2006), environmental samples such as soil
(Bertrand et al., 2005; Lim et al., 2005; Mills et al., 2006) and the
ocean (Breitbart et al., 2004; Culley et al., 2006; DeLong et al., 2006;
Sogin et al., 2006; Venter et al., 2004). These
studies have provided interesting findings in terms of the metabolic
capability and taxonomic diversity of the microbes inhabiting these
environments. The major goal of these metagenomic studies is not only to find
new biological species and systems but also to be able to identify biomarkers
that can be used to classify the type of processes that occur in specific
environments. For example, what processes and species are more commonly found
in a diseased gut compared with a healthy one? Or which species or processes
associate with polluted as opposed to pristine environments?
A major problem with this preliminary work is that the diversity is
probably not fully sampled because of the complexity of the environments
studied. It has been recently estimated that close to 10^7 distinct
bacterial species inhabit a 10 g soil sample
(Curtis and Sloan, 2005; Curtis et al., 2002; Gans et al., 2005); this is a
species diversity two orders of magnitude higher than previous estimates. If
each of these species had an average genome size of 3-5 Mb, this would
mean that a single sample would contain the equivalent of roughly 10 000 human genomes.
Even if the species were present in equal amounts then a large sequencing
center would have to dedicate its entire resource for years to sample all of
the genomes present. Unfortunately, the problem is still more complex than
that; the new higher estimate is based on the finding that there is greater
diversity in the low-abundance species that are masked by a less diverse group
of high-abundance species. Hence, current studies only scrape the surface of
the full diversity and most of the low-abundance species in the environments
are not sampled at all. New highly parallel sequencing technologies offer a
cost-effective solution to this problem as they can generate much more
sequence than traditional methods. However, there are limitations to their
utility because non-Sanger methods have shorter read lengths and are therefore
more difficult to assemble. Two recent studies using 454 pyrosequencing have
demonstrated the power of new sequencing technologies for this type of
analysis: one analyzing the massive diversity in the oceans
(Sogin et al., 2006) and the
other analyzing a low-complexity environment
(Edwards et al., 2006).
The first study set out to measure the number of species in the Earth's
ocean biosphere by using massively parallel sequencing to sufficiently sample
the low-abundance taxa in order to make a more accurate estimate of their
diversity (Sogin et al., 2006). Using the 454 pyrosequencing technology, 118 000 amplicons
were sequenced that spanned the V6 hypervariable region of the ribosomal RNA
(rRNA) from bacteria collected at different depths and locations of the
Atlantic and Pacific oceans. The resulting sequences were compared to a
database of all known V6 regions in order to place them phylogenetically.
Clustering of these sequences defined Operational Taxonomic Units (OTUs). In
each sample, over 1000 OTUs were identified, and in the most sampled
environment over 3000 OTUs were identified. In no environment did rarefaction
analysis suggest that the sampling had reached a plateau, as the number of
OTUs identified increased almost linearly with the sequencing of new tags.
Although the authors of this study made specific efforts to control for
sequencing errors, it is possible that some of the diversity observed was
caused by the inherent base calling errors that occur in 454 sequencing reads,
and the findings of this study should therefore be verified by other methods.
Although this study was insufficient for measuring diversity, it still
demonstrated the inadequacy of other methods and will increase estimates of
natural diversity further.
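The two analysis steps described above, clustering tags into OTUs and asking whether sampling has reached a plateau, can be sketched as follows. This is a simplified illustration rather than the method of the study: it assumes tags of equal length compared by number of mismatches, uses greedy clustering against the first tag of each OTU, and draws a single random subsampling order. The mismatch threshold and all names are hypothetical.

```python
import random


def hamming(a, b):
    """Number of mismatching positions between two equal-length tags."""
    return sum(x != y for x, y in zip(a, b))


def assign_otus(tags, max_diff=1):
    """Greedy clustering: each tag joins the first OTU whose seed tag is
    within `max_diff` mismatches, otherwise it founds a new OTU.

    Returns one OTU label per input tag.
    """
    seeds, labels = [], []
    for tag in tags:
        for index, seed in enumerate(seeds):
            if len(seed) == len(tag) and hamming(seed, tag) <= max_diff:
                labels.append(index)
                break
        else:
            seeds.append(tag)
            labels.append(len(seeds) - 1)
    return labels


def rarefaction(labels, step, seed=0):
    """Return (tags_sampled, otus_observed) pairs for one random ordering."""
    order = list(labels)
    random.Random(seed).shuffle(order)
    return [
        (n, len(set(order[:n])))
        for n in range(step, len(order) + 1, step)
    ]


if __name__ == "__main__":
    tags = ["AAAA", "AAAT", "CCCC", "GGGG", "CCCG", "AAAA"]
    labels = assign_otus(tags)
    print(labels)
    print(rarefaction(labels, step=2))
```

A curve that flattens as more tags are added indicates that most OTUs have been seen; a curve that is still rising almost linearly, as reported above, indicates that diversity remains under-sampled. The mismatch threshold also shows why sequencing errors matter: an error rate high enough to push reads past the threshold creates spurious OTUs.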
In the second study, two water samples from adjacent sites that differed
significantly in their chemistry and geology were analyzed
(Edwards et al., 2006). 454
sequencing was used to generate random sequence from each sample. Over 35 Mb
of sequence was generated from both samples in short reads and therefore the
challenge was to be able to analyze these data to identify processes and
taxonomic groups that would allow a comparison of the microbial diversity in
the two environments. The 16S reads that were present in the sample were used
to identify the species present; this demonstrated that the oxygenated
environment had a much higher species diversity than the oxygen-poor
environment. This result was verified by using Sanger sequencing of an rRNA
library from each sample.
In addition to looking at species, Edwards et al. also analyzed the
metabolic potential of the different communities by automatically assessing
gene function by homology searches of sequence reads against a metabolic
database (Edwards et al., 2006). Using this analysis they identified processes that were
significantly overrepresented in one sample relative to the other. This study
was able to focus on biological processes as well as diversity, as the
environments in question were far less complex than the ocean environment
studied by Sogin et al. (Sogin et al., 2006). However, as the technologies used become faster and
cheaper, it may be possible to deeply sequence complex environments. These
studies are not only limited by sequencing, however, and there will need to be
improvements in genomic assembly and annotation in order to analyze the data
generated.
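One way to ask whether a metabolic process is overrepresented in one sample relative to another is a one-sided Fisher's exact test on read counts. The sketch below is offered as a generic illustration and is not necessarily the statistical procedure used by Edwards et al.; it assumes each read is assigned to at most one process and that reads are independent. The function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(hits_a, reads_a, hits_b, reads_b):
    """One-sided Fisher's exact test.

    Returns the probability of seeing at least `hits_a` reads assigned to
    a process in sample A if the process were equally common in both
    samples, given the observed totals.
    """
    total_reads = reads_a + reads_b
    total_hits = hits_a + hits_b
    denominator = comb(total_reads, reads_a)
    upper = min(total_hits, reads_a)
    tail = sum(
        comb(total_hits, k) * comb(total_reads - total_hits, reads_a - k)
        for k in range(hits_a, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Toy counts: 8 of 10 reads in sample A versus 2 of 10 in sample B.
    print(enrichment_p_value(8, 10, 2, 10))
```

When many processes are tested at once, the resulting p-values need correcting for multiple testing before any process is declared significantly overrepresented.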
| Conclusion |
|---|
|
|
|---|
Importantly, these technologies will enable researchers to undertake the process of genomic sequencing in a single operation using bench-top instruments. This will democratize a technology that, until now, has largely been the preserve of large genome centers. It is hoped that, once this process can be viewed as an assay in the same way that we view a microarray experiment, whole-genome sequencing will be applied to a host of new questions, such as genotype association studies, mutation screening, evolutionary studies and environmental profiling.
It may be that the term `post-genomics' has been prematurely inserted into the scientific lexicon and we are in fact on the cusp of a genome sequencing renaissance.
| Acknowledgments |
|---|
| Footnotes |
|---|
| References |
|---|
|
|
|---|
Bennett, S. (2004). Solexa Ltd. Pharmacogenomics 5, 433-438.
Bennett, S. T., Barnes, C., Cox, A., Davies, L. and Brown, C. (2005). Toward the 1,000 dollars human genome. Pharmacogenomics 6, 373-382.
Bertrand, H., Poly, F., Van, V. T., Lombard, N., Nalin, R., Vogel, T. M. and Simonet, P. (2005). High molecular weight DNA recovery from soils prerequisite for biotechnological metagenomic library construction. J. Microbiol. Methods 62, 1-11.
Blattner, F. R., Plunkett, G., III, Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F. et al. (1997). The complete genome sequence of Escherichia coli K-12. Science 277, 1453-1474.
Braslavsky, I., Hebert, B., Kartalov, E. and Quake, S. R. (2003). Sequence information can be obtained from single DNA molecules. Proc. Natl. Acad. Sci. USA 100, 3960-3964.
Breitbart, M., Hewson, I., Felts, B., Mahaffy, J. M., Nulton, J., Salamon, P. and Rohwer, F. (2003). Metagenomic analyses of an uncultured viral community from human feces. J. Bacteriol. 185, 6220-6223.
Breitbart, M., Felts, B., Kelley, S., Mahaffy, J. M., Nulton, J., Salamon, P. and Rohwer, F. (2004). Diversity and population structure of a near-shore marine-sediment viral community. Proc. Biol. Sci. 271, 565-574.
Brenner, S., Johnson, M., Bridgham, J., Golda, G., Lloyd, D. H., Johnson, D., Luo, S., McCurdy, S., Foy, M., Ewan, M. et al. (2000). Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat. Biotechnol. 18, 630-634.
Cole, S. T., Brosch, R., Parkhill, J., Garnier, T., Churcher, C., Harris, D., Gordon, S. V., Eiglmeier, K., Gas, S., Barry, C. E., III et al. (1998). Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393, 537-544.
Culley, A. I., Lang, A. S. and Suttle, C. A. (2006). Metagenomic analysis of coastal RNA virus communities. Science 312, 1795-1798.
Curtis, T. P. and Sloan, W. T. (2005). Microbiology. Exploring microbial diversity - a vast below. Science 309, 1331-1333.
Curtis, T. P., Sloan, W. T. and Scannell, J. W. (2002). Estimating prokaryotic diversity and its limits. Proc. Natl. Acad. Sci. USA 99, 10494-10499.
DeLong, E. F., Preston, C. M., Mincer, T., Rich, V., Hallam, S. J., Frigaard, N. U., Martinez, A., Sullivan, M. B., Edwards, R., Brito, B. R. et al. (2006). Community genomics among stratified microbial assemblages in the ocean's interior. Science 311, 496-503.
Edwards, R. A., Rodriguez-Brito, B., Wegley, L., Haynes, M., Breitbart, M., Peterson, D. M., Saar, M. O., Alexander, S., Alexander, E. C., Jr and Rohwer, F. (2006). Using pyrosequencing to shed light on deep mine microbial ecology. BMC Genomics 7, 57.
El-Sayed, N. M., Myler, P. J., Blandin, G., Berriman, M., Crabtree, J., Aggarwal, G., Caler, E., Renauld, H., Worthey, E. A., Hertz-Fowler, C. et al. (2005). Comparative genomics of trypanosomatid parasitic protozoa. Science 309, 404-409.
Fleischmann, R. D., Adams, M. D., White, O., Clayton, R. A., Kirkness, E. F., Kerlavage, A. R., Bult, C. J., Tomb, J. F., Dougherty, B. A., Merrick, J. M. et al. (1995). Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269, 496-512.
Fukiya, S., Mizoguchi, H., Tobe, T. and Mori, H. (2004). Extensive genomic diversity in pathogenic Escherichia coli and Shigella strains revealed by comparative genomic hybridization microarray. J. Bacteriol. 186, 3911-3921.
Gans, J., Wolinsky, M. and Dunbar, J. (2005). Computational improvements reveal great bacterial diversity and high metal toxicity in soil. Science 309, 1387-1390.
Gardner, M. J., Hall, N., Fung, E., White, O., Berriman, M., Hyman, R. W., Carlton, J. M., Pain, A., Nelson, K. E., Bowman, S. et al. (2002a). Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419, 498-511.
Gardner, M. J., Shallom, S. J., Carlton, J. M., Salzberg, S. L., Nene, V., Shoaibi, A., Ciecko, A., Lynn, J., Rizzo, M., Weaver, B. et al. (2002b). Sequence of Plasmodium falciparum chromosomes 2, 10, 11 and 14. Nature 419, 531-534.
Gill, S. R., Pop, M., Deboy, R. T., Eckburg, P. B., Turnbaugh, P. J., Samuel, B. S., Gordon, J. I., Relman, D. A., Fraser-Liggett, C. M. and Nelson, K. E. (2006). Metagenomic analysis of the human distal gut microbiome. Science 312, 1355-1359.
Goffeau, A., Aert, R., Agostini-Carbone, M. L., Ahmed, A., Aigle, M., Alberghina, L., Albermann, K., Albers, M., Aldea, M., Alexandraki, D. et al. (1997). The yeast genome directory. Nature 387(Suppl.), 1-105.
Goldberg, S. M., Johnson, J., Busam, D., Feldblyum, T., Ferriera, S., Friedman, R., Halpern, A., Khouri, H., Kravitz, S. A., Lauro, F. M. et al. (2006). A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes. Proc. Natl. Acad. Sci. USA 103, 11240-11245.
Hall, N., Pain, A., Berriman, M., Churcher, C., Harris, B., Harris, D., Mungall, K., Bowman, S., Atkin, R., Baker, S. et al. (2002). Sequence of Plasmodium falciparum chromosomes 1, 3-9 and 13. Nature 419, 527-531.
Hall, N., Karras, M., Raine, J. D., Carlton, J. M., Kooij, T. W., Berriman, M., Florens, L., Janssen, C. S., Pain, A., Christophides, G. K. et al. (2005). A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses. Science 307, 82-86.
Hyman, R. W., Fung, E., Conway, A., Kurdi, O., Mao, J., Miranda, M., Nakao, B., Rowley, D., Tamaki, T., Wang, F. et al. (2002). Sequence of Plasmodium falciparum chromosome 12. Nature 419, 534-537.
Jurinke, C., van den Boom, D., Cantor, C. R. and Koster, H. (2002). The use of MassARRAY technology for high throughput genotyping. Adv. Biochem. Eng. Biotechnol. 77, 57-74.
Kasianowicz, J. J., Brandin, E., Branton, D. and Deamer, D. W. (1996). Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. USA 93, 13770-13773.
Klenk, H. P., Clayton, R. A., Tomb, J. F., White, O., Nelson, K. E., Ketchum, K. A., Dodson, R. J., Gwinn, M., Hickey, E. K., Peterson, J. D. et al. (1997). The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 390, 364-370.
Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W. et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860-921.
Lim, H. K., Chung, E. J., Kim, J. C., Choi, G. J., Jang, K. S., Chung, Y. R., Cho, K. Y. and Lee, S. W. (2005). Characterization of a forest soil metagenome clone that confers indirubin and indigo production on Escherichia coli. Appl. Environ. Microbiol. 71, 7768-7777.
Madabhushi, R. S. (1998). Separation of 4-color DNA sequencing extension products in noncovalently coated capillaries using low viscosity polymer solutions. Electrophoresis 19, 224-230.
Manichanh, C., Rigottier-Gois, L., Bonnaud, E., Gloux, K., Pelletier, E., Frangeul, L., Nalin, R., Jarrin, C., Chardon, P., Marteau, P. et al. (2006). Reduced diversity of faecal microbiota in Crohn's disease revealed by a metagenomic approach. Gut 55, 205-211.
Margulies, M., Egholm, M., Altman, W. E., Attiya, S., Bader, J. S., Bemben, L. A., Berka, J., Braverman, M. S., Chen, Y. J., Chen, Z. et al. (2005). Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376-380.
Medini, D., Donati, C., Tettelin, H., Masignani, V. and Rappuoli, R. (2005). The microbial pan-genome. Curr. Opin. Genet. Dev. 15, 589-594.
Mikkelsen, T., Hiller, L. W., Eichler, E. E., Zody, M. C., Jaffe, D. B., Yang, S. P., Enard, W., Hellmann, I., Lindblad-Toh, K. and Altheide, T. K. (2005). Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437, 69-87.
Mills, D. K., Entry, J. A., Voss, J. D., Gillevet, P. M. and Mathee, K. (2006). An assessment of the hypervariable domains of the 16S rRNA genes for their value in determining microbial community diversity: the paradox of traditional ecological indices. FEMS Microbiol. Ecol. 57, 496-503.
Prober, J. M., Trainor, G. L., Dam, R. J., Hobbs, F. W., Robertson, C. W., Zagursky, R. J., Cocuzza, A. J., Jensen, M. A. and Baumeister, K. (1987). A system for rapid DNA sequencing with fluorescent chain-terminating dideoxynucleotides. Science 238, 336-341.
Qiu, W. G., Schutzer, S. E., Bruno, J. F., Attie, O., Xu, Y., Dunn, J. J., Fraser, C. M., Casjens, S. R. and Luft, B. J. (2004). Genetic exchange and plasmid transfers in Borrelia burgdorferi sensu stricto revealed by three-way genome comparisons and multilocus sequence typing. Proc. Natl. Acad. Sci. USA 101, 14150-14155.
Ronaghi, M., Karamohamed, S., Pettersson, B., Uhlen, M. and Nyren, P. (1996). Real-time DNA sequencing using detection of pyrophosphate release. Anal. Biochem. 242, 84-89.
Sanger, F., Nicklen, S. and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
Shendure, J., Porreca, G. J., Reppas, N. B., Lin, X., McCutcheon, J. P., Rosenbaum, A. M., Wang, M. D., Zhang, K., Mitra, R. D. and Church, G. M. (2005). Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309, 1728-1732.
Smith, L. M., Sanders, J. Z., Kaiser, R. J., Hughes, P., Dodd, C., Connell, C. R., Heiner, C., Kent, S. B. and Hood, L. E. (1986). Fluorescence detection in automated DNA sequence analysis. Nature 321, 674-679.
Sogin, M. L., Morrison, H. G., Huber, J. A., Welch, D. M., Huse, S. M., Neal, P. R., Arrieta, J. M. and Herndl, G. J. (2006). Microbial diversity in the deep sea and the underexplored "rare biosphere". Proc. Natl. Acad. Sci. USA 103, 12115-12120.
Storm, A. J., Chen, J. H., Zandbergen, H. W. and Dekker, C. (2005a). Translocation of double-strand DNA through a silicon oxide nanopore. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 71, 051903.
Storm, A. J., Storm, C., Chen, J., Zandbergen, H., Joanny, J. F. and Dekker, C. (2005b). Fast DNA translocation through a solid-state nanopore. Nano Lett. 5, 1193-1197.
Tettelin, H., Masignani, V., Cieslewicz, M. J., Donati, C., Medini, D., Ward, N. L., Angiuoli, S. V., Crabtree, J., Jones, A. L., Durkin, A. S. et al. (2005). Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial "pan-genome". Proc. Natl. Acad. Sci. USA 102, 13950-13955.
Velicer, G. J., Raddatz, G., Keller, H., Deiss, S., Lanz, C., Dinkelacker, I. and Schuster, S. C. (2006). Comprehensive mutation identification in an evolved bacterial cooperator and its cheating ancestor. Proc. Natl. Acad. Sci. USA 103, 8107-8112.
Venter, J. C., Remington, K., Heidelberg, J. F., Halpern, A. L., Rusch, D., Eisen, J. A., Wu, D., Paulsen, I., Nelson, K. E., Nelson, W. et al. (2004). Environmental genome shotgun sequencing of the Sargasso Sea. Science 304, 66-74.
Waterston, R. H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J. F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., An, P. et al. (2002). Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520-562.
Wicker, T., Schlagenhauf, E., Graner, A., Close, T. J., Keller, B. and Stein, N. (2006). 454 sequencing put to the test using the complex genome of barley. BMC Genomics 7, 275.
Zhang, T., Breitbart, M., Lee, W. H., Run, J. Q., Wei, C. L., Soh, S. W., Hibberd, M. L., Liu, E. T., Rohwer, F. and Ruan, Y. (2006). RNA viral community in human feces: prevalence of plant pathogenic viruses. PLoS Biol. 4, e3.