- Initial sequencing and analysis of the human genome
E S Lander
Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA
Nature 409:860-921. 2001
..We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence...
- MatInspector and beyond: promoter analysis based on transcription factor binding sites
Genomatix Software GmbH Landsberger Strasse 6, 80339 München, Germany
Bioinformatics 21:2933-42. 2005
..The next steps in promoter analysis can be tackled only with reliable predictions, e.g. finding phylogenetically conserved patterns or identifying higher order combinations of sites in promoters of co-regulated genes...
- TM4: a free, open-source system for microarray data management and analysis
A I Saeed
Institute for Genomic Research, Rockville, MD, USA
Biotechniques 34:374-8. 2003
- A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
B M Bolstad
Group in Biostatistics, University of California, Berkeley, CA 94720, USA
Bioinformatics 19:185-93. 2003
..Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations...
- RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models
Swiss Federal Institute of Technology Lausanne, School of Computer and Communication Sciences Lab Prof Moret, Station 14, CH 1015 Lausanne, Switzerland
Bioinformatics 22:2688-90. 2006
..The program has been used to compute ML trees on two of the largest alignments to date containing 25,057 (1463 bp) and 2182 (51,089 bp) taxa, respectively...
- BEAST: Bayesian evolutionary analysis by sampling trees
Alexei J Drummond
Bioinformatics Institute, University of Auckland, Auckland, New Zealand
BMC Evol Biol 7:214. 2007
..A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented...
- MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0
Center for Evolutionary Functional Genomics, The Biodesign Institute, Arizona State University, AZ, USA
Mol Biol Evol 24:1596-9. 2007
..The current version of MEGA is available free of charge at (http://www.megasoftware.net)...
- Velvet: algorithms for de novo short read assembly using de Bruijn graphs
Daniel R Zerbino
EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
Genome Res 18:821-9. 2008
..Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies...
- Haploview: analysis and visualization of LD and haplotype maps
J C Barrett
Whitehead Institute for Biomedical Research Cambridge, MA 02142, USA
Bioinformatics 21:263-5. 2005
..Haploview is a software package that provides computation of linkage disequilibrium statistics and population haplotype patterns from primary genotype data in a visually appealing and interactive interface...
- CAP3: A DNA sequence assembly program
Department of Computer Science, Michigan Technological University, Houghton, Michigan 49931 USA
Genome Res 9:868-77. 1999
..PHRAP often produces longer contigs than CAP3 whereas CAP3 often produces fewer errors in consensus sequences than PHRAP. It is easier to construct scaffolds with CAP3 than with PHRAP on low-pass data with forward-reverse constraints...
- Tandem repeats finder: a program to analyze DNA sequences
Department of Biomathematical Sciences, Mount Sinai School of Medicine, New York, NY 10029 6574, USA
Nucleic Acids Res 27:573-80. 1999
..These sequences range in size from 3 kb up to 700 kb. A World Wide Web server interface at c3.biomath.mssm.edu/trf.html has been established for automated use of the program...
- Fast and accurate short read alignment with Burrows-Wheeler transform
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK
Bioinformatics 25:1754-60. 2009
..The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals...
- Genome sequencing in microfabricated high-density picolitre reactors
454 Life Sciences Corp, 20 Commercial Street, Branford, Connecticut 06405, USA
Nature 437:376-80. 2005
..Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine...
- Artemis: sequence visualization and annotation
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
Bioinformatics 16:944-5. 2000
..Sequences and annotation can be read and written directly in EMBL, GenBank and GFF format. AVAILABILITY: Artemis is available under the GNU General Public License from http://www.sanger.ac.uk/Software/Artemis..
- MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences
Biodesign Institute, A240, Arizona State University, Tempe, AZ 85287 5301, USA
Brief Bioinform 9:299-306. 2008
..We also discuss how MEGA might evolve in the future to assist researchers in their growing need to analyze large data set using new computational methods...
- InterProScan: protein domains identifier
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Nucleic Acids Res 33:W116-20. 2005
..O'Reilly Publishers, Sebastopol, CA, http://www.w3.org/TR/soap/] are also available to the users. Various output formats are supported and include text tables, XML documents, as well as various graphs to help interpret the results...
- ARB: a software environment for sequence data
Lehrstuhl für Mikrobiologie, Technische Universität München, D 853530 Freising, Germany
Nucleic Acids Res 32:1363-71. 2004
..Currently, the package is used by numerous working groups worldwide...
- SOAP: short oligonucleotide alignment program
Beijing Genomics Institute at Shenzhen, Shenzhen 518083, China
Bioinformatics 24:713-4. 2008
..SOAP is a command-driven program, which supports multi-threaded parallel computing, and has a batch module for multiple query sets...
- ACT: the Artemis Comparison Tool
Tim J Carver
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
Bioinformatics 21:3422-3. 2005
..ACT is part of the Artemis distribution and is similarly open source, written in Java and can run on any Java enabled platform, including UNIX, Macintosh and Windows...
- Microbial diversity in the deep sea and the underexplored "rare biosphere"
Mitchell L Sogin
Josephine Bay Paul Center, Marine Biological Laboratory at Woods Hole, 7 MBL Street, Woods Hole, MA 02543, USA
Proc Natl Acad Sci U S A 103:12115-20. 2006
..Members of the rare biosphere are highly divergent from each other and, at different times in earth's history, may have had a profound impact on shaping planetary processes...
- The consensus coding sequences of human breast and colorectal cancers
Ludwig Center and Howard Hughes Medical Institute, Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Baltimore, MD 21231, USA
Science 314:268-74. 2006
..These data define the genetic landscape of two human cancer types, provide new targets for diagnostic and therapeutic intervention, and open fertile avenues for basic research in tumor biology...
- An obesity-associated gut microbiome with increased capacity for energy harvest
Peter J Turnbaugh
Center for Genome Sciences, Washington University, St Louis, Missouri 63108, USA
Nature 444:1027-31. 2006
..These results identify the gut microbiota as an additional contributing factor to the pathophysiology of obesity...
- The complete genome sequence of Escherichia coli K-12
F R Blattner
Laboratory of Genetics, University of Wisconsin Madison, 445 Henry Mall, Madison, WI 53706, USA
Science 277:1453-62. 1997
..The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer...
- Identifying bacterial genes and endosymbiont DNA with Glimmer
Arthur L Delcher
Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA
Bioinformatics 23:673-9. 2007
..This module was developed in response to the discovery that eukaryotic genome sequencing projects sometimes inadvertently capture the DNA of intracellular bacteria living in the host...
- Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence
S T Cole
Sanger Centre, Wellcome Trust Genome Campus, Hinxton, UK
Nature 393:537-44. 1998
- PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments
European Molecular Biology Laboratory, Meyerhofstrasse 1, D 69117 Heidelberg, Germany
Nucleic Acids Res 34:W609-12. 2006
..Another distinct feature is that the user can specify a subregion of the input alignment in order to specifically analyze functional domains or exons of interest. The PAL2NAL server is available at http://www.bork.embl.de/pal2nal...
- Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing
J Cristobal Vera
Department of Biology, 208 Mueller Laboratory, Pennsylvania State University, University Park, PA 16802, USA
Mol Ecol 17:1636-47. 2008
..This development narrows the gap between approaches based on model organisms with rich genetic resources vs. species that are most tractable for ecological and evolutionary studies...
- Accuracy and quality of massively parallel DNA pyrosequencing
Susan M Huse
Josephine Bay Paul Center, Marine Biological Laboratory at Woods Hole, MBL Street, Woods Hole, MA 02543, USA
Genome Biol 8:R143. 2007
- Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms
Peter A C 't Hoen
The Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
Nucleic Acids Res 36:e141. 2008
..We conclude that deep sequencing provides a major advance in robustness, comparability and richness of expression profiling data and is expected to boost collaborative, comparative and integrative genomics studies...
- TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets
The Institute for Genomic Research, Rockville, MD 20850, USA
Bioinformatics 19:651-2. 2003
..The system can run on multi-CPU architectures including SMP and PVM...
- The human microbiome project
Peter J Turnbaugh
Center for Genome Sciences, Washington University School of Medicine, St Louis, Missouri 63108, USA
Nature 449:804-10. 2007
..A strategy to understand the microbial components of the human genetic and metabolic landscape and how they contribute to normal physiology and predisposition to disease...
- Substantial biases in ultra-short read data sets from high-throughput DNA sequencing
Juliane C Dohm
Max Planck Institute for Molecular Genetics, Ihnestr 63 73, 14195 Berlin, Germany
Nucleic Acids Res 36:e105. 2008
..Such biases have implications on the use and interpretation of Solexa data, for de novo sequencing, re-sequencing, the identification of single nucleotide polymorphisms and DNA methylation sites, as well as for transcriptome analysis...
- Phylogeny.fr: robust phylogenetic analysis for the non-specialist
Information Génomique et Structurale IGS, CNRS UPR2589, IFR 88, Marseille, France
Nucleic Acids Res 36:W465-9. 2008
..A guide tree then helps to select neighbor sequences to be used as input for the phylogeny pipeline. Phylogeny.fr is available at: http://www.phylogeny.fr/..
- High-throughput functional annotation and data mining with the Blast2GO suite
Bioinformatics Department, Centro de Investigación Príncipe Felipe, Valencia, Spain
Nucleic Acids Res 36:3420-35. 2008
..Our aim is to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data...
- Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning
Shawn J Cokus
Department of Molecular, Cell, and Developmental Biology, University of California at Los Angeles, Los Angeles, California 90095, USA
Nature 452:215-9. 2008
- Genome sequence of Aedes aegypti, a major arbovirus vector
Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA
Science 316:1718-23. 2007
..An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species...
- The genome of the African trypanosome Trypanosoma brucei
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
Science 309:416-22. 2005
..brucei and the greatest in L. major. Horizontal transfer of genes of bacterial origin has contributed to some of the metabolic differences in these parasites, and a number of novel potential drug targets have been identified...
- The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla
Genoscope CEA and UMR 8030 CNRS Genoscope Université d'Evry, 2 rue Gaston Crémieux, BP5706, 91057 Evry, France
Nature 449:463-7. 2007
..Furthermore, we explain the chronology of previously described whole-genome duplication events in the evolution of flowering plants...
- Accurate whole human genome sequencing using reversible terminator chemistry
David R Bentley
Illumina Cambridge Ltd (formerly Solexa Ltd), Chesterford Research Park, Little Chesterford, Nr Saffron Walden, Essex CB10 1XL, UK
Nature 456:53-9. 2008
..Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications...
- GMAP: a genomic mapping and alignment program for mRNA and EST sequences
Thomas D Wu
Department of Bioinformatics Genentech, Inc, South San Francisco, CA 94080, USA
Bioinformatics 21:1859-75. 2005
- Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx
University of Texas at Austin, 1 University Station C0930, Austin, TX 78712, USA
BMC Genomics 10:219. 2009
..We have applied these methods to sequence the transcriptome of planulae larvae from the coral Acropora millepora...
- The genomic landscapes of human breast and colorectal cancers
Laura D Wood
Ludwig Center for Cancer Genetics and Therapeutics and Howard Hughes Medical Institute at Johns Hopkins Kimmel Cancer Center, Baltimore, MD 21231, USA
Science 318:1108-13. 2007
..These results have implications for understanding the nature and heterogeneity of human cancers and for using personal genomics for tumor diagnosis and therapy...
- Initial sequencing and comparative analysis of the mouse genome
Robert H Waterston
Genome Sequencing Center, Washington University School of Medicine, Campus Box 8501, 4444 Forest Park Avenue, St Louis, Missouri 63108, USA
Nature 420:520-62. 2002
- An integrated software system for analyzing ChIP-chip and ChIP-seq data
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, Maryland 21205, USA
Nat Biotechnol 26:1293-300. 2008
- PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls
Molecular Biophysics and Biochemistry Dept, Yale University, PO Box 208114, New Haven, Connecticut 06520 8114, USA
Nat Biotechnol 27:66-75. 2009
..Our scoring procedure enables us to optimize experimental design by estimating the depth of sequencing required for a desired level of coverage and demonstrating that more than two replicates provides only a marginal gain in information...
- Datamonkey: rapid detection of selective pressure on individual sites of codon alignments
Sergei L Kosakovsky Pond
Antiviral Research Center, University of California, San Diego, CA 92103, USA
Bioinformatics 21:2531-3. 2005
..The methods range from very fast data exploration to some of the most complex models available in public domain software, and are implemented to run in parallel on a cluster of computers...
- High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome
School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, USA
BMC Genomics 9:312. 2008
..However, it is questionable how effective the sequencing of large numbers of short reads for species with essentially no prior gene sequence information will support contig assemblies and sequence annotation...
- MEME SUITE: tools for motif discovery and searching
Timothy L Bailey
Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
Nucleic Acids Res 37:W202-8. 2009
..All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net...
- Personalized copy number and segmental duplication maps using next-generation sequencing
Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA
Nat Genet 41:1061-7. 2009
..2 x 10(-16)). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology...
- Genome-wide mapping of in vivo protein-DNA interactions
David S Johnson
Department of Genetics, Stanford University School of Medicine, Stanford, CA, 94305 5120, USA
Science 316:1497-502. 2007
..96] and statistical confidence (P <10(-4)), properties that were important for inferring new candidate interactions. These include key transcription factors in the gene network that regulates pancreatic islet cell development...
- Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex
Department of Computer Science, UCB 430, University of Colorado, Boulder, Colorado 80309, USA
Nat Methods 5:235-7. 2008
- Automated generation of heuristics for biological sequence comparison
Guy St C Slater
The Ensembl Group, EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
BMC Bioinformatics 6:31. 2005
- SOAP2: an improved ultrafast tool for short read alignment
Beijing Genomics Institute at Shenzhen, Shenzhen, 518083, China
Bioinformatics 25:1966-7. 2009
..Additionally, this tool now supports multiple text and compressed file formats. A consensus builder has also been developed for consensus assembly and SNP detection from alignment of short reads on a reference genome...
- Real-time DNA sequencing from single polymerase molecules
Pacific Biosciences, 1505 Adams Drive, Menlo Park, CA 94025, USA
Science 323:133-8. 2009
..Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates...
- NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins
Kim D Pruitt
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Rm 6An 12J, 45 Center Drive, Bethesda, MD 20892 6510, USA
Nucleic Acids Res 35:D61-5. 2007
..The format of all RefSeq records is validated, and an increasing number of tests are being applied to evaluate the quality of sequence and annotation, especially in the context of complete genomic sequence...
- Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing
Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA
Nat Biotechnol 27:182-9. 2009
..One lane of Illumina sequence was sufficient to call high-confidence genotypes for 89% of the targeted exon space...
- Applications of next-generation sequencing technologies in functional genomics
BC Cancer Agency Genome Sciences Centre, Vancouver, BC, Canada
Genomics 92:255-64. 2008
..This review discusses applications of next-generation sequencing technologies in functional genomics research and highlights the transforming potential these technologies offer...
- Next-generation sequencing transforms today's biology
Stephan C Schuster
Pennsylvania State University, Center for Comparative Genomics and Bioinformatics, 310 Wartik Building, University Park, Pennsylvania 16802, USA
Nat Methods 5:16-8. 2008
..However, before stepping into the limelight, next-generation sequencing had to overcome the inertia of a field that relied on Sanger-sequencing for 30 years...
- Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor
Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, USA
BMC Bioinformatics 7:474. 2006
..Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases...
- A metagenomic survey of microbes in honey bee colony collapse disorder
Diana L Cox-Foster
Department of Entomology, Pennsylvania State University, University Park, PA 16802, USA
Science 318:283-7. 2007
..One organism, Israeli acute paralysis virus of bees, was strongly correlated with CCD...
- The genome of black cottonwood, Populus trichocarpa (Torr. & Gray)
G A Tuskan
Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
Science 313:1596-604. 2006
..Overrepresented exceptions in Populus include genes associated with lignocellulosic wall biosynthesis, meristem development, disease resistance, and metabolite transport...
- A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters
Biomedical Informatics Program, Stanford University, Stanford, CA 94305, USA
Proc Natl Acad Sci U S A 103:1412-7. 2006
- A phylogenomic study of birds reveals their evolutionary history
Shannon J Hackett
Zoology Department, Field Museum of Natural History, 1400 South Lake Shore Drive, Chicago, IL 60605, USA
Science 320:1763-8. 2008
..Our results provide a valuable resource for phylogenetic and comparative studies in birds...
- VISTA: computational tools for comparative genomics
Kelly A Frazer
Perlegen Sciences, Inc, 2021 Stierlin Court, Mountain View, CA 94043, USA
Nucleic Acids Res 32:W273-9. 2004
..We illustrate capabilities of the VISTA site by the analysis of a 180 kb interval on human chromosome 5 that encodes for the kinesin family member 3A (KIF3A) protein...
- Mapping short DNA sequencing reads and calling variants using mapping quality scores
The Wellcome Trust Sanger Institute, Hinxton CB10 1SA, United Kingdom
Genome Res 18:1851-8. 2008
..Both read mapping and genotype calling are evaluated on simulated data and real data. MAQ is accurate, efficient, versatile, and user-friendly. It is freely available at http://maq.sourceforge.net...
- Evaluation of methods for detecting recombination from DNA sequences: computer simulations
Department of Zoology, Brigham Young University, Provo, UT 84602, USA
Proc Natl Acad Sci U S A 98:13757-62. 2001
..Results shown here will provide some guidance in the selection of the most appropriate method/s for the analysis of the particular data at hand...
- Next-generation DNA sequencing
Department of Genome Sciences, University of Washington, Seattle, Washington 98195 5065, USA
Nat Biotechnol 26:1135-45. 2008
- Exome sequencing identifies the cause of a mendelian disorder
Sarah B Ng
Department of Genome Sciences, University of Washington, Seattle, Washington, USA
Nat Genet 42:30-5. 2010
..Exome sequencing of a small number of unrelated affected individuals is a powerful, efficient strategy for identifying the genes underlying rare mendelian disorders and will likely transform the genetic analysis of monogenic traits...
- The genome sequence of the filamentous fungus Neurospora crassa
James E Galagan
Whitehead Institute Center for Genome Research, 320 Charles Street, Cambridge, Massachusetts 02141, USA
Nature 422:859-68. 2003
..Genome analysis suggests that RIP has had a profound impact on genome evolution, greatly slowing the creation of new genes through genomic duplication and resulting in a genome with an unusually low proportion of closely related genes...
- Distribution and intensity of constraint in mammalian genomic sequence
Gregory M Cooper
Department of Genetics, Stanford University, Stanford, California 94305, USA
Genome Res 15:901-13. 2005
..We anticipate that GERP and the types of analyses it facilitates will provide further insights and improved annotation for the human genome as mammalian genome sequence data become richer...
- Comparative analysis of human gut microbiota by barcoded pyrosequencing
Anders F Andersson
Swedish Institute for Infectious Disease Control, Solna, Sweden
PLoS ONE 3:e2836. 2008
..Here we applied the technique to analyze microbial communities in throat, stomach and fecal samples. Our results demonstrate the applicability of barcoded pyrosequencing as a high-throughput method for comparative microbial ecology...
- Complete genome sequence of Salmonella enterica serovar Typhimurium LT2
Sidney Kimmel Cancer Center, 10835 Altman Row, San Diego, California 92121, USA
Nature 413:852-6. 2001
..Most of these homologues were previously unknown, and 50 may be exported to the periplasm or outer membrane, rendering them accessible as therapeutic or vaccine targets...
- Sequence-based species delimitation for the DNA taxonomy of undescribed insects
Department of Entomology, The Natural History Museum, London SW7 5BD, United Kingdom
Syst Biol 55:595-609. 2006
- A large genome center's improvements to the Illumina sequencing system
Michael A Quail
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
Nat Methods 5:1005-10. 2008
- Gene prediction with a hidden Markov model and a new intron submodel
Institut für Mikrobiologie und Genetik, Abteilung Bioinformatik, Universität Göttingen, Göttingen, Germany
Bioinformatics 19:ii215-25. 2003
..Gene finding programs have achieved relatively high accuracy on short genomic sequences but do not perform well on longer sequences with an unknown number of genes in them. Here existing programs tend to predict many false exons...
- Exploring microbial diversity and taxonomy using SSU rRNA hypervariable tag sequencing
Susan M Huse
Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts, United States of America
PLoS Genet 4:e1000255. 2008
..This technique allows the cost-effective exploration of changes in microbial community structure, including the rare biosphere, over space and time and can be applied immediately to initiatives, such as the Human Microbiome Project...
- Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA
R M Andrews
Nat Genet 23:147. 1999
- Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome
Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH 4058 Basel, Switzerland
Nat Genet 39:457-66. 2007
..Moreover, we observe that inactive unmethylated CpG island promoters show elevated levels of dimethylation of Lys4 of histone H3, suggesting that this chromatin mark may protect DNA from methylation...
- Community genomics among stratified microbial assemblages in the ocean's interior
Edward F DeLong
Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Science 311:496-503. 2006
..Comparative genomic analyses of stratified microbial communities have the potential to provide significant insight into higher-order community organization and dynamics...
- ABySS: a parallel assembler for short read sequence data
Jared T Simpson
Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia V5Z 4E6, Canada
Genome Res 19:1117-23. 2009
..Analysis of these contigs identified polymorphic and novel sequences not present in the human reference assembly, which were validated by alignment to alternate human assemblies and to other primate genomes...
- Consed: a graphical tool for sequence finishing
Department of Molecular Biotechnology, University of Washington, Seattle, Washington 98195 7730, USA
Genome Res 8:195-202. 1998
..More information is available at http://www.genome.washington.edu/consed/consed.html...
- Design and analysis of ChIP-seq experiments for DNA-binding proteins
Peter V Kharchenko
Center for Biomedical Informatics, Harvard Medical School, 10 Shattuck St, Boston, Massachusetts 02115, USA
Nat Biotechnol 26:1351-9. 2008
..We also analyze the relationship between the depth of sequencing and characteristics of the detected binding positions, and provide a method for estimating the sequencing depth necessary for a desired coverage of protein binding sites...
- Complete genome sequence of a virulent isolate of Streptococcus pneumoniae
The Institute for Genomic Research (TIGR), 9712 Medical Center Drive, Rockville, MD 20850, USA
Science 293:498-506. 2001
..Comparative genome hybridization with DNA arrays revealed strain differences in S. pneumoniae that could contribute to differences in virulence and antigenicity...
- High-resolution mapping and characterization of open chromatin across the genome
Alan P Boyle
Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
Cell 132:311-22. 2008
..In addition, and unexpectedly, our analyses have uncovered detailed features of nucleosome structure...
- TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders
W H Majoros
Bioinformatics Department, The Institute for Genomic Research, Rockville, MD 20850, USA
Bioinformatics 20:2878-9. 2004
..Both programs have been used at TIGR for the annotation of the Aspergillus fumigatus and Toxoplasma gondii genomes...
- Biological identifications through DNA barcodes
Paul D N Hebert
Department of Zoology, University of Guelph, Guelph, Ontario N1G 2W1, Canada
Proc Biol Sci 270:313-21. 2003
..Its assembly will also generate important new insights into the diversification of life and the rules of molecular evolution...
- Evaluation of next generation sequencing platforms for population targeted sequencing studies
Scripps Genomic Medicine, Scripps Translational Science Institute, The Scripps Research Institute, La Jolla, CA 92037, USA
Genome Biol 10:R32. 2009
..To evaluate these platforms for this application, we analyzed human sequence generated by the Roche 454, Illumina GA, and the ABI SOLiD technologies for the same 260 kb in four individuals...
- Targeted capture and massively parallel sequencing of 12 human exomes
Sarah B Ng
Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
Nature 461:272-6. 2009
..This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of non-synonymous variants by predicted functional impact...
- Deciphering human immunodeficiency virus type 1 transmission and early envelope diversification by single-genome amplification and sequencing
Jesus F Salazar-Gonzalez
Department of Medicine, University of Alabama at Birmingham, 720 20th Street South, Kaul 816, Birmingham, AL 35294, USA
J Virol 82:3952-70. 2008
- Tablet--next generation sequence assembly visualization
Genetics Programme, Scottish Crop Research Institute, Invergowrie, Dundee, DD2 5DA, UK
Bioinformatics 26:401-2. 2010
..Tablet is both multi-core aware and memory efficient, allowing it to handle assemblies containing millions of reads, even on a 32-bit desktop machine...
- Comparative genomics of the neglected human malaria parasite Plasmodium vivax
Jane M Carlton
The Institute for Genomic Research/J Craig Venter Institute, 9704 Medical Research Drive, Rockville, Maryland 20850, USA
Nature 455:757-63. 2008
..Completion of the P. vivax genome provides the scientific community with a valuable resource that can be used to advance investigation into this neglected species...
- The Ensembl genome database project
The Wellcome Trust Sanger Institute and European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
Nucleic Acids Res 30:38-41. 2002
..The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops...
- The impact of next-generation sequencing technology on genetics
Elaine R Mardis
Genome Sequencing Center, Washington University School of Medicine, St Louis, MO 63108, USA
Trends Genet 24:133-41. 2008
..Here I survey next-generation sequencing technologies and consider how they can provide a more complete picture of how the genome shapes the organism...
- SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries
Curtis P van Tassell
Bovine Functional Genomics Laboratory, United States Department of Agriculture, Agricultural Research Service, 10300 Baltimore Avenue, Beltsville, Maryland 20705, USA
Nat Methods 5:247-52. 2008
..This approach for simultaneous de novo discovery of high-quality SNPs and population characterization of allele frequencies may be applied to any species with at least a partially sequenced genome...
- Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
Burnham Institute for Medical Research, La Jolla, CA 92037, USA
Bioinformatics 22:1658-9. 2006
..All these programs can handle huge datasets with millions of sequences and can be hundreds of times faster than methods based on the popular sequence comparison and database search tools, such as BLAST...
- Short pyrosequencing reads suffice for accurate microbial community analysis
Department of Chemistry and Biochemistry, UCB 215, University of Colorado at Boulder, Boulder, CO 80309-0215, USA
Nucleic Acids Res 35:e120. 2007
- MEME: discovering and analyzing DNA and protein sequence motifs
Timothy L Bailey
Institute of Molecular Bioscience, The University of Queensland, St Lucia, QLD 4072, Australia
Nucleic Acids Res 34:W369-73. 2006
..This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance...
- The uncultured microbial majority
Michael S Rappé
Department of Microbiology, Oregon State University, Corvallis, Oregon 97331, USA
Annu Rev Microbiol 57:369-94. 2003
..Genome sequence information that would allow ribosomal RNA gene trees to be related to broader patterns in microbial genome evolution is scant, and therefore microbial diversity remains largely unexplored territory...
- Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential
Department of Plant Breeding, USDA-ARS Center for Agricultural Bioinformatics, Cornell University, Ithaca, New York 14853-1901, USA
Genome Res 11:1441-52. 2001
..This contribution brings the number of microsatellite markers that have been rigorously evaluated for amplification, map position, and allelic diversity in Oryza spp. to a total of 500...
- Java Treeview--extensible visualization of microarray data
Alok J Saldanha
Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
Bioinformatics 20:3246-8. 2004
..An applet version is also available that can be used on any website with no special server-side setup...
- An Eulerian path approach to DNA fragment assembly
P A Pevzner
Department of Computer Science and Engineering, University of California, San Diego, La Jolla, USA
Proc Natl Acad Sci U S A 98:9748-53. 2001
..EULER, in contrast to the Celera assembler, does not mask such repeats but uses them instead as a powerful fragment assembly tool...