- A survey of sequence alignment algorithms for next-generation sequencing
Broad Institute, Cambridge, MA 02142, USA
Brief Bioinform 11:473-83. 2010
..We also consider future development of alignment algorithms with respect to emerging long sequence reads and the prospect of cloud computing...
- Pseudo-Sanger sequencing: massively parallel production of long and near error-free reads using NGS technology
Laboratory of Disease Genomics and Individualized Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, People's Republic of China
BMC Genomics 14:711. 2013
..Now that Illumina paired-end sequencing has the ability to read both ends from 600 bp or even 800 bp DNA fragments, how to fill in the gaps between paired ends to produce accurate long reads is intriguing but challenging...
- Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly
Medical Population Genetics Program, Broad Institute, 7 Cambridge Center, MA 02142, USA
Bioinformatics 28:1838-44. 2012
..In principle, every analysis based on whole-genome shotgun sequencing (WGS) data, such as SNP and insertion/deletion (INDEL) calling, can also be achieved with unitigs...
- Tabix: fast retrieval of sequence features from generic TAB-delimited files
Program in Medical Population Genetics, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
Bioinformatics 27:718-9. 2011
..It is particularly useful for manually examining local genomic features on the command line and enables genome viewers to support huge data files and remote custom tracks over networks...
- Improving SNP discovery by base alignment quality
Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, USA
Bioinformatics 27:1157-8. 2011
..The effectiveness of BAQ has been positively confirmed on large datasets by the 1000 Genomes Project analysis subgroup...
- A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data
Medical Population Genetics Program, Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, USA
Bioinformatics 27:2987-93. 2011
..g. multi-sample low-coverage sequencing or somatic mutation discovery). These applications press for the development of new methods for analyzing sequence data with uncertainty...
- Mapping the human reference genome's missing sequence by three-way admixture in Latino genomes
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Am J Hum Genet 93:411-21. 2013
..Genome assembly efforts and future builds of the human genome reference will be strongly informed by this localization of genes and other euchromatic sequences that are embedded within highly repetitive pericentromeric regions...
- The Sequence Alignment/Map format and SAMtools
Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK, Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA
Bioinformatics 25:2078-9. 2009
..SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments...
- Using population admixture to help complete maps of the human genome
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Nat Genet 45:406-14, 414e1-2. 2013
..We describe how knowledge of the locations of these sequences can inform disease association and genome biology studies...
- A direct characterization of human mutation based on microsatellites
James X Sun
Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
Nat Genet 44:1161-5. 2012
..We infer that the sequence mutation rate is 1.4-2.3×10⁻⁸ mutations per base pair per generation (90% credible interval) and that human-chimpanzee speciation occurred 3.7-6.6 million years ago...