Fengzhu Sun

Summary

Affiliation: University of Southern California
Country: USA

Publications

  1. pmc New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing
    Kai Song
    Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
    Brief Bioinform 15:343-53. 2014
  2. pmc Comparison of metagenomic samples using sequence signatures
    Bai Jiang
    MOE Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic and Systems Biology, TNLIST Department of Automation, Tsinghua University, Beijing 100084, China
    BMC Genomics 13:730. 2012
  3. pmc Conservation and implications of eukaryote transcriptional regulatory regions across multiple species
    Lin Wan
    School of Mathematical Sciences, Peking University, Beijing, PR China
    BMC Genomics 9:623. 2008
  4. pmc Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples
    Kui Zhang
    Department of Biostatistics, School of Public Health, University of Alabama, Birmingham, AL 35294, USA
    BMC Genet 6:51. 2005
  5. pmc Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny
    Qin Chang
    School of Mathematics, Shandong University, Jinan, Shandong, PR China
    BMC Bioinformatics 12:118. 2011
  6. pmc Usefulness and limitations of dK random graph models to predict interactions and functional homogeneity in biological networks under a pseudo-likelihood parameter estimation approach
    WenHui Wang
    School of Mathematics, Shandong University, Jinan, Shandong 250100, PR China
    BMC Bioinformatics 10:277. 2009
  7. pmc Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies
    Kui Zhang
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089 1113, USA
    Genome Res 14:908-16. 2004
  8. pmc Chromatin regulation and gene centrality are essential for controlling fitness pleiotropy in yeast
    Linqi Zhou
    Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
    PLoS ONE 4:e8086. 2009
  9. pmc An integrated approach to the prediction of domain-domain interactions
    Hyunju Lee
    Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
    BMC Bioinformatics 7:269. 2006
  10. ncbi Prediction of protein function using protein-protein interaction data
    Minghua Deng
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 1113, USA
    Proc IEEE Comput Soc Bioinform Conf 1:197-206. 2002

Detail Information

Publications (54)

  1. pmc New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing
    Kai Song
    Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
    Brief Bioinform 15:343-53. 2014
  2. pmc Comparison of metagenomic samples using sequence signatures
    Bai Jiang
    MOE Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic and Systems Biology, TNLIST Department of Automation, Tsinghua University, Beijing 100084, China
    BMC Genomics 13:730. 2012
    ..Still, the applications of sequence signature methods for the comparison of metagenomic samples have not been well studied...
  3. pmc Conservation and implications of eukaryote transcriptional regulatory regions across multiple species
    Lin Wan
    School of Mathematical Sciences, Peking University, Beijing, PR China
    BMC Genomics 9:623. 2008
  4. pmc Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples
    Kui Zhang
    Department of Biostatistics, School of Public Health, University of Alabama, Birmingham, AL 35294, USA
    BMC Genet 6:51. 2005
    ..It is also important to know how much power is gained when tag SNPs instead of the same number of randomly chosen SNPs are used...
  5. pmc Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny
    Qin Chang
    School of Mathematics, Shandong University, Jinan, Shandong, PR China
    BMC Bioinformatics 12:118. 2011
    ..However, W-UniFrac does not consider the variation of the weights under random sampling resulting in less power detecting the differences between communities...
  6. pmc Usefulness and limitations of dK random graph models to predict interactions and functional homogeneity in biological networks under a pseudo-likelihood parameter estimation approach
    WenHui Wang
    School of Mathematics, Shandong University, Jinan, Shandong 250100, PR China
    BMC Bioinformatics 10:277. 2009
    ..Complex statistical network models can potentially more accurately describe the networks, but it is not clear whether such complex models are better suited to find biologically meaningful subnetworks...
  7. pmc Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies
    Kui Zhang
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089 1113, USA
    Genome Res 14:908-16. 2004
    ..The power of association studies based on tag SNPs using genotype data is similar to that using haplotype data...
  8. pmc Chromatin regulation and gene centrality are essential for controlling fitness pleiotropy in yeast
    Linqi Zhou
    Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
    PLoS ONE 4:e8086. 2009
    ..Previously, the functions of gene products that distinguish essential from nonessential genes were characterized. However, the functions of products of non-essential genes that contribute to fitness remain minimally understood...
  9. pmc An integrated approach to the prediction of domain-domain interactions
    Hyunju Lee
    Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
    BMC Bioinformatics 7:269. 2006
    ..The availability of many diverse biological data sets provides an opportunity to discover the underlying domain interactions within protein interactions through an integration of these biological data sets...
  10. ncbi Prediction of protein function using protein-protein interaction data
    Minghua Deng
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 1113, USA
    Proc IEEE Comput Soc Bioinform Conf 1:197-206. 2002
    ..We show that our approach outperforms other available methods for function prediction based on protein interaction data...
  11. pmc Inferring activity changes of transcription factors by binding association with sorted expression profiles
    Chao Cheng
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 2910, USA
    BMC Bioinformatics 8:452. 2007
    ..In some other models, the target genes of a TF are first determined by a significance cutoff to binding affinity scores, and then expression differentiation is checked between the target and other genes...
  12. pmc Sequence-based prioritization of nonsynonymous single-nucleotide polymorphisms for the study of disease mutations
    Rui Jiang
    Molecular and Computational Biology Program, Signal and Image Processing Institute, Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089 2910, USA
    Am J Hum Genet 81:346-60. 2007
    ..Application of this approach to unclassified mutations suggests that there are 10 suspicious mutations likely to cause diseases, and there is strong support for this in the literature...
  13. ncbi Diffusion kernel-based logistic regression models for protein function prediction
    Hyunju Lee
    Department of Computer Science, University of Southern California, Los Angeles, 90089, USA
    OMICS 10:40-55. 2006
    ..The advantages of the KLR model include its simplicity as well as its ability to explore the contribution of neighbors to the functions of proteins of interest...
  14. ncbi CGI: a new approach for prioritizing genes by combining gene expression and protein-protein interaction data
    Xiaotu Ma
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 2910, USA
    Bioinformatics 23:215-21. 2007
    ..Several integrative methods have been developed when a set of candidate genes for the phenotype is available. However, how to prioritize genes for phenotypes when no candidates are available is still a challenging problem...
  15. ncbi An integrative approach for causal gene identification and gene regulatory pathway inference
    Zhidong Tu
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, USA
    Bioinformatics 22:e489-96. 2006
    ..Novel approaches are needed to both infer the causal genes and generate hypothesis on the underlying regulatory mechanisms...
  16. ncbi MARD: a new method to detect differential gene expression in treatment-control time courses
    Chao Cheng
    Molecular and Computational Biology Program, Department of Biological Sciences, Computational Biology, University of Southern California, Los Angeles, CA, USA
    Bioinformatics 22:2650-7. 2006
    ..We propose a novel method to identify the differentially expressed genes between two time courses, which avoids direct comparison of gene expression patterns between the two time courses...
  17. ncbi Prediction of protein function using protein-protein interaction data
    Minghua Deng
    Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 1113, USA
    J Comput Biol 10:947-60. 2003
    ..We show that our approach outperforms other available methods for function prediction based on protein interaction data. The supplementary data is available at www-hto.usc.edu/~msms/ProteinFunction...
  18. pmc The effects of protein interactions, gene essentiality and regulatory regions on expression variation
    Linqi Zhou
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 2913, USA
    BMC Syst Biol 2:54. 2008
    ..Most importantly we study how these factors interact with each other influencing gene expression variation...
  19. pmc Detecting susceptibility genes in case-control studies using set association
    Sung Kim
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
    BMC Genet 4:S9. 2003
    ..This is due to the wide spacing between the markers and the lack of association between the marker loci and the simulated phenotype...
  20. pmc Alignment-free sequence comparison (II): theoretical power of comparison statistics
    Lin Wan
    Molecular and Computational Biology, University of Southern California, Los Angeles, California 90089 2910, USA
    J Comput Biol 17:1467-90. 2010
    ..The program to calculate the power of D2, D2* and D2S can be downloaded from http://meta.cmb.usc.edu/d2. Supplementary Material is available at www.liebertonline.com/cmb...
  21. pmc Multiple alignment-free sequence comparison
    Jie Ren
    School of Mathematics, Peking University, Beijing 100871, PR China, Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089 2910, USA, MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST Department of Automation, Tsinghua University, Beijing 100084, PR China and Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
    Bioinformatics 29:2690-8. 2013
    ..The two tasks we consider are, first, to identify sequences that are similar to a set of target sequences, and, second, to measure the similarity within a set of sequences...
  22. pmc A model-based approach to selection of tag SNPs
    Pierre Nicolas
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, USA
    BMC Bioinformatics 7:303. 2006
    ..It also provides a machinery for the prediction of tagged SNPs and thereby to assess the performances of tag sets through their ability to predict larger SNP sets...
  23. pmc Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates
    Li C Xia
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 2910, USA
    BMC Syst Biol 5:S15. 2011
    ..With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval...
  24. ncbi The mutation process of microsatellites during the polymerase chain reaction
    Yinglei Lai
    Department of Mathematics, University of Southern California, Los Angeles, CA 90089 1113, USA
    J Comput Biol 10:143-55. 2003
    ..The theoretical basis for the proposed method is also given. We apply the method to experimental data on poly-A and poly-CA repeats...
  25. pmc Marine bacterial, archaeal and protistan association networks reveal ecological linkages
    Joshua A Steele
    Department of Biological Sciences and Wrigley Institute for Environmental Studies, University of Southern California, Los Angeles, CA, USA
    ISME J 5:1414-25. 2011
    ..This approach provides new insights into the natural history of microbes...
  26. ncbi Mapping Gene Ontology to proteins based on protein-protein interaction data
    Minghua Deng
    Department of Biological Sciences, Molecular and Computational Biology Program, University of Southern California, 1042 West 36th Place, Los Angeles, CA 90089 1113, USA
    Bioinformatics 20:895-902. 2004
    ..Combining both GO and protein-protein interaction data allows the prediction of function for unknown proteins...
  27. ncbi A dynamic programming algorithm for binning microbial community profiles
    Quansong Ruan
    Department of Mathematics, University of Southern California, 3620 South Vermont Avenue, KAP 108, Los Angeles, California 90089 2532, USA
    Bioinformatics 22:1508-14. 2006
    ..We have developed a dynamic programming algorithm based binning method for ARISA data analysis which minimizes the overall differences between replicates from the same sampling location and time...
  28. pmc Ecdysone receptor acts in fruitless-expressing neurons to mediate Drosophila courtship behaviors
    Justin E Dalton
    Section of Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
    Curr Biol 19:1447-52. 2009
    ..Thus, EcR-A is required in fru P1-expressing neurons for wild-type male courtship behaviors and the establishment of male-specific neuronal architecture...
  29. pmc Accurate genome relative abundance estimation based on shotgun metagenomic reads
    Li C Xia
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California, United States of America
    PLoS ONE 6:e27992. 2011
    ..It will be increasingly useful as an accurate and robust tool for abundance estimation with the growing size of read sets and the expanding database of reference genomes...
  30. ncbi An integrated probabilistic model for functional prediction of proteins
    Minghua Deng
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1042 West 36th Place, Los Angeles, CA 90089 1113, USA
    J Comput Biol 11:463-75. 2004
    ..In contrast to using MIPS physical interactions only, the integrated approach combining all of the information increases the recall from 57% to 87% when the precision is set at 57%, an increase of 30%...
  31. pmc Integrating multiple protein-protein interaction networks to prioritize disease genes: a Bayesian regression approach
    Wangshu Zhang
    MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST Department of Automation, Tsinghua University, Beijing 100084, China
    BMC Bioinformatics 12:S11. 2011
    ..Meanwhile, independently constructed and maintained PPI networks are usually quite diverse in coverage and quality, making the selection of a suitable PPI network inevitable but difficult...
  32. pmc CEDER: accurate detection of differentially expressed genes by combining significance of exons using RNA-Seq
    Lin Wan
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA
    IEEE/ACM Trans Comput Biol Bioinform 9:1281-92. 2012
    ..We showed that CEDER can significantly increase the accuracy of existing methods for detecting DEGs on two benchmark RNA-Seq data sets and simulated datasets...
  33. pmc DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases
    Wangshu Zhang
    MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST Department of Automation, Tsinghua University, Beijing, China
    BMC Syst Biol 5:55. 2011
    ..Based on this assumption, we propose a Bayesian regression approach named "domainRBF" (domain Rank with Bayes Factor) to prioritize candidate domains for human complex diseases...
  34. ncbi Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors
    Quansong Ruan
    Department of Mathematics, University of Southern California, 3620 Vermont Avenue, KAP 108, Los Angeles, CA 90089 2532, USA
    Bioinformatics 22:2532-8. 2006
    ..However, this approach may not be able to capture more complex interactions which occur in situ; thus, alternative analyses were explored...
  35. pmc Searching for interpretable rules for disease mutations: a simulated annealing bump hunting strategy
    Rui Jiang
    Molecular and Computational Biology, University of Southern California, MCB201, 1050 Childs Way, Los Angeles, CA 90089 2910, USA
    BMC Bioinformatics 7:417. 2006
    ..Another limitation is that the prediction results are hard to be interpreted with physicochemical principles and biological knowledge...
  36. pmc Testing gene set enrichment for subset of genes: Sub-GSE
    Xiting Yan
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 2910, USA
    BMC Bioinformatics 9:362. 2008
    ..However, while most available methods for gene set enrichment analysis test the enrichment of the entire gene set, it is more likely that only a subset of the genes in the gene set may be related to the phenotypes of interest...
  37. pmc Haplotype block partition with limited resources and applications to human chromosome 21 haplotype data
    Kui Zhang
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1042 W 36th Place DRB 290, Los Angeles, CA 90089 1113, USA
    Am J Hum Genet 73:63-73. 2003
    ..We also calculate the distribution of block break points in intergenic regions, genes, exons, and coding regions and do not find any significant differences...
  38. doi Efficient statistical significance approximation for local similarity analysis of high-throughput time series data
    Li C Xia
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089 2910, USA
    Bioinformatics 29:230-7. 2013
    ..However, its applications to large scale high-throughput data are limited by slow permutation procedures for statistical significance evaluation...
  39. pmc Modeling RNA degradation for RNA-Seq with applications
    Lin Wan
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA
    Biostatistics 13:734-47. 2012
    ..In addition, the RNA degradation rate from our model is independent of the RNA length, consistent with previous studies on RNA decay rate...
  40. pmc Network motif identification in stochastic networks
    Rui Jiang
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA
    Proc Natl Acad Sci U S A 103:9404-9. 2006
  41. pmc Prioritizing functional modules mediating genetic perturbations and their phenotypic effects: a global strategy
    Li Wang
    Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089 2910, USA
    Genome Biol 9:R174. 2008
    ..We discovered that lethality is more conserved at the module level than at the gene level and we identified several potentially 'new' cancer-related biological processes...
  42. ncbi Sampling distribution for microsatellites amplified by PCR: mean field approximation and its applications to genotyping
    Yinglei Lai
    Department of Mathematics, University of Southern California, 1042 West 36th Place, DRB 288 Los Angeles, CA 90089 1113, USA
    J Theor Biol 228:185-94. 2004
    ..Based on the theories of mean field approximation and Bayesian statistics, we develop a novel method for microsatellite stutter pattern deconvolution...
  43. pmc Further understanding human disease genes by comparing with housekeeping genes and other genes
    Zhidong Tu
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, California 90089, USA
    BMC Genomics 7:31. 2006
    ..Here we perform a comparative study on the features of human essential, disease, and other genes...
  44. pmc A network-based integrative approach to prioritize reliable hits from multiple genome-wide RNAi screens in Drosophila
    Li Wang
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA
    BMC Genomics 10:220. 2009
    ..Therefore, integrating RNAi screening results with other information, such as protein-protein interaction (PPI), may help to address these issues...
  45. pmc A unified approach for allele frequency estimation, SNP detection and association studies based on pooled sequencing data using EM algorithms
    Quan Chen
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089 2910, USA
    BMC Genomics 14:S1. 2013
  46. pmc Haplotype block structure and its applications to association studies: power and study designs
    Kui Zhang
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles 90089, USA
    Am J Hum Genet 71:1386-94. 2002
    ..Our study also indicates that haplotype-based analysis can be much more powerful than marker-by-marker analysis...
  47. ncbi Microsatellite mutations during the polymerase chain reaction: mean field approximations and their applications
    Yinglei Lai
    Department of Mathematics, University of Southern California, 1042 West 36th Place, DRB288, Los Angeles, CA 90089 1113, USA
    J Theor Biol 224:127-37. 2003
    ..Simulation studies show that the moment estimation method can accurately recover the true mutation rate and probability of expansion. Finally, the method is applied to experimental data from single-molecule PCR experiments...
  48. ncbi The relationship between microsatellite slippage mutation rate and the number of repeat units
    Yinglei Lai
    Department of Mathematics, Department of Biological Sciences, University of Southern California, USA
    Mol Biol Evol 20:2123-31. 2003
    ..Our results agree with the length-dependent mutation pattern observed from experimental data, and they explain the scarcity of long microsatellites...
  49. pmc Somatic, germline and sex hierarchy regulated gene expression during Drosophila metamorphosis
    Matthew S Lebo
    Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA
    BMC Genomics 10:80. 2009
    ..To understand this complex morphogenetic process at a molecular-genetic level, whole genome microarray analyses were performed...
  50. pmc Inferring domain-domain interactions from protein-protein interactions
    Minghua Deng
    Program in Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089, USA
    Genome Res 12:1540-8. 2002
    ..We found several novel protein-protein interactions such as RPS0A interacting with APG17 and TAF40 interacting with SPT3, which are consistent with the functions of the proteins...
  51. pmc A dynamic programming algorithm for haplotype block partitioning
    Kui Zhang
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, 1042 West 36th Place, DRB142, Los Angeles, CA 90089 1113, USA
    Proc Natl Acad Sci U S A 99:7335-9. 2002
    ..We also apply the dynamic programming algorithm to the same data set based on haplotype diversity. A total of 3,982 representative SNPs and 1,884 blocks are identified to account for 95% of the haplotype diversity in each block...
  52. pmc Finding genetic overlaps among diseases based on ranked gene lists
    Quan Chen
    Molecular and Computational Biology Program, University of Southern California, Los Angeles, California
    J Comput Biol 22:111-23. 2015
    ..With an example of five vision-related diseases, we demonstrate how our methods can provide insights into the relationships among diseases based on their shared genetic mechanisms...
  53. pmc A novel class of tests for the detection of mitochondrial DNA-mutation involvement in diseases
    Fengzhu Sun
    Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA
    Am J Hum Genet 72:1515-26. 2003
    ..The fraction of patients with HTN potentially due to mtDNA-mutation involvement is estimated at 55% (95% CI 45%-65%)...
  54. pmc Taq DNA polymerase slippage mutation rates measured by PCR and quasi-likelihood analysis: (CA/GT)n and (A/T)n microsatellites
    Deepali Shinde
    Program in Molecular and Computational Biology and Department of Mathematics, University of Southern California, Los Angeles, CA 90089, USA
    Nucleic Acids Res 31:974-80. 2003
    ..The threshold and expansion to contraction ratios are explained on the basis of the active site structure of Taq DNA polymerase and models of the energetics of slippage events, respectively...