Binding-Site Modeling with Multiple-Instance Machine-Learning

Summary

Principal Investigator: Ajay N Jain
Abstract: DESCRIPTION (provided by applicant): This proposal is entitled "Binding-Site Modeling with Multiple-Instance Machine-Learning." One of the most challenging and longest-studied problems in computer-aided drug design has been affinity prediction of small-molecule ligands for their cognate protein targets. Despite decades of work, quantitative structure-activity relationship (QSAR) prediction approaches still suffer from poor accuracy, especially when predicting outside of closely related series of molecules. Even with high-quality structures of target proteins, approaches grounded in physics are also far from robust and accurate enough for reliable use in drug lead optimization. This proposal will build upon a foundation in multiple-instance machine learning applied to computer-aided drug design problems and develop a robust, accurate, and practically applicable affinity prediction methodology. The methodology requires only ligand structures and associated activity data for training, and it induces a virtual protein binding site composed of molecular fragments. The virtual binding pocket (or "pocketmol") is used in conjunction with a scoring function developed originally for molecular docking. The pocketmol configuration is chosen such that the optimal conformation and alignment of each training ligand (based on the docking scoring function) yields a score close to the known experimental value. Feasibility has been demonstrated in papers involving both membrane-bound receptors and enzymes. However, multiple challenges remain and are the subject of the proposed research.
There are three key issues. First, there exist many pocketmols that satisfy the requirements of fitting the training data, so general solutions must be developed to address the inductive bias of the learning procedure as well as model selection after the procedure. Second, since any particular model is the product of a learning process, it will have some domain of applicability, with some new molecules likely to be predicted well and others poorly. Further, the model will be better informed by learning with certain new molecules but not others. We must develop solutions for estimating confidence of predictions for new molecules as well as for identifying particular molecules that will be highly informative. Third, the operational application of these methods involves model building, guided chemical synthesis, and iterative refinement of models. Convincing validation will require application on temporal series of molecules synthesized for multiple targets of pharmaceutical interest. The proposed work will develop novel methods to address these challenges and will establish extensive validation on multiple pharmaceutically relevant temporal series of small molecules that were the subject of real-world lead-optimization exercises.
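The multiple-instance idea in the abstract (each ligand is scored at its best pose, and the model is tuned until those best-pose scores match measured activities) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the proposal's pocketmol induction or its docking scoring function: each ligand is represented as a "bag" of candidate poses, each pose as a hypothetical vector of interaction features, the scoring function is a plain linear model, and the activities are synthetic.

```python
import numpy as np

def predict(weights, poses):
    """Score a ligand as the best (maximum) score over its candidate poses."""
    return float(np.max(poses @ weights))

def fit(bags, activities, n_features, lr=0.05, epochs=200):
    """Alternate between picking each ligand's best pose and nudging the weights."""
    weights = np.zeros(n_features)
    for _ in range(epochs):
        for poses, activity in zip(bags, activities):
            scores = poses @ weights
            best = int(np.argmax(scores))        # best pose under the current model
            error = scores[best] - activity      # signed prediction error
            weights -= lr * error * poses[best]  # gradient step on squared error
    return weights

def mse(weights, bags, activities):
    """Mean squared error between best-pose scores and target activities."""
    preds = np.array([predict(weights, poses) for poses in bags])
    return float(np.mean((preds - np.asarray(activities)) ** 2))

# Synthetic data (hypothetical): 20 ligands, 5 poses each, 3 features per pose.
rng = np.random.default_rng(0)
true_weights = np.array([1.0, 0.5, 2.0])
bags = [rng.uniform(0.0, 1.0, size=(5, 3)) for _ in range(20)]
activities = [predict(true_weights, poses) for poses in bags]

initial_error = mse(np.zeros(3), bags, activities)
weights = fit(bags, activities, n_features=3)
final_error = mse(weights, bags, activities)
print(f"MSE before fitting: {initial_error:.3f}, after: {final_error:.3f}")
```

Because the best pose is re-selected every time the weights change, different weight vectors can fit the same training data through different pose choices. This is a toy analogue of the first issue raised in the abstract, that many models can satisfy the training data.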
Funding Period: 2013-01-01 - 2016-12-31
more information: NIH RePORT

Detail Information

Publications (3)

  1. Protein function annotation by local binding site surface similarity
    Russell Spitzer
    Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California
    Proteins 82:679-94. 2014
    ..A panel of 12 currently unannotated proteins was also screened, resulting in a large number of statistically significant binding site matches, some of which suggest likely functions for the poorly characterized proteins...
  2. A structure-guided approach for protein pocket modeling and affinity prediction
    Rocco Varela
    Certara L P, St Louis, MO, USA
    J Comput Aided Mol Des 27:917-34. 2013
    ..Structure-guidance for the QMOD method yielded significant performance improvements, both for affinity and pose prediction, especially in cases where predictions were made on ligands very different from those used for model induction. ..
  3. Prediction of off-target drug effects through data fusion
    Emmanuel R Yera
    Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
    Pac Symp Biocomput 19:160-71. 2014
    ..For prediction of off-target effects, 3D-similarity performed best as a single modality, but combining all methods produced performance gains. Striking examples of structurally surprising off-target predictions are presented. ..

Research Grants (30)

  1. Fundamental Studies of RNA Folding
    Daniel Herschlag; Fiscal Year: 2013
  2. Rational Design of antivirals targeted to HIV-1 capsid
    Asim K Debnath; Fiscal Year: 2013
    ..The studies described in this proposal may lead to the development of a new class of antiretroviral therapeutics targeting the HIV-1 capsid. ..
  3. CANCER CENTER SUPPORT GRANT
    TONY R HUNTER; Fiscal Year: 2013