Mining Protein Binding Sites with Flexible Surface and Chemistry Matching

Jeffrey Van Voorst
Seminar

Many proteins, and by extension protein networks and biological processes, are affected by interactions with specific small molecules. Understanding the basis and mechanism of interactions between proteins and small molecules is also crucial for drug discovery and design. Given the number of novel protein structures solved by structural genomics initiatives, automated methods for mining structural datasets are important, especially for detecting features shared by binding sites that interact with the same molecule. For many proteins from structural genomics, both the binding sites and the physiologically relevant small molecules that interact with them are unknown. A computational tool that compares potential binding sites against a dataset of proteins with bound small molecules can therefore help propose candidate ligands for proteins of unknown function.

While at Michigan State University, I have designed and implemented a software package, SimSite3D, that searches a dataset of binding sites for those that are similar to a given query site. The initial goal was to develop a robust tool that can quickly perform the searches. By using rigid alignments and coarse samplings of the binding sites, the initial goal has been met, and SimSite3D has now become part of Pfizer Global R&D's drug discovery toolkit.
However, identifying otherwise unrelated proteins that bind the same molecule remains challenging. Binding site shape is known to be an important feature in protein science, so I added a triangulated mesh representation of the protein surfaces and used rigid refinement of the aligned surface meshes to improve the binding site orientations. Adding binding site surfaces and refining the orientations has been shown to capture more remote similarities and to increase the accuracy of alignments, at the cost of additional computation.
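
As a rough illustration of what a rigid alignment step involves, the sketch below superimposes one set of 3D points onto another by least squares. It is not SimSite3D's algorithm, which this abstract does not describe: it is a generic quaternion-based superposition (Horn's closed-form method, solved here with simple power iteration), and it assumes the point correspondences are already known. All function names are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y, z) points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(3)]

def rigid_align(source, target, iterations=500):
    """Least-squares rotation R and translation t mapping source onto target.

    Assumes source[i] corresponds to target[i] and that the points are not
    all identical. Hypothetical helper, not part of SimSite3D.
    """
    cs, ct = centroid(source), centroid(target)
    P = [[p[k] - cs[k] for k in range(3)] for p in source]
    Q = [[q[k] - ct[k] for k in range(3)] for q in target]
    # Cross-covariance: S[a][b] = sum_i P[i][a] * Q[i][b]
    S = [[sum(p[a] * q[b] for p, q in zip(P, Q)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    # Symmetric 4x4 matrix whose top eigenvector is the optimal unit quaternion
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shifting by the Frobenius norm makes all eigenvalues non-negative, so
    # power iteration converges to the eigenvector of N's largest eigenvalue.
    shift = math.sqrt(sum(v * v for row in N for v in row))
    quat = [1.0, 0.5, 0.25, 0.125]  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iterations):
        nxt = [sum(N[i][j] * quat[j] for j in range(4)) + shift * quat[i]
               for i in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        quat = [v / norm for v in nxt]
    w, x, y, z = quat
    # Rotation matrix of the unit quaternion (w, x, y, z)
    R = [
        [w*w + x*x - y*y - z*z, 2 * (x*y - w*z), 2 * (x*z + w*y)],
        [2 * (x*y + w*z), w*w - x*x + y*y - z*z, 2 * (y*z - w*x)],
        [2 * (x*z - w*y), 2 * (y*z + w*x), w*w - x*x - y*y + z*z],
    ]
    t = [ct[i] - sum(R[i][k] * cs[k] for k in range(3)) for i in range(3)]
    return R, t

def transform(points, R, t):
    """Apply p -> R p + t to every point."""
    return [[sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3)]
            for p in points]

def rmsd(a, b):
    """Root-mean-square deviation between paired point lists."""
    total = sum((u[k] - v[k]) ** 2 for u, v in zip(a, b) for k in range(3))
    return math.sqrt(total / len(a))
```

Calling rigid_align on a point set and a rotated, translated copy of it, then passing the result to transform, should give an RMSD near zero. In a real binding site search the correspondences are not given in advance and have to be proposed and scored, which is where most of the difficulty lies.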

Most recently, I have added flexible refinement of binding sites using articulated surface and chemistry matching to address the fact that proteins are intrinsically flexible. This method, related to techniques in computer animation and robotics, addresses the question of whether one binding site can be morphed into another, subject to the underlying molecular constraints. The preliminary results are encouraging, but given the challenges of protein flexibility, there are ample opportunities for future work.
