Protein-protein interactions play a ubiquitous role in biological function. Knowledge of the three-dimensional (3D) structures of the complexes they form is essential for understanding the structural basis of those interactions and how they orchestrate key cellular processes.

Computational docking has become an indispensable alternative to the expensive and time-consuming experimental approaches for determining 3D structures of protein complexes. Despite recent progress, identifying near-native models from a large set of conformations sampled by docking – the so-called scoring problem – still has considerable room for improvement.

We present here MetaScore, a new machine-learning based approach to improve the scoring of docked conformations. MetaScore utilizes a random forest (RF) classifier trained to distinguish near-native from non-native conformations using a rich set of features extracted from the respective protein-protein interfaces. These include physico-chemical properties, energy terms, interaction propensity-based features, geometric properties, interface topology features, evolutionary conservation and also scores produced by traditional scoring functions (SFs).

MetaScore scores docked conformations by simply averaging the score produced by the RF classifier with that produced by any traditional SF. We demonstrate that:

  1. MetaScore consistently outperforms each of nine traditional SFs included in this work in terms of success rate and hit rate evaluated over the top 10 predicted conformations.
  2. MetaScore-Ensemble, an ensemble method that combines 10 variants of MetaScore (each obtained by combining the RF score with one of the traditional SFs), outperforms each of the individual MetaScore variants.
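
The averaging step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the min-max rescaling of the traditional SF scores, the assumption that lower SF scores are better, and the choice to combine the MetaScore variants by a plain mean are all assumptions made here for illustration.

```python
from statistics import mean


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def metascore(rf_scores, sf_scores, sf_lower_is_better=True):
    """Average the RF score with a rescaled traditional SF score.

    Assumption: rf_scores are probabilities in [0, 1] (higher is better),
    and SF scores are energy-like, so they are rescaled and flipped first.
    """
    sf = min_max(sf_scores)
    if sf_lower_is_better:
        sf = [1.0 - s for s in sf]
    return [(r + s) / 2.0 for r, s in zip(rf_scores, sf)]


def metascore_ensemble(rf_scores, sf_score_lists):
    """Combine several MetaScore variants (assumed here: a simple mean)."""
    variants = [metascore(rf_scores, sf) for sf in sf_score_lists]
    return [mean(column) for column in zip(*variants)]


def top_n(scores, n=10):
    """Indices of the n best-scoring conformations, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n]


# Toy data: three docked conformations, two hypothetical traditional SFs.
rf = [0.9, 0.2, 0.6]
sf_a = [-30.0, -10.0, -20.0]
sf_b = [-5.0, -1.0, -2.0]

variant_a = metascore(rf, sf_a)
ensemble = metascore_ensemble(rf, [sf_a, sf_b])
best_two = top_n(ensemble, n=2)
```

Rescaling before averaging keeps either score from dominating purely because of its numeric range; whether and how the published method normalizes the scores should be checked against the preprint.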

We conclude that the performance of traditional SFs can be improved upon by judiciously leveraging machine learning.

Yong Jung, Cunliang Geng, Alexandre M. J. J. Bonvin, Li C. Xue, Vasant G. Honavar (2022):
MetaScore: A novel machine-learning based approach to improve traditional scoring functions for scoring protein-protein docking conformations.
bioRxiv 2021.10.06.463442 (preprint)

About the author

Stian works in the School of Computer Science at the University of Manchester, in Carole Goble’s eScience Lab, as a technical software architect and researcher. In addition to BioExcel, Stian’s involvements include Open PHACTS (pharmacological data warehouse), Common Workflow Language (CWL), Apache Taverna (scientific workflow system), Linked Data and identifiers, research objects (open science) and digital preservation, myExperiment (sharing scientific workflows), provenance (where did things come from and who did it) and annotations (who said what).