Unlocking new discoveries through a scale shift in data and rigorous, real-world-relevant evaluations

About PLINDER

Protein-ligand interactions are foundational to scientific understanding and to the discovery of therapies. However, to date, no large, high-quality dataset with real-life relevant evaluations exists.

PLINDER is an academic-industry collaboration to address this, driven by VantAI, NVIDIA, MIT, and the Computational Structural Biology group at the University of Basel & SIB Swiss Institute of Bioinformatics (co-organizers of CASP). We aim to provide a gold-standard dataset and evaluations to push the field of computational protein-ligand interaction prediction forward.

Expanding the universe of accessible protein-ligand interactions
PLINDER Provides

10x More Data

Paired Unbound & Predicted Structures

High Quality Evaluations

Real-Life Relevant Metrics

Explore PLINDER

Explore Interactively

  • Each node is a plinder system with the graph laid out based on shared protein-ligand interactions
  • Change the node size and color using different system annotations in the General tab
  • Search for properties of interest (domains, oligomeric states etc.) in the Info tab
  • Check out subsets of plinder fulfilling numeric constraints in the Analysis tab
  • Restrict the date of PDB release using the timeline at the bottom
How PLINDER Was Created

Ingest

  • >400k protein-ligand interaction (PLI) systems across >11k SCOP domains and >50k unique small molecules
  • Paired unbound (apo) and AlphaFold2 predicted structures
  • 500+ annotations, including protein and ligand properties, quality, and more
  • Automated curation pipeline with regular updates

Compare

  • Extensive similarity computation across 14 metrics with >20B scores
  • Including interaction fingerprints, pocket structural similarity and many more
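
As a rough illustration of what one such similarity score can look like, the sketch below computes a Tanimoto (Jaccard) score between two interaction fingerprints. The set-of-pairs fingerprint representation, the residue names, and the function name are assumptions made for this example; they are not taken from the PLINDER pipeline.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of interaction features."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical fingerprints: each feature is a (residue, interaction type) pair.
fp_a = frozenset({("ASP86", "hbond"), ("PHE80", "pi_stack"), ("LEU134", "hydrophobic")})
fp_b = frozenset({("ASP86", "hbond"), ("LEU134", "hydrophobic"), ("LYS33", "salt_bridge")})

# Two shared features out of four distinct features overall.
score = tanimoto(fp_a, fp_b)
print(score)
```

A set-based score like this is cheap to compute, which matters when the number of pairwise comparisons runs into the billions.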

Split

  • Pre-set train/val/test splits that measure generalizability
  • Splits minimize leakage and maximize test-set quality
  • Test subsets with novel proteins, pockets, interactions, ligands
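
One common way to limit leakage, sketched here under stated assumptions, is to cluster systems whose pairwise similarity exceeds a threshold and then assign whole clusters to a single split. The threshold, the greedy assignment rule, and all names below are hypothetical; this is not a description of the actual PLINDER splitting procedure.

```python
from collections import defaultdict


def cluster_by_similarity(ids, pair_scores, threshold):
    """Group ids into connected components of the graph whose edges are
    pairs with score >= threshold (union-find with path halving)."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), s in pair_scores.items():
        if s >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for i in ids:
        clusters[find(i)].append(i)
    return list(clusters.values())


def split_clusters(clusters, test_fraction):
    """Greedily place whole clusters in test (smallest first) while the
    test quota is not exceeded; everything else goes to train."""
    quota = test_fraction * sum(len(c) for c in clusters)
    train, test = [], []
    for c in sorted(clusters, key=len):
        if len(test) + len(c) <= quota:
            test.extend(c)
        else:
            train.extend(c)
    return train, test


# Hypothetical systems and pairwise similarity scores.
ids = ["s1", "s2", "s3", "s4", "s5", "s6"]
pair_scores = {("s1", "s2"): 0.9, ("s2", "s3"): 0.8,
               ("s4", "s5"): 0.3, ("s5", "s6"): 0.75}

clusters = cluster_by_similarity(ids, pair_scores, threshold=0.7)
train, test = split_clusters(clusters, test_fraction=0.5)
print(sorted(train), sorted(test))
```

Because similar systems always land in the same cluster, no pair scoring at or above the threshold can straddle the train/test boundary.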

Evaluate

  • Retrained existing SOTA methods for comparison
  • Standardized eval harness with CASP-CAPRI compatible metrics
  • Evaluation on holo, apo and predicted structures
  • Leaderboard and Challenges (soon)
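
To make the idea of a pose-accuracy metric concrete, here is a minimal sketch of a plain ligand RMSD between a predicted and a reference pose. It assumes both poses list the same atoms in the same order and are already in the same coordinate frame; it performs no symmetry correction or superposition, so it is a simplification and not the metric implemented in the evaluation harness. Coordinates and names are made up for the example.

```python
import math


def ligand_rmsd(pred, ref):
    """Root-mean-square deviation between matched atom coordinates (in Å).

    Assumes identical atom ordering and a shared coordinate frame.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    sq = sum((p[k] - r[k]) ** 2 for p, r in zip(pred, ref) for k in range(3))
    return math.sqrt(sq / len(pred))


# Hypothetical two-atom ligand; the predicted pose is shifted 1 Å along z.
ref_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
pred_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]

rmsd = ligand_rmsd(pred_pose, ref_pose)
print(round(rmsd, 3))
```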
Documentation
Upcoming
  • Prediction challenge: PLINDER will be used for the PLI challenge at the 2024 NeurIPS ML in Structural Biology (MLSB) workshop
  • More data: Measured and predicted binding affinities
  • More data: Cryptic pocket and promiscuous ligand annotations
  • More data: Data augmentation strategies, including van der Mers and cross-docking, plus minimized holo structures to expand apo coverage
  • Leaderboard: Across different tasks and use cases; multiple upcoming state-of-the-art works have already adopted PLINDER
  • Regular updates: Data ingestion is fully automated and backed by extensive metrics, so the dataset will be updated at regular intervals