GPDL
The first step in de novo protein design is building the backbone. Traditional methods rely on structure prediction networks such as RoseTTAFold, which are computationally intensive and slow, and which preserve functional sites with limited accuracy. The industry needs a backbone generation tool that is lighter, more accurate, and able to produce more diverse designs.

GPDL (Generative Protein Design by Language model) provides the answer. Built on the ESM2 protein language model and the ESMFold structure prediction network, it removes the reliance on multiple sequence alignments (MSAs) and natural templates entirely. Unlike RFdiffusion, which takes a diffusion model approach, GPDL employs a two-step strategy of "structure seeding + MCMC optimization". First, it generates an initial structure from the sequence and distance matrix of the functional motif, encoded with ESM-IF1. Then, through iterative optimization, it generates a novel scaffold while keeping the functional site accurate.
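The optimization step can be illustrated with a generic Metropolis-style MCMC loop. This is a minimal sketch and not the GPDL implementation: `toy_energy`, `mcmc_optimize`, the temperature, and the step count are all hypothetical, and the toy energy simply stands in for a structure-based loss that a real pipeline would compute from a predicted structure. What it does show is the core idea: mutate only scaffold positions, never the motif, and accept or reject each proposal by its energy change.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_energy(seq: str) -> float:
    """Hypothetical stand-in for a structure-based loss (lower is better).

    Purely illustrative: penalizes deviation from 40% hydrophobic content.
    A real pipeline would score a predicted structure instead.
    """
    hydrophobic = sum(seq.count(aa) for aa in "AILMFVW")
    return abs(hydrophobic / len(seq) - 0.4)


def mcmc_optimize(seq, motif_positions, energy_fn,
                  steps=2000, temperature=0.01, seed=0):
    """Metropolis sampling over scaffold residues with the motif held fixed."""
    rng = random.Random(seed)
    # Only positions outside the functional motif may be mutated.
    free = [i for i in range(len(seq)) if i not in motif_positions]
    current, current_e = seq, energy_fn(seq)
    best, best_e = current, current_e
    for _ in range(steps):
        pos = rng.choice(free)
        proposal = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        proposal_e = energy_fn(proposal)
        delta = proposal_e - current_e
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = proposal, proposal_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


# Hypothetical example: a 3-residue motif "HEH" embedded in a glycine placeholder.
start = "G" * 10 + "HEH" + "G" * 10
motif = {10, 11, 12}
best_seq, best_energy = mcmc_optimize(start, motif, toy_energy)
print(best_seq, round(best_energy, 4))
```

Keeping the motif indices out of the proposal set is what guarantees the functional residues survive every iteration, whatever the scoring function does.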



The Scaffold-Lab benchmark covers 24 standard functional motif design tasks. GPDL solved 22 of them (about 92%), the highest success rate in the industry, and produced 33.5% more unique designable structural clusters than RFdiffusion. Because it builds on ESMFold rather than RoseTTAFold, generation is 10 to 20 times faster. The model provides both inpainting and hallucination modes and covers α-helical, β-sheet, and mixed α/β topologies. After self-consistency validation, the designed backbones have a motif RMSD below 1 Å.
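A motif RMSD of the kind quoted above is usually measured after optimal superposition of the designed motif onto the reference motif. The sketch below uses the standard Kabsch algorithm with NumPy. It is a generic illustration rather than GPDL's or Scaffold-Lab's evaluation code: the function names are hypothetical, the choice of atoms (for example Cα or full backbone) is left to the caller, and the 1 Å cutoff is taken from the text.

```python
import numpy as np


def kabsch_rmsd(P, Q) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its SVD give the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so the result is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))


def motif_preserved(rmsd: float, cutoff: float = 1.0) -> bool:
    """Hypothetical pass/fail check using the < 1 Å threshold from the text."""
    return rmsd < cutoff


# Hypothetical motif coordinates (in Å) and a rotated, translated copy.
ref = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = ref @ rot_z.T + np.array([5.0, -3.0, 2.0])
print(motif_preserved(kabsch_rmsd(ref, moved)))
```

Because the superposition removes rotation and translation, a rigidly moved copy of the motif scores an RMSD of essentially zero, so only genuine distortion of the functional site counts against a design.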

What does this mean for industry? The cycle time for de novo design of functional proteins is significantly shorter. Developing enzyme active-site scaffolds, metal-binding proteins, and drug-delivery miniproteins no longer requires lengthy structural screening. GPDL makes precise construction "from functional site to entirely new scaffold" a routine operation.

This work was published in the International Journal of Biological Macromolecules (2025), and the code is open source.

Paper: https://doi.org/10.1016/j.ijbiomac.2025.148441
GitHub: https://github.com/sirius777coder/GPDL
Patent: CN117497040A
