AI Model CleaveNet Enables Tailored Design of Protease Substrates

A new deep learning platform may transform how researchers design peptide substrates for proteases, enzymes with key roles in health and disease. Published in Nature Communications, the study presents CleaveNet, an artificial intelligence (AI) pipeline capable of generating synthetic substrates with customized cleavage profiles, offering a scalable and efficient alternative to traditional experimental approaches.
Designing substrates for proteases—used in diagnostics, therapeutics, and biological studies—has long been limited by the vast combinatorial space of possible sequences and the difficulty of achieving selectivity for specific enzymes. The authors focused on matrix metalloproteinases (MMPs), a family of 18 human proteases implicated in cancer metastasis and tissue remodeling, and trained CleaveNet on a large dataset of over 18,000 peptides previously screened using mRNA display.
CleaveNet comprises two core components: a Predictor that assigns cleavage likelihood scores across the MMP panel, and a Generator that creates candidate sequences. The Generator can be run in either unconditional or conditional mode; the conditional mode lets users specify desired cleavage characteristics, such as enzyme selectivity.
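To make the division of labor concrete, the sketch below shows one plausible way a generator and a predictor can be combined: propose candidates, score each against an enzyme panel, and keep the most selective ones. It is a toy illustration only, not CleaveNet's code or API. The functions toy_generator, toy_predictor, selectivity, and design are hypothetical names, the scoring rule is arbitrary and carries no biological meaning, and the loop filters candidates after generation rather than conditioning the generator itself, so it does not reproduce the conditional mode described above.

```python
import random
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
Scores = Dict[str, float]


def toy_generator(rng: random.Random, length: int = 10) -> str:
    """Hypothetical stand-in for a generative model: a uniformly random peptide."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def toy_predictor(peptide: str) -> Scores:
    """Hypothetical stand-in for a trained predictor: one score per enzyme.

    The rule below is arbitrary (residue counts) and is NOT real biology.
    """
    return {
        "MMP13": float(peptide.count("L") + peptide.count("M")),
        "MMP2": float(peptide.count("G")),
    }


def selectivity(scores: Scores, target: str) -> float:
    """Target score minus the best off-target score (an assumed, simple metric)."""
    off_target = [value for enzyme, value in scores.items() if enzyme != target]
    return scores[target] - max(off_target)


def design(target: str, n_candidates: int, top_k: int, seed: int = 0) -> List[str]:
    """Generate candidates, score them, and return the top_k most selective."""
    rng = random.Random(seed)
    candidates = {toy_generator(rng) for _ in range(n_candidates)}
    ranked = sorted(
        candidates,
        # Ties are broken by the sequence itself so the ranking is deterministic.
        key=lambda p: (selectivity(toy_predictor(p), target), p),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    for peptide in design("MMP13", n_candidates=200, top_k=5):
        print(peptide, toy_predictor(peptide))
```

Swapping the two toy functions for trained models would leave the loop unchanged, which is the appeal of keeping scoring and generation as separate components.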
CleaveNet's predictions were benchmarked against experimental datasets and showed strong accuracy. In vitro testing confirmed that all 24 sequences designed for MMP13, an MMP linked to osteoarthritis and tumor progression, were successfully cleaved by the enzyme. Some designs were cleaved more efficiently than any substrate in the original training set. Moreover, the conditional generation approach yielded sequences with both high efficiency and high selectivity for MMP13, an outcome previously unattainable using experimental methods alone.
The study also revealed previously underappreciated amino acid motifs important for MMP cleavage, such as methionine at position P4, and identified substrates that mirrored phylogenetic relationships among MMPs.
While CleaveNet currently applies only to synthetic substrates and has been validated primarily for MMPs, the authors argue that its open-source design could extend to other enzyme families. They envision the tool accelerating protease research and enabling more precise engineering of activity-based diagnostics and therapeutics.