CasPro-ESM2

A Tool for Identifying Cas Proteins Based on the ESM-2 Protein Language Model

Cas proteins are the core components of the CRISPR-Cas system and play critical roles in defending the host against invading foreign DNA and RNA. Identifying Cas proteins gives deeper insight into the immune mechanisms of the CRISPR-Cas system and helps uncover how individual Cas proteins function.

In this study, we developed a computational tool named CasPro-ESM2, which combines the ESM-2 protein language model with evolutionary information from protein sequences to identify previously unknown Cas proteins. Experimental results show that CasPro-ESM2 outperforms existing models in Cas protein identification, achieving the highest accuracy (ACC), specificity (SP), sensitivity (SN), and Matthews correlation coefficient (MCC) on two different datasets. We have also deployed the tool on a web server so that users can access it directly.
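
For readers unfamiliar with the four evaluation metrics named above, the following minimal Python sketch shows how they are conventionally computed for a binary classifier from the counts of true and false positives and negatives. It is an illustration under stated assumptions, not the evaluation code used for CasPro-ESM2: the function name `binary_metrics`, the label encoding (1 = Cas protein, 0 = non-Cas protein), and the toy labels are all hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute ACC, SN, SP and MCC for binary labels (1 = Cas, 0 = non-Cas).

    Hypothetical helper for illustration; label encoding is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    acc = (tp + tn) / (tp + tn + fp + fn)          # accuracy
    sn = tp / (tp + fn) if (tp + fn) else 0.0      # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0      # specificity

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Toy example with made-up labels, not real data.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
```

MCC is often reported alongside ACC because it accounts for all four confusion-matrix cells and stays informative when the positive and negative classes are imbalanced.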

CasPro-ESM2 Model