BioAI Lab

Welcome to BioAi-Lab

Introduction

Tuberculosis has plagued humankind since ancient times, and the struggle against it has not yet ended. Mycobacterium tuberculosis, the main causative agent of tuberculosis, infects nearly one-third of the world's population. In recent years, the gradual emergence of extensively drug-resistant, totally drug-resistant, and multidrug-resistant strains has made the situation more severe, and new treatments are urgently needed. Peptide drugs offer a new direction for tuberculosis treatment: their low immunogenicity, selective affinity for negatively charged bacterial cell membranes, and diverse mechanisms of action make them promising candidates. Accurate prediction of anti-tuberculosis peptides is therefore crucial.

This paper proposes an anti-tuberculosis peptide prediction method based on hybrid features and stacked ensemble learning. First, Random Forest (RF) and Extremely Randomized Trees (ERT) are selected from five candidate classifiers as the first-level learners of the stacked ensemble. Next, the five best-performing feature encoding methods are selected from seven candidates and combined into a hybrid feature vector. Recursive feature elimination with a decision tree (DT-RFE) then refines this vector, and the resulting optimal feature subset serves as the input to the stacked ensemble. Logistic Regression (LR) is used as the second-level learner to build the final stacked ensemble model, Hyb_SEnc. Hyb_SEnc reached prediction accuracies of 94.68% and 95.74% on the independent test sets of AntiTb_MD and AntiTb_RD, respectively. Because the model was trained separately on the two data sets (AntiTb_MD and AntiTb_RD), two Hyb_SEnc models are available, and users can choose either one for classification.
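The pipeline described above (DT-RFE feature selection, RF and ERT as first-level learners, LR as the second-level learner) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for the hybrid feature vectors, and the function name `build_stacked_model`, the number of selected features, the elimination step size, the tree counts, and the number of cross-validation folds are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


def build_stacked_model(n_features_to_select=20, random_state=0):
    """Return a DT-RFE + stacked-ensemble pipeline (all settings are assumptions)."""
    # DT-RFE: recursively drop the features a decision tree ranks lowest.
    selector = RFE(
        estimator=DecisionTreeClassifier(random_state=random_state),
        n_features_to_select=n_features_to_select,
        step=5,  # features removed per elimination round (assumed value)
    )
    # First level: RF and ERT. Second level: LR trained on their
    # cross-validated predictions.
    stack = StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100, random_state=random_state)),
            ("ert", ExtraTreesClassifier(n_estimators=100, random_state=random_state)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    return Pipeline([("dt_rfe", selector), ("stack", stack)])


# Synthetic stand-in for hybrid feature vectors; real inputs would be the
# concatenated peptide encodings, which are not reproduced here.
X, y = make_classification(
    n_samples=300, n_features=60, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = build_stacked_model()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
```

Placing the selector and the stacked classifier in a single `Pipeline` means feature selection is fitted only on the training split, so the held-out score is not influenced by the test data. The accuracy printed here reflects the synthetic data only and says nothing about performance on AntiTb_MD or AntiTb_RD.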



Framework