Prediction of protein subcellular localization of human protein using j48, random forest and best first tree techniques
DOI:
https://doi.org/10.46947/joaasr112201791
Keywords:
subcellular localization, classification tree, human proteins, physicochemical properties.
Abstract
Assigning functions to uncharacterized proteins is a significant achievement in proteomics, and a protein's subcellular localization serves as a key annotation. Many prediction techniques have been developed that emphasize a single biological aspect or predict only a subset of all localizations. Modeling protein localization, which is considered pivotal, is carried out here by gathering all the biologically relevant information, addressing the need to improve prediction accuracy. Proteins play an essential role in a wide range of bioprocesses, such as catalysis of biochemical reactions and signal transduction, and are required for cellular processes. The functions they carry out can be analyzed by predicting their cellular locations. The localization of a protein can be examined by considering features of its primary sequence, such as the physicochemical and amino acid composition of the complete protein. The physicochemical composition of the C-terminal and N-terminal regions and other physicochemical properties of the primary sequence also contribute to subcellular localization. In this paper, three computational techniques, J48, best-first decision tree (BF Tree) and random forest, are employed for localization prediction and show significant performance compared with several other techniques. The classifiers were trained on the latest integrated database together with obsolete data, and the three techniques were used to study subcellular localization. Based on the features discussed, and in comparison with other techniques, the study documents improved prediction accuracy, with reported values of 87.711% for J48, 81.67% for random forest and 88.125% for BF Tree.
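The sequence features named in the abstract can be illustrated with a minimal sketch. It computes amino acid composition for the whole sequence and for the N-terminal and C-terminal regions; it does not cover the other physicochemical properties, and it is not the authors' implementation. The function names, the terminal window of 20 residues and the example sequence are all assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in seq.

    The result has 20 values in AMINO_ACIDS order. Non-standard
    symbols are ignored; an empty input yields all zeros.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def sequence_features(seq, window=20):
    """Concatenate whole-sequence, N-terminal and C-terminal compositions.

    window is a hypothetical terminal-region length (assumed here, not
    taken from the paper). Sequences shorter than window are used whole.
    """
    n_terminal = seq[:window]
    c_terminal = seq[-window:]
    return composition(seq) + composition(n_terminal) + composition(c_terminal)


# Hypothetical example sequence, used only to show the call.
features = sequence_features("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
print(len(features))  # 60 values: 3 regions x 20 amino acids
```

A feature vector of this kind, one per protein, is the sort of input a tree-based classifier such as J48, random forest or BF Tree would be trained on, with the known localization as the class label.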
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.