TY - GEN
T1 - Computational models for identifying promiscuous HLA-B7 binders based on information theory and support vector machine
AU - Zhang, Guang Lan
AU - Tong, Joo Chuan
AU - Zhang, Zong Hong
AU - Zheng, Yun
AU - August, J. Thomas
AU - Kwoh, Chee Keong
AU - Brusic, Vladimir
N1 - Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2006
Y1 - 2006
N2 - Computational vaccinology is a developing discipline. To become a standard component in vaccine development, it requires accurate and broadly applicable models of wet-lab experiments. We developed prediction models based on a novel data representation of peptide/MHC interaction and support vector machines (SVM) for predicting peptides that promiscuously bind to multiple Human Leukocyte Antigen (HLA) alleles belonging to the HLA-B7 supertype. 10-fold cross-validation results showed that the area under the receiver operating characteristic curve (Aroc) of the SVM models is above 0.90. Blind testing results showed that the average Aroc of the SVM models is 0.84. A learning approach based on information theory, termed the Information Learning Approach, was used for feature selection. Several amino acid positions with high information content were identified in the input 9mer peptides and HLA alleles and were used as input features to the SVM. They are positions 1, 2, 4, 5, 7, 8, and 9 in the 9mer peptides and positions 45 and 97 in the HLA-B7 molecules. Prediction accuracy improved after feature selection. These positions cover the anchor positions of HLA-B7 alleles, which play important biological roles in the successful binding of relevant peptides.
AB - Computational vaccinology is a developing discipline. To become a standard component in vaccine development, it requires accurate and broadly applicable models of wet-lab experiments. We developed prediction models based on a novel data representation of peptide/MHC interaction and support vector machines (SVM) for predicting peptides that promiscuously bind to multiple Human Leukocyte Antigen (HLA) alleles belonging to the HLA-B7 supertype. 10-fold cross-validation results showed that the area under the receiver operating characteristic curve (Aroc) of the SVM models is above 0.90. Blind testing results showed that the average Aroc of the SVM models is 0.84. A learning approach based on information theory, termed the Information Learning Approach, was used for feature selection. Several amino acid positions with high information content were identified in the input 9mer peptides and HLA alleles and were used as input features to the SVM. They are positions 1, 2, 4, 5, 7, 8, and 9 in the 9mer peptides and positions 45 and 97 in the HLA-B7 molecules. Prediction accuracy improved after feature selection. These positions cover the anchor positions of HLA-B7 alleles, which play important biological roles in the successful binding of relevant peptides.
KW - Binding peptide
KW - HLA-B7
KW - Information theory
KW - Support vector machine
KW - Vaccinology
UR - http://www.scopus.com/inward/record.url?scp=46249107122&partnerID=8YFLogxK
U2 - 10.1109/ICBPE.2006.348607
DO - 10.1109/ICBPE.2006.348607
M3 - Conference contribution
AN - SCOPUS:46249107122
SN - 8190426249
SN - 9788190426244
T3 - ICBPE 2006 - Proceedings of the 2006 International Conference on Biomedical and Pharmaceutical Engineering
SP - 319
EP - 323
BT - ICBPE 2006 - Proceedings of the 2006 International Conference on Biomedical and Pharmaceutical Engineering
T2 - ICBPE 2006 - 2006 International Conference on Biomedical and Pharmaceutical Engineering
Y2 - 11 December 2006 through 14 December 2006
ER -