TY - JOUR
T1 - Prediction of supertype-specific HLA class I binding peptides using support vector machines
AU - Zhang, Guang Lan
AU - Bozic, Ivana
AU - Kwoh, Chee Keong
AU - August, J. Thomas
AU - Brusic, Vladimir
N1 - Funding Information: This project has been funded in part (GLZ, JTA, and VB) with US federal funds from the NIAID, NIH, Department of Health and Human Services, under Grant No. 5 U19 AI56541 and Contract No. HHSN266200400085C.
PY - 2007/3/30
Y1 - 2007/3/30
N2 - Experimental approaches for identifying T-cell epitopes are time-consuming, costly, and not applicable to large-scale screening. Computer modeling methods can help minimize the number of experiments required, enable systematic scanning for candidate major histocompatibility complex (MHC) binding peptides, and thus speed up vaccine development. We developed a prediction system based on a novel data representation of peptide/MHC interaction and support vector machines (SVM) for predicting peptides that promiscuously bind to multiple Human Leukocyte Antigen (HLA, human MHC) alleles belonging to an HLA supertype. Ten-fold cross-validation results showed that the overall performance of the SVM models improved on that of our previously published methods based on hidden Markov models (HMM) and artificial neural networks (ANN), a result also confirmed by blind testing. At a specificity of 0.90, the sensitivity values of the SVM models were 0.90 and 0.92 for the HLA-A2 and -A3 datasets, respectively. The average area under the receiver operating characteristic curve (AROC) of the SVM models in blind testing was 0.89 and 0.92 for the HLA-A2 and -A3 datasets, respectively. The AROC values of the HLA-A2 and -A3 SVM models were 0.94 and 0.95, respectively, validated using a full overlapping study of 9-mer peptides from human papillomavirus type 16 E6 and E7 proteins. In addition, a large-scale experimental dataset was used to validate the HLA-A2 and -A3 SVM models. The SVM prediction models were integrated into a web-based computational system, MULTIPRED1, accessible at antigen.i2r.a-star.edu.sg/multipred1/.
KW - Human Leukocyte Antigen supertype
KW - Promiscuous binding peptide
KW - Support vector machines
KW - T-cell epitope
UR - http://www.scopus.com/inward/record.url?scp=33847710294&partnerID=8YFLogxK
U2 - 10.1016/j.jim.2006.12.011
DO - 10.1016/j.jim.2006.12.011
M3 - Article
C2 - 17303158
AN - SCOPUS:33847710294
SN - 0022-1759
VL - 320
SP - 143
EP - 154
JO - Journal of Immunological Methods
JF - Journal of Immunological Methods
IS - 1-2
ER -