TY - GEN
T1 - Application of Genetic Search in Derivation of Matrix Models of Peptide Binding to MHC Molecules
AU - Brusic, Vladimir
AU - Schönbach, Christian
AU - Takiguchi, Masafumi
AU - Ciesielski, Vic
AU - Harrison, Leonard C.
N1 - Publisher Copyright: Copyright © 1997, AAAI (www.aaai.org). All rights reserved.
PY - 1997
Y1 - 1997
N2 - T cells of the vertebrate immune system recognise peptides bound by major histocompatibility complex (MHC) molecules on the surface of host cells. Peptide binding to MHC molecules is necessary for immune recognition, but only a subset of peptides are capable of binding to a particular MHC molecule. Common amino acid patterns (binding motifs) have been observed in sets of peptides that bind to specific MHC molecules. Recently, matrix models for peptide/MHC interaction have been reported. These encode the rules of peptide/MHC interactions for an individual MHC molecule as a 20 × 9 matrix where the contribution to binding of each amino acid at each position within a 9-mer peptide is quantified. The artificial intelligence techniques of genetic search and machine learning have proved to be very useful in the area of biological sequence analysis. The availability of peptide/MHC binding data can facilitate derivation of binding matrices using machine learning techniques. We performed a simulation study to determine the minimum number of peptide samples required to derive matrices, given the pre-defined accuracy of the matrix model. The matrices were derived using a genetic search. In addition, matrices for peptide binding to the human class I MHC molecules, HLA-B35 and -A24, were derived, validated by independent experimental data and compared to previously-reported matrices. The results indicate that at least 150 peptide samples are required to derive matrices of acceptable accuracy. This result is based on a maximum noise content of 5%, the availability of precise affinity measurements and that acceptable accuracy is determined by an area under the Relative Operating Characteristic curve (Aroc) of >0.8. More than 600 peptide samples are required to derive matrices of excellent accuracy (Aroc>0.9). Finally, we derived a human HLA-B27 binding matrix using a genetic search and 404 experimentally-tested peptides, and estimated its accuracy at Aroc>0.88. The results of this study are expected to be of practical interest to immunologists for efficient identification of peptides as candidates for immunotherapy.
KW - MHC
KW - application specific modeling
KW - classification
KW - genetic search
KW - machine learning
KW - major histocompatibility complex
KW - motif
KW - patterns
KW - peptide binding
UR - http://www.scopus.com/inward/record.url?scp=85166316785&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85166316785
T3 - Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, ISMB 1997
BT - Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology, ISMB 1997
PB - AAAI Press
T2 - 5th International Conference on Intelligent Systems for Molecular Biology, ISMB 1997
Y2 - 21 June 1997 through 25 June 1997
ER -