Dana-Farber repository for machine learning in immunology

Guang Lan Zhang; Hong Huang Lin; Derin B. Keskin; Ellis L. Reinherz; Vladimir Brusic

doi:10.1016/j.jim.2011.07.007

Research output: Journal Publication › Article › peer-review

32 Citations (Scopus)

Abstract

The immune system is characterized by high combinatorial complexity that necessitates the use of specialized computational tools for analysis of immunological data. Machine learning (ML) algorithms are used in combination with classical experimentation for the selection of vaccine targets and in computational simulations that reduce the number of necessary experiments. The development of ML algorithms requires standardized data sets, consistent measurement methods, and uniform scales. To bridge the gap between the immunology community and the ML community, we designed a repository for machine learning in immunology named Dana-Farber Repository for Machine Learning in Immunology (DFRMLI). This repository provides standardized data sets of HLA-binding peptides with all binding affinities mapped onto a common scale. It also provides a list of experimentally validated naturally processed T cell epitopes derived from tumor or virus antigens. The DFRMLI data were preprocessed and ensure consistency, comparability, detailed descriptions, and statistically meaningful sample sizes for peptides that bind to various HLA molecules. The repository is accessible at http://bio.dfci.harvard.edu/DFRMLI/.

Original language	English
Pages (from-to)	18-25
Number of pages	8
Journal	Journal of Immunological Methods
Volume	374
Issue number	1-2
DOIs	https://doi.org/10.1016/j.jim.2011.07.007
Publication status	Published - 30 Nov 2011
Externally published	Yes

Keywords

Data repository
HLA binding
Immune system
Mathematical model
Prediction
T cell epitope

ASJC Scopus subject areas

Immunology and Allergy
Immunology

Cite this

@article{dee3ee66c023484eb22565b05d049e5c,

title = "Dana-Farber repository for machine learning in immunology",

abstract = "The immune system is characterized by high combinatorial complexity that necessitates the use of specialized computational tools for analysis of immunological data. Machine learning (ML) algorithms are used in combination with classical experimentation for the selection of vaccine targets and in computational simulations that reduce the number of necessary experiments. The development of ML algorithms requires standardized data sets, consistent measurement methods, and uniform scales. To bridge the gap between the immunology community and the ML community, we designed a repository for machine learning in immunology named Dana-Farber Repository for Machine Learning in Immunology (DFRMLI). This repository provides standardized data sets of HLA-binding peptides with all binding affinities mapped onto a common scale. It also provides a list of experimentally validated naturally processed T cell epitopes derived from tumor or virus antigens. The DFRMLI data were preprocessed and ensure consistency, comparability, detailed descriptions, and statistically meaningful sample sizes for peptides that bind to various HLA molecules. The repository is accessible at http://bio.dfci.harvard.edu/DFRMLI/.",

keywords = "Data repository, HLA binding, Immune system, Mathematical model, Prediction, T cell epitope",

author = "Zhang, {Guang Lan} and Lin, {Hong Huang} and Keskin, {Derin B.} and Reinherz, {Ellis L.} and Brusic, Vladimir",

note = "Funding Information: This work was supported by NIH grants U19AI57330 and U01AI90043 and a grant from DOD W81XWH-07-1-0080. We are thankful to Dr Songsak Tongchusak for providing a list of T cell epitopes that was included in DFRMLI and Tara C. Mayo for thoughtful review of the manuscript.",

year = "2011",

month = nov,

day = "30",

doi = "10.1016/j.jim.2011.07.007",

language = "English",

volume = "374",

pages = "18--25",

journal = "Journal of Immunological Methods",

issn = "0022-1759",

publisher = "Elsevier B.V.",

number = "1--2",

}

TY - JOUR

T1 - Dana-Farber repository for machine learning in immunology

AU - Zhang, Guang Lan

AU - Lin, Hong Huang

AU - Keskin, Derin B.

AU - Reinherz, Ellis L.

AU - Brusic, Vladimir

N1 - Funding Information: This work was supported by NIH grants U19AI57330 and U01AI90043 and a grant from DOD W81XWH-07-1-0080. We are thankful to Dr Songsak Tongchusak for providing a list of T cell epitopes that was included in DFRMLI and Tara C. Mayo for thoughtful review of the manuscript.

PY - 2011/11/30

Y1 - 2011/11/30

N2 - The immune system is characterized by high combinatorial complexity that necessitates the use of specialized computational tools for analysis of immunological data. Machine learning (ML) algorithms are used in combination with classical experimentation for the selection of vaccine targets and in computational simulations that reduce the number of necessary experiments. The development of ML algorithms requires standardized data sets, consistent measurement methods, and uniform scales. To bridge the gap between the immunology community and the ML community, we designed a repository for machine learning in immunology named Dana-Farber Repository for Machine Learning in Immunology (DFRMLI). This repository provides standardized data sets of HLA-binding peptides with all binding affinities mapped onto a common scale. It also provides a list of experimentally validated naturally processed T cell epitopes derived from tumor or virus antigens. The DFRMLI data were preprocessed and ensure consistency, comparability, detailed descriptions, and statistically meaningful sample sizes for peptides that bind to various HLA molecules. The repository is accessible at http://bio.dfci.harvard.edu/DFRMLI/.

AB - The immune system is characterized by high combinatorial complexity that necessitates the use of specialized computational tools for analysis of immunological data. Machine learning (ML) algorithms are used in combination with classical experimentation for the selection of vaccine targets and in computational simulations that reduce the number of necessary experiments. The development of ML algorithms requires standardized data sets, consistent measurement methods, and uniform scales. To bridge the gap between the immunology community and the ML community, we designed a repository for machine learning in immunology named Dana-Farber Repository for Machine Learning in Immunology (DFRMLI). This repository provides standardized data sets of HLA-binding peptides with all binding affinities mapped onto a common scale. It also provides a list of experimentally validated naturally processed T cell epitopes derived from tumor or virus antigens. The DFRMLI data were preprocessed and ensure consistency, comparability, detailed descriptions, and statistically meaningful sample sizes for peptides that bind to various HLA molecules. The repository is accessible at http://bio.dfci.harvard.edu/DFRMLI/.

KW - Data repository

KW - HLA binding

KW - Immune system

KW - Mathematical model

KW - Prediction

KW - T cell epitope

UR - http://www.scopus.com/inward/record.url?scp=81255136087&partnerID=8YFLogxK

U2 - 10.1016/j.jim.2011.07.007

DO - 10.1016/j.jim.2011.07.007

M3 - Article

C2 - 21782820

AN - SCOPUS:81255136087

SN - 0022-1759

VL - 374

SP - 18

EP - 25

JO - Journal of Immunological Methods

JF - Journal of Immunological Methods

IS - 1-2

ER -
