T Cell Epitopes - MHC Class I Binding Prediction Tools Description

lindy · October 16, 2024, 6:54pm

The MHC class I binding prediction tools can be found at MHC-I Binding, and a tutorial can be found at MHC-I Help.

A RESTful interface is also available for MHC class I and class II prediction tools. This allows users to perform predictions on the IEDB server in batch mode without having to install any software on their own systems. Additionally, users are always assured that they are using the latest version of the tools.
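As a rough illustration of what a batch request to the RESTful interface might look like, here is a minimal Python sketch. The endpoint URL and the form field names (method, sequence_text, allele, length) are assumptions based on the publicly documented IEDB tools API and should be checked against the current API documentation; the helper names are made up for this example.

```python
from urllib import parse, request

# Assumed endpoint and field names; verify against the IEDB API documentation.
IEDB_MHCI_URL = "http://tools-cluster-interface.iedb.org/tools_api/mhci/"


def build_mhci_form(sequence, alleles, lengths, method="recommended"):
    """Encode one batch request as URL-encoded form data (bytes)."""
    if len(alleles) != len(lengths):
        raise ValueError("expected one peptide length per allele")
    fields = {
        "method": method,
        "sequence_text": sequence,
        "allele": ",".join(alleles),
        "length": ",".join(str(n) for n in lengths),
    }
    return parse.urlencode(fields).encode("ascii")


def submit_mhci(form_data, url=IEDB_MHCI_URL, timeout=120):
    """POST the form and return the server's text response."""
    req = request.Request(url, data=form_data, method="POST")
    with request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Build (but do not send) a request for one sequence and two allele/length pairs.
form = build_mhci_form("SLYNTVATLYCVHQRIDV", ["HLA-A*02:01", "HLA-B*07:02"], [9, 10])
print(form.decode("ascii"))
```

Passing the encoded form to submit_mhci would send the request to the server; that call is left out here so the sketch runs without network access.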

Peptide Binding to MHC Class I Molecules

Users can select from nine different methods for predicting class I epitopes: ANN, SMM, SMMPMBEC, Comblib_Sidney2008, Consensus, NetMHCpan (which includes NetMHCpan EL and NetMHCpan BA), PickPocket, NetMHCcons, and NetMHCstabpan, each described further below. A check box can be selected to show only frequently occurring alleles, that is, alleles that occur in at least 1% of the human population (an allele frequency of 1% or higher). Un-checking the box allows selection of all alleles and corresponding peptide lengths for a particular species. Users can also upload an allele file instead of entering alleles on the page one at a time.
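The 1% cutoff behind the frequently occurring alleles check box can be pictured as a simple filter. The sketch below is only an illustration: the function name is hypothetical and the frequencies are made-up values, not real population data.

```python
def frequent_alleles(frequencies, threshold=0.01):
    """Return alleles whose frequency is at least `threshold`, sorted by name."""
    return sorted(a for a, f in frequencies.items() if f >= threshold)


# Hypothetical frequencies for illustration only; not real population data.
example = {
    "HLA-A*02:01": 0.25,
    "HLA-A*01:01": 0.16,
    "HLA-A*02:50": 0.0003,
    "HLA-B*07:02": 0.01,
}
print(frequent_alleles(example))
```

Setting threshold to 0 corresponds to un-checking the box, since every allele then passes the filter.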

NetMHCpan, NetMHCstabpan, PickPocket, and NetMHCcons were developed in collaboration with the DTU Department of Health Technology and are also hosted here.

Artificial Neural Network

Artificial neural networks (ANNs) are computer algorithms modeled after the brain. They consist of many simple processing units that are wired together in a communication network. Each unit is a simplified model of a neuron that sends off a new signal if it receives a sufficiently strong input signal from the other units to which it is connected. The strength of these connections can be varied so that the network produces a desired pattern of node signal activity, which is learned from a set of input training data. The training data in this case are peptide sequences with quantitative affinities for a specific MHC molecule.

Many different implementations of artificial neural networks exist. The one utilized here is described for HLA-A2 binding predictions by Nielsen et al. (Protein Science, 2003, PMID 12717023) and has been applied to a number of different alleles (NetMHC 4.0 - DTU Health Tech - Bioinformatic Services).
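To make the idea of a single processing unit concrete, here is a minimal Python sketch of one sigmoid unit and one training update. It is not the NetMHC architecture: the one-hot encoding, the single unit, the learning rate, and the target value are all assumptions chosen for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Encode a peptide as a flat 0/1 vector, 20 entries per residue."""
    vec = []
    for aa in peptide:
        row = [0.0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(aa)] = 1.0
        vec.extend(row)
    return vec


def unit_output(inputs, weights, bias):
    """One unit: weighted sum of its inputs passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def train_step(inputs, target, weights, bias, rate=0.5):
    """One gradient-descent update on squared error for a single unit."""
    out = unit_output(inputs, weights, bias)
    grad = (out - target) * out * (1.0 - out)  # d(error)/d(weighted sum)
    new_weights = [w - rate * grad * x for w, x in zip(weights, inputs)]
    return new_weights, bias - rate * grad


# Hypothetical training example: a 9-mer with a rescaled affinity target of 0.9.
x = one_hot("SLYNTVATL")
w, b = [0.0] * len(x), 0.0
for _ in range(200):
    w, b = train_step(x, 0.9, w, b)
print(unit_output(x, w, b))
```

A real network stacks many such units in layers and trains on thousands of measured peptides; the update rule above only shows how connection strengths move toward the training signal.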

Stabilized Matrix Method (SMM)

The Stabilized Matrix Method (SMM) described by Peters and Sette (BMC Bioinformatics, 2005, PMID 15927070) can be applied to calculate matrices from quantitative affinity data of peptides binding to MHC molecules. The advantage of this method is that it suppresses the noise present in the training data, caused by the inevitable experimental error as well as the limited number of data points.
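The general idea of fitting a scoring matrix while damping noise can be sketched as follows. This is not the published SMM procedure: it assumes a plain L2 penalty (lam) fitted by gradient descent as a stand-in for the method's regularization, and the function names and toy data are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def predict(peptide, matrix, offset):
    """Matrix prediction: offset plus one entry per (position, residue)."""
    return offset + sum(matrix[i][aa] for i, aa in enumerate(peptide))


def fit_matrix(data, length, lam=0.1, rate=0.05, epochs=500):
    """Fit matrix entries by gradient descent on squared error plus an L2 penalty."""
    matrix = [{aa: 0.0 for aa in AMINO_ACIDS} for _ in range(length)]
    offset = sum(y for _, y in data) / len(data)
    for _ in range(epochs):
        for peptide, y in data:
            err = predict(peptide, matrix, offset) - y
            for i, aa in enumerate(peptide):
                # The lam term pulls each entry toward zero, damping noise.
                matrix[i][aa] -= rate * (err + lam * matrix[i][aa])
    return matrix, offset


def squared_norm(matrix):
    """Sum of squared matrix entries, a measure of how far the fit moved from zero."""
    return sum(v * v for row in matrix for v in row.values())


# Toy 3-mer data with made-up affinity values, for illustration only.
data = [("AAA", 1.0), ("AAC", 3.0), ("CAA", 2.0)]
for lam in (0.0, 1.0):
    m, off = fit_matrix(data, 3, lam=lam)
    print(lam, squared_norm(m), predict("AAC", m, off))
```

With lam set to 0 the matrix reproduces the training values closely; a larger lam gives smaller entries, trading some fit for less sensitivity to noisy or sparse measurements.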

Stabilized Matrix Method with a Peptide:MHC Binding Energy Covariance matrix (SMMPMBEC)

SMMPMBEC is an improved version of SMM. Unlike SMM, it addresses the sparse peptide sequence coverage often found in binding data sets by using a Peptide:MHC Binding Energy Covariance (PMBEC) matrix. The PMBEC matrix was derived from experimentally determined binding affinity measurements using combinatorial peptide libraries. SMMPMBEC is described in Kim et al. (BMC Bioinformatics, 2009, PMID 19948066).

Scoring matrices derived from combinatorial peptide libraries (Comblib_Sidney2008)

Comblib_Sidney2008 refers to a set of predictors (i.e. scoring matrices) that were derived from binding affinity measurements of combinatorial peptide libraries against a panel of MHC alleles. This work is described in Sidney et al. (Immunome Res., 2008, PMID 18221540). This class of predictors is unique in that the average binding energy contribution of a given residue at a given position is measured directly, so limited peptide sequence coverage is not a concern.

Consensus

The Consensus predictor was motivated by the idea that combining the predictions of multiple predictors into a "consensus" may perform better than any individual predictor. For MHC-I, an early implementation is described in Moutaftsi M et al. (Nat Biotech, 2006, PMID 16767078). The methods used for Consensus are ANN, SMM, and Comblib_Sidney2008. The Consensus method uses as many of these three component methods as possible, depending on their applicability for the chosen allele and length.
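The "use as many component methods as possible" behavior can be sketched as below. The description above does not say how the component results are combined, so taking the median of per-method percentile ranks is an assumption here, as are the dictionary keys and function name.

```python
from statistics import median

COMPONENTS = ("ann", "smm", "comblib_sidney2008")


def consensus_rank(ranks):
    """Median of the percentile ranks from whichever component methods apply.

    `ranks` maps a method name to its percentile rank, or to None when the
    method is not available for the chosen allele and length.
    """
    available = [ranks[m] for m in COMPONENTS if ranks.get(m) is not None]
    if not available:
        raise ValueError("no component method covers this allele and length")
    return median(available)


# All three methods available, then only two (hypothetical rank values).
print(consensus_rank({"ann": 0.5, "smm": 2.0, "comblib_sidney2008": 1.0}))
print(consensus_rank({"ann": 0.5, "smm": 2.0, "comblib_sidney2008": None}))
```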

NetMHCpan

NetMHCpan, which predicts binding of peptides to an MHC class I molecule using artificial neural networks (ANNs), is the default prediction method and is the one recommended by the IEDB, based on the availability of predictors and their observed predictive performance for a given allele. Currently, this is NetMHCpan 4.1 EL. It predicts binding for over 1,650 alleles, including HLA-A, B, C, E, and G; non-human primates; mouse; pig; and user-supplied MHC sequences. Predictions can be made for peptide sequences of 8 to 11 residues in length. The method has been trained on over 110,000 peptide/MHC interactions. A paper describing the NetMHCpan method was published by Nielsen et al. in Nucleic Acids Res, July 2020 (PMID 32406916).
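When a full protein sequence is the input, the 8 to 11 residue range means the sequence is effectively scanned as overlapping peptides of each length. The sketch below shows that enumeration; the function name and the 1-based start positions are assumptions for illustration, not part of the tool.

```python
def peptide_windows(sequence, min_len=8, max_len=11):
    """Yield (start, peptide) for every window of min_len..max_len residues.

    `start` is the 1-based position of the peptide's first residue.
    """
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start + 1, sequence[start:start + length]


# Example with an 18-residue sequence.
for start, peptide in peptide_windows("SLYNTVATLYCVHQRIDV", min_len=9, max_len=9):
    print(start, peptide)
```

A sequence shorter than the minimum length simply yields no peptides.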

PickPocket

The PickPocket server predicts binding of peptides to any known MHC molecule using position-specific weight matrices. The method is trained on more than 150,000 quantitative binding measurements covering more than 150 different MHC molecules. Predictions can be made for HLA-A, B, C, E, and G alleles, as well as for non-human primates, mouse, cattle, and pig. Further, the user can upload full-length MHC protein sequences and have the server predict MHC-restricted peptides from any given protein of interest. This method is described by Zhang et al. in Bioinformatics, May 2009 (PMID 19297351).

NetMHCcons

The NetMHCcons 1.1 server predicts binding of peptides to any known MHC class I molecule. This is a consensus method for MHC class I predictions integrating NetMHC, NetMHCpan, and PickPocket to give the most accurate predictions. The server also allows users to use each of these methods separately. It is described by Karosiene et al. in Immunogenetics, March 2012 (PMID 22009319).

NetMHCstabpan

NetMHCstabpan predicts binding stability of peptides to any known MHC molecule using artificial neural networks (ANNs). The method is trained on more than 25,000 quantitative stability measurements covering 75 different HLA molecules. The user can upload full-length MHC protein sequences and have the server predict MHC-restricted peptides from any given protein of interest. The method is described by Rasmussen et al. in J Immunol, August 2016 (PMID 27402703).