T Cell Epitopes - MHC Class II Binding Prediction Tools Description

lindy · October 16, 2024, 7:07pm

The MHC class II binding prediction tools can be found at MHC-II Binding, and a tutorial can be found at MHC-II Help.

A RESTful interface is also available for MHC class I and class II prediction tools. This allows users to perform predictions on the IEDB server in batch mode without having to install any software on their own systems. Additionally, users are always assured that they are using the latest version of the tools.
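As a rough illustration of batch use, the sketch below builds a POST request for a class II prediction using only the Python standard library. The endpoint URL, the form field names (`method`, `sequence_text`, `allele`) and the `nn_align` method label are assumptions and should be checked against the current IEDB API documentation before use. The helper names `build_request` and `run_prediction` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint; confirm against the current IEDB API documentation.
API_URL = "http://tools-cluster-interface.iedb.org/tools_api/mhcii/"


def build_request(sequence, allele, method="nn_align"):
    """Build (but do not send) a POST request for a class II prediction."""
    # The field names below are assumptions, not confirmed by this post.
    fields = {"method": method, "sequence_text": sequence, "allele": allele}
    body = urlencode(fields).encode("ascii")
    return Request(API_URL, data=body, method="POST")


def run_prediction(request, timeout=60):
    """Send the request and return the raw response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request needs no network access; sending it does.
req = build_request("SLYNTVATLYCVHQRIDV", "HLA-DRB1*01:01")
```

Calling `run_prediction(req)` would submit the job to the server. Keeping request construction separate from sending makes it easy to inspect the payload first.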

Peptide Binding to MHC Class II Molecules

Users can select from seven different methods for predicting class II epitopes: IEDB recommended, SMM-align, Sturniolo, Combinatorial Library, Consensus, NN-align, and NetMHCIIpan, which includes NetMHCIIpan EL and NetMHCIIpan BA. By default, NetMHCIIpan is selected. However, not all methods can currently make predictions for all alleles, so only the available alleles are displayed. The methods are described further below.

SMM-align

The MHC class II binding groove is open at both ends, so correctly aligning a peptide within the groove is a crucial part of identifying the core of an MHC class II binding motif. The stabilization matrix alignment method, SMM-align, directly predicts quantitative peptide:MHC binding affinity values. The method uses amino terminal peptide flanking residues (PFR) to get a consistent gain in predictive performance by favoring binding registers with a minimum PFR length of two amino acids. The method has been trained and evaluated on a data set that covers the nine suggested HLA-DR supertypes and three mouse H2-IA alleles. The method is described by Nielsen et al. (BMC Bioinformatics, 2007) here:
17608956.pdf (796.9 KB).
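The idea of a binding register can be sketched as follows: slide a 9-residue core window along the peptide and record the flanking residues on either side. This is a minimal illustration, not the SMM-align implementation. The function names are hypothetical, the 9-mer core length is an assumption, and the hard filter on the amino terminal flank is a simplification of "favoring" such registers.

```python
CORE_LENGTH = 9  # assumed core length for class II binding registers
MIN_N_PFR = 2    # minimum amino terminal flank length mentioned in the text


def binding_registers(peptide, core_length=CORE_LENGTH):
    """Yield (offset, n_flank, core, c_flank) for every possible core position."""
    for offset in range(len(peptide) - core_length + 1):
        yield (
            offset,
            peptide[:offset],                      # amino terminal PFR
            peptide[offset:offset + core_length],  # candidate binding core
            peptide[offset + core_length:],        # carboxy terminal PFR
        )


def favoured_registers(peptide, min_n_pfr=MIN_N_PFR):
    """Keep registers whose amino terminal flank has at least min_n_pfr residues."""
    return [r for r in binding_registers(peptide) if len(r[1]) >= min_n_pfr]


# Example 13-mer peptide: five possible registers, three with an N-terminal PFR >= 2.
registers = list(binding_registers("PKYVKQNTLKLAT"))
favoured = favoured_registers("PKYVKQNTLKLAT")
```

A real predictor would score every register and weight the favoured ones rather than discard the others. The filter here only makes the register and PFR terminology concrete.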

Tepitope (Sturniolo)

This matrix-based approach is used in the TEPITOPE class II epitope prediction program. It is described in Sturniolo et al. (Nat. Biotechnol., 1999) here: 10385319.pdf (1.0 MB).

Combinatorial Library

The positional scanning combinatorial library approach uses pools of random peptides to systematically measure the contribution to MHC binding of each amino acid at each of the nine positions of the binding peptide. Each pool in the library contains 9-mer peptides with one fixed residue at a single position. With each of the 20 naturally occurring residues represented at each position along the 9-mer backbone, the entire library consists of 180 peptide mixtures (9 positions × 20 residues). Competitive binding assays were carried out to determine the IC50 value for each pool. The IC50 value for each mixture was standardized as a ratio to the geometric mean IC50 value of the entire set of 180 mixtures, and then normalized at each position so that the value for the optimal residue at that position corresponds to 1. For each position, an average (geometric) relative binding affinity (ARB) was calculated, and then the ratio of the ARB for the entire library to the ARB for each position was derived. The final result is a set of 9 by 20 scoring matrices that can be used to predict the binding of novel peptides to MHC molecules.

A paper specifically describing the Combinatorial Library for MHC Class II can be found here:
21092157.pdf (571.1 KB)
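The standardization and per-position normalization described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published procedure. The IC50 values are randomly generated placeholders. The ratio is taken as geometric mean divided by IC50, so larger means stronger binding. The position-level ARB ratio step is omitted. Scoring a core by summing log10 matrix values is an assumed convention. All function names are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CORE_LENGTH = 9


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def build_matrix(ic50):
    """ic50[position][residue] is the IC50 of the pool with that residue fixed."""
    overall = geometric_mean([v for row in ic50 for v in row.values()])
    matrix = []
    for row in ic50:
        # Relative binding: above 1 means better than the library-wide average.
        relative = {aa: overall / value for aa, value in row.items()}
        best = max(relative.values())
        # Normalize so the optimal residue at this position has value 1.
        matrix.append({aa: r / best for aa, r in relative.items()})
    return matrix


def score_core(matrix, core):
    """Sum of log10 matrix values over the core; 0 is the best attainable score."""
    return sum(math.log10(matrix[i][aa]) for i, aa in enumerate(core))


# Placeholder IC50 values (nM) for 9 positions x 20 residues = 180 pools.
rng = random.Random(0)
ic50 = [{aa: rng.uniform(10.0, 10000.0) for aa in AMINO_ACIDS}
        for _ in range(CORE_LENGTH)]
matrix = build_matrix(ic50)
best_core = "".join(max(row, key=row.get) for row in matrix)
```

With real assay data in place of the placeholders, `score_core(matrix, core)` gives a relative score for any 9-mer core. Cores closer to 0 contain residues closer to the optimum at each position.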

Consensus

The consensus method was developed by the IEDB team by exploiting features of the three methods described above. The method was updated with the introduction of NN-align, so the revised Consensus method uses NN-align, SMM-align, and the combinatorial peptide scanning library. When the scanning library is not available for an allele, the Sturniolo method is used instead. A paper describing the original method was published by Wang et al. (PLoS Comput Biol, 2008) here: 18389056.pdf (193.9 KB).
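The method selection and fallback described above can be sketched as follows. Combining the component predictions by taking the median of their percentile ranks is an assumption here, since this post does not state the combination rule. The method labels and function names are hypothetical.

```python
from statistics import median


def select_methods(available):
    """Pick the component methods for an allele from the available predictors."""
    # Fall back to Sturniolo when no combinatorial library exists for the allele.
    third = "comblib" if "comblib" in available else "sturniolo"
    return [m for m in ("nn_align", "smm_align", third) if m in available]


def consensus_rank(percentile_ranks, available):
    """Median percentile rank over the selected methods (assumed rule)."""
    chosen = select_methods(available)
    return median(percentile_ranks[m] for m in chosen)


# Illustrative percentile ranks for one peptide (made-up numbers).
ranks = {"nn_align": 4.0, "smm_align": 10.0, "comblib": 1.5, "sturniolo": 30.0}
with_library = consensus_rank(ranks, {"nn_align", "smm_align", "comblib", "sturniolo"})
without_library = consensus_rank(ranks, {"nn_align", "smm_align", "sturniolo"})
```

Using ranks rather than raw scores lets methods with different output scales be combined, and the median limits the influence of a single outlying predictor.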

NN-align

NN-align is an artificial neural network-based alignment algorithm for MHC class II peptide binding prediction. It simultaneously identifies the MHC class II binding core and binding affinity. The method is trained using an algorithm that corrects bias in the training data caused by redundancy in binding core representation. Prediction accuracy has been shown to improve significantly when information about the residues flanking the peptide-binding core is taken into account. A 2009 paper in BMC Bioinformatics by Nielsen and Lund describes the method in detail: 19765293.pdf (784.2 KB).

NetMHCIIpan

NetMHCIIpan, which uses artificial neural networks to predict peptide binding for over 500 HLA-DR alleles, is the default prediction method. It is the method recommended by the IEDB, based on the availability of predictors and the observed predictive performance for a given allele. Currently, this is NetMHCIIpan 4.1 EL. A paper describing the NetMHCIIpan method was published by Nielsen et al. in Nucleic Acids Res, July 2020 and can be found here: 32406916.pdf (1.9 MB).

The datasets used in assessing the performance of the SMM-align and Tepitope methods, and in developing the Consensus method as described in Wang et al., can be found on the IEDB Analysis Resource Datasets page. The three datasets in the MHC class II binding prediction dropdown menu can be used for developing algorithms that predict peptides binding to MHC class II molecules and/or activating CD4+ T cells. The first is a comprehensive dataset consisting of more than 10,000 previously unpublished MHC-peptide binding affinities for 16 alleles (peptide_affinity_dataset.zip). The second dataset is a text file of 29 peptide/MHC crystal structures found in the PDB that can be used for binding core predictions (non_redundant_pdb_core_pep_allele.txt). The third dataset contains 664 peptide sequences experimentally tested for CD4+ T-cell responses (LCMV_T_cell_activation.txt).