A comprehensive analysis of the IEDB MHC class-I automated benchmark.
Trevizani, R., Yan, Z., Greenbaum, J. A., Sette, A., Nielsen, M. and Peters, B.
Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, California 92037, USA.
Fiocruz Ceara, Fundacao Oswaldo Cruz, Rua Sao Jose s/n, Precabura, Eusebio/CE, Brazil.
Bioinformatics Core, La Jolla Institute for Immunology, La Jolla, California 92037, USA.
Department of Medicine, University of California San Diego, La Jolla, California 92093, USA.
Department of Health Technology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
Instituto de Investigaciones Biotecnologicas, Universidad Nacional de San Martin, B1650 Buenos Aires, Argentina.
In 2014, the Immune Epitope Database automated benchmark was created to compare the performance of MHC class I binding predictors. However, this comparison is not straightforward because the methods produce different, non-standardized outputs. Additionally, some methods are more restrictive regarding the HLA alleles and epitope sizes for which they predict binding affinities, while others are more comprehensive. To address how these problems affected the ranking of the predictors, we developed an approach to assess the reliability of different metrics. We found that using percentile-ranked results improved the stability of the ranks and allowed the predictors to be reliably ranked despite not being evaluated on the same data. We also found that, given the rate at which new data are incorporated into the benchmark, a new method must wait at least 4 years to be ranked against the pre-existing methods. The best-performing tools, with statistically indistinguishable scores in this benchmark, were NetMHCcons, NetMHCpan4.0, ANN3.4, NetMHCpan3.0 and NetMHCpan2.8. The results of this study will be used to improve the evaluation and display of benchmark performance. We strongly encourage anyone working on MHC binding predictions to participate in this benchmark to obtain an unbiased evaluation of their predictors.
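To illustrate why percentile-ranked results make predictors with different output scales comparable, the following is a minimal sketch and not the benchmark's actual implementation. It assumes a percentile rank is the percentage of scores in a background set (for example, predictions for random peptides) that are strictly better than the peptide's score. The function name `percentile_rank`, the `higher_is_better` parameter, and all numbers are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, background_scores, higher_is_better=True):
    """Percentage of background scores strictly better than `score`.

    Lower values indicate a stronger predicted binder. This definition is
    an assumption made for illustration only.
    """
    ordered = sorted(background_scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("background_scores must not be empty")
    if higher_is_better:
        # Background scores strictly greater than `score` are better.
        better = n - bisect_right(ordered, score)
    else:
        # Background scores strictly smaller than `score` are better
        # (e.g. a predicted IC50 in nM).
        better = bisect_left(ordered, score)
    return 100.0 * better / n


# Hypothetical predictor A: unitless score in [0, 1], higher is better.
background_a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
rank_a = percentile_rank(0.85, background_a, higher_is_better=True)

# Hypothetical predictor B: predicted IC50 in nM, lower is better.
background_b = [50, 100, 500, 1000, 5000, 10000, 20000, 30000, 40000, 50000]
rank_b = percentile_rank(75, background_b, higher_is_better=False)

print(rank_a, rank_b)  # both values are now on the same 0-100 scale
```

After conversion, both outputs lie on a shared 0 to 100 scale where smaller means stronger predicted binding, so a rank-based metric can be computed the same way for each method regardless of its native units.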
Briefings in Bioinformatics 23(4): in press (2022)