Jonathan Levine, BA
Memorial Sloan Kettering Cancer Center
Biological and artificial networks routinely make reliable distinctions between similar inputs, and the rules for making these distinctions are learned. In some ways, self/nonself discrimination in the immune system is similar, being both reliable and (partly) learned through thymic selection. In contrast to other examples, we show that the distributions of self- and nonself-peptides are nearly identical but strongly inhomogeneous. Reliable discrimination is possible only because self-peptides are a particular finite sample drawn from this distribution, and T cells can target the “spaces” between these samples. In conventional learning problems, this would constitute memorization or overfitting and lead to disaster. Here, the strong inhomogeneities imply instead that the immune system gains by targeting peptides that are similar to self, with maximum sensitivity for sequences just one or two substitutions away. This model of the structure of the underlying distribution in sequence space predicts, for example, the observed ability of the immune system to respond to mutation-derived cancer neoantigens.
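
The geometric intuition can be illustrated with a toy simulation. This is a minimal sketch under assumptions of our own, not the model used in the work: the "hot spot" mixture distribution, the 9-residue peptide length, the sample sizes, and all function names are hypothetical choices made only for illustration. Self-peptides are a finite sample from a strongly inhomogeneous distribution; nonself-peptides are further draws from the same distribution that are not in the self set. The sketch compares how far such nonself-peptides lie from the nearest self-peptide (in Hamming distance) with how far uniformly random sequences lie.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
LENGTH = 9  # assumed peptide length, chosen for illustration only


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mutate(seq, n_subs, rng):
    """Return seq with n_subs distinct positions replaced by a different residue."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n_subs):
        chars[pos] = rng.choice([c for c in ALPHABET if c != chars[pos]])
    return "".join(chars)


def draw_uniform(rng):
    """Draw a sequence uniformly at random from sequence space."""
    return "".join(rng.choice(ALPHABET) for _ in range(LENGTH))


def draw_inhomogeneous(centers, rng):
    """Toy inhomogeneous distribution (an assumption): pick a random hot spot
    and apply 0, 1, or 2 substitutions to it."""
    return mutate(rng.choice(centers), rng.randint(0, 2), rng)


def nearest_self_distance(peptide, self_set):
    """Hamming distance from a peptide to its closest self-peptide."""
    return min(hamming(peptide, s) for s in self_set)


def simulate(seed=0, n_centers=50, n_self=500, n_probe=200):
    """Return (mean distance to self for nonself drawn from the same
    distribution, mean distance to self for uniformly random sequences)."""
    rng = random.Random(seed)
    centers = [draw_uniform(rng) for _ in range(n_centers)]

    # Self is one particular finite sample from the distribution.
    self_set = {draw_inhomogeneous(centers, rng) for _ in range(n_self)}

    # Nonself comes from the same distribution but falls between the samples.
    nonself = []
    while len(nonself) < n_probe:
        p = draw_inhomogeneous(centers, rng)
        if p not in self_set:
            nonself.append(p)

    uniform = [draw_uniform(rng) for _ in range(n_probe)]

    mean_nonself = sum(nearest_self_distance(p, self_set) for p in nonself) / n_probe
    mean_uniform = sum(nearest_self_distance(p, self_set) for p in uniform) / n_probe
    return mean_nonself, mean_uniform


if __name__ == "__main__":
    mean_nonself, mean_uniform = simulate()
    print(f"mean distance to nearest self, same distribution: {mean_nonself:.2f}")
    print(f"mean distance to nearest self, uniform sequences: {mean_uniform:.2f}")
```

In this toy setting, nonself-peptides drawn from the same inhomogeneous distribution sit only a few substitutions from some self-peptide, whereas uniformly random sequences sit much farther away. That is the qualitative sense in which targeting the neighborhood of self can be advantageous; the quantitative claims in the text rest on the actual peptide distributions, which this sketch does not attempt to reproduce.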
