Roshan Parikh, ScM
Icahn School of Medicine at Mount Sinai
"Underrepresentation of darker skin tones in training data introduces systematic bias in machine learning models, contributing to disparities in diagnostic performance and downstream outcomes. Fairness frameworks and bias assessments are crucial for identifying racial biases in such applications. Using the Fitzpatrick skin type (FST I-VI) scale, we developed a ResNet50-based classifier for benign versus malignant neoplasms with explicit fairness evaluation.

We analyzed 4,320 Fitzpatrick17k images labeled by FST, grouped as light (I-III) and dark (IV-VI) because of limited Type V-VI representation. Train, validation, and test splits were stratified by malignancy and FST prior to augmentation. Test accuracy was 0.78 (n=432). Performance was comparable across groups, with slightly higher metrics in darker skin (n=108) than in lighter skin (n=324). Fairness metrics showed modest group differences: independence 0.0617, separation 0.0847, sufficiency 0.0573, and disparate impact 0.871. Sensitivity was nearly identical between groups (difference=0.006), so the separation gap was driven by false positives rather than missed malignancies.

Compared with prior neoplasm classification studies reporting lower parity, our model demonstrates improved equity across skin tones. These findings support fairness-aware training and evaluation as practical strategies for reducing algorithmic bias in clinical AI."
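
The abstract reports independence, separation, sufficiency, and disparate impact without stating their formulas. The sketch below shows one common way to compute such group-fairness gaps from binary predictions. It is not the authors' implementation: the definitions (independence as the difference in positive-prediction rates, separation as the larger of the TPR and FPR gaps, sufficiency as the larger of the PPV and NPV gaps, disparate impact as the ratio of the lower to the higher positive-prediction rate), the function names, and the toy labels are all assumptions made for illustration.

```python
from collections import defaultdict


def _safe_div(num, den):
    # Return NaN instead of raising when a group has no cases in a cell.
    return num / den if den else float("nan")


def group_rates(y_true, y_pred, groups):
    """Per-group rates for binary labels (1 = malignant, 0 = benign)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        # "t"/"f" = prediction correct or not; "p"/"n" = predicted class.
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[g][key] += 1
    rates = {}
    for g, c in counts.items():
        n = sum(c.values())
        rates[g] = {
            "pos_rate": _safe_div(c["tp"] + c["fp"], n),
            "tpr": _safe_div(c["tp"], c["tp"] + c["fn"]),  # sensitivity
            "fpr": _safe_div(c["fp"], c["fp"] + c["tn"]),
            "ppv": _safe_div(c["tp"], c["tp"] + c["fp"]),
            "npv": _safe_div(c["tn"], c["tn"] + c["fn"]),
        }
    return rates


def fairness_gaps(rates, a, b):
    """Gap-style fairness summaries between groups a and b (assumed definitions)."""
    ra, rb = rates[a], rates[b]

    def gap(name):
        return abs(ra[name] - rb[name])

    lo, hi = sorted([ra["pos_rate"], rb["pos_rate"]])
    return {
        "independence": gap("pos_rate"),
        "separation": max(gap("tpr"), gap("fpr")),
        "sufficiency": max(gap("ppv"), gap("npv")),
        "disparate_impact": _safe_div(lo, hi),
        "tpr_gap": gap("tpr"),
        "fpr_gap": gap("fpr"),
    }


# Toy data, invented for illustration only (not study data).
y_true = [1, 1, 0, 0, 0, 1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
groups = ["light"] * 6 + ["dark"] * 4

rates = group_rates(y_true, y_pred, groups)
gaps = fairness_gaps(rates, "light", "dark")
for name, value in gaps.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed definitions, reporting `tpr_gap` and `fpr_gap` separately makes it possible to see whether a separation gap comes from missed malignancies or from false positives, which is the distinction the abstract draws.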