Big self-supervised models advance medical image classification

S Azizi, B Mustafa, F Ryan, Z Beaver… - Proceedings of the …, 2021 - openaccess.thecvf.com
Abstract
Self-supervised pretraining followed by supervised fine-tuning has seen success in image recognition, especially when labeled examples are scarce, but has received limited attention in medical image analysis. This paper studies the effectiveness of self-supervised learning as a pretraining strategy for medical image classification. We conduct experiments on two distinct tasks, dermatology condition classification from digital camera images and multi-label chest X-ray classification, and demonstrate that self-supervised learning on ImageNet, followed by additional self-supervised learning on unlabeled domain-specific medical images, significantly improves the accuracy of medical image classifiers. We introduce a novel Multi-Instance Contrastive Learning (MICLe) method that uses multiple images of the underlying pathology per patient case, when available, to construct more informative positive pairs for self-supervised learning. Combining our contributions, we achieve an improvement of 6.7% in top-1 accuracy on dermatology classification and an improvement of 1.1% in mean AUC on chest X-ray classification, outperforming strong supervised baselines pretrained on ImageNet. In addition, we show that big self-supervised models are robust to distribution shift and can learn efficiently with a small number of labeled medical images.
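The positive-pair construction described for MICLe can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`make_positive_pair`, `build_pairs`), the `augment` callback, and the dictionary layout of patient cases are all hypothetical, and the fallback to two augmented views of a single image when a case has only one image is an assumption made here to cover the "when available" condition in the abstract.

```python
import random
from typing import Any, Callable, Dict, List, Sequence, Tuple

# Hypothetical signature: an augmentation takes an image and an RNG
# and returns a (possibly transformed) view of that image.
Augment = Callable[[Any, random.Random], Any]


def make_positive_pair(
    case_images: Sequence[Any], augment: Augment, rng: random.Random
) -> Tuple[Any, Any]:
    """Build one positive pair from the images of a single patient case.

    If the case has two or more images, two distinct images are drawn,
    so the pair shows the same underlying pathology from different views.
    If it has only one image, the pair falls back to two augmented views
    of that image (an assumption of this sketch).
    """
    if not case_images:
        raise ValueError("a patient case needs at least one image")
    if len(case_images) >= 2:
        first, second = rng.sample(list(case_images), 2)
    else:
        first = second = case_images[0]
    return augment(first, rng), augment(second, rng)


def build_pairs(
    cases: Dict[str, List[Any]], augment: Augment, seed: int = 0
) -> List[Tuple[str, Any, Any]]:
    """Return one (case_id, view_a, view_b) positive pair per patient case."""
    rng = random.Random(seed)
    pairs = []
    for case_id, images in cases.items():
        view_a, view_b = make_positive_pair(images, augment, rng)
        pairs.append((case_id, view_a, view_b))
    return pairs
```

In a contrastive training loop, each returned pair would be treated as a positive, with views from other cases in the batch serving as negatives; the encoder and loss are left out of this sketch.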