|Abstract||Despite mounting evidence that data drift causes deep learning models to deteriorate over time, the majority of medical imaging research is developed for - and evaluated on - static closed-world environments. There have been exciting advances in the automatic detection and segmentation of diagnostically relevant findings. Yet the few studies that attempt to validate their performance in actual clinics report disappointing results and little utility as perceived by healthcare professionals. This is largely due to the many factors that introduce shifts in medical image data distribution, from changes in acquisition practices to naturally occurring variations in the patient population and disease manifestation. If we truly wish to leverage deep learning technologies to alleviate the workload of clinicians and drive forward the democratization of health care, we must move away from closed-world assumptions and start designing systems for the dynamic open world.
This entails, first, the establishment of reliable quality assurance mechanisms with methods from the fields of uncertainty estimation, out-of-distribution detection, and domain-aware prediction appraisal. Part I of the thesis summarizes my contributions to this area. I first propose two approaches that identify outliers by monitoring a self-supervised objective or by quantifying the distance to training samples in a low-dimensional latent space. I then explore how to maximize the diversity among members of a deep ensemble for improved calibration and robustness, and present a lightweight method that uses domain knowledge to detect low-quality lung lesion segmentation masks.
Of course, detecting failures is only the first step. We ideally want to train models that are reliable in the open world for a large portion of the data. Out-of-distribution generalization and domain adaptation may increase robustness, but only to a certain extent. As time goes on, models can only maintain acceptable performance if they continue learning with newly acquired cases that reflect changes in the data distribution. The goal of continual learning is to adapt to changes in the environment without forgetting previous knowledge. One practical strategy to approach this is expansion, whereby multiple parametrizations of the model are trained and the most appropriate one is selected during inference. In the second part of the thesis, I present two expansion-based methods that do not rely on information regarding when or how the data distribution changes.
Even when appropriate mechanisms are in place to fail safely and accumulate knowledge over time, this will only translate to clinical usage insofar as the regulatory framework allows it. Current regulations in the USA and the European Union only authorize locked systems that do not learn post-deployment. Fortunately, regulatory bodies are noting the need for a modern lifecycle regulatory approach. I review these efforts, along with other practical aspects of developing systems that learn throughout their lifecycle, in the third part of the thesis.
We are finally at a stage where healthcare professionals and regulators are embracing deep learning. The number of commercially available diagnostic radiology systems is also quickly rising. This opens up our chance - and responsibility - to show that these systems can be safe and effective throughout their lifespan.|