The Allure of Simplicity

On Interpretable Machine Learning Models in Healthcare


  • Thomas Grote, Ethics and Philosophy Lab; Cluster of Excellence: Machine Learning—New Perspectives for Science, University of Tübingen, Tübingen, Germany



Explainable AI, Reliabilism, Machine learning, Medicine, Interpretability, Opacity


This paper develops an account of the opacity problem in medical machine learning (ML). Guided by pragmatist assumptions, I argue that opacity in ML models is problematic insofar as it potentially undermines the achievement of two key purposes: ensuring generalizability and optimizing clinician–machine decision-making. Three opacity amelioration strategies are examined: explainable artificial intelligence (XAI), the predominant approach, and two revisionary strategies that challenge it, reliabilism and interpretability by design. Comparing the three strategies, I argue that interpretability by design is the most promising way to overcome opacity in medical ML. Looking beyond the individual opacity amelioration strategies, the paper also contributes to a deeper understanding of the problem space and the solution space regarding opacity in medical ML.







How to Cite

Grote, T. (2023). The Allure of Simplicity: On Interpretable Machine Learning Models in Healthcare. Philosophy of Medicine, 4(1).




Funding data

  • Deutsche Forschungsgemeinschaft
    Grant numbers (BE5601/4-1; Cluster of Excellence “Machine Learning—New Perspectives for Science”, EXC 2064, project number 390727645).
  • Carl-Zeiss-Stiftung
    Project “Certification and Foundations of Safe Machine Learning Systems in Healthcare”
  • Baden-Württemberg Stiftung
    Grant numbers AITE (Artificial Intelligence, Trustworthiness and Explainability)