Bridging the implementation gap of machine learning in healthcare
  1. Martin G Seneviratne¹,
  2. Nigam H Shah¹,
  3. Larry Chu²

  ¹ Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA
  ² Anesthesia, Stanford University, Stanford, California, USA

  Correspondence to Dr Nigam H Shah, 1265 Welch Road, Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, CA 94305, USA; nigam@stanford.edu


Applications of machine learning on clinical data are now attaining levels of performance that match or exceed those of human clinicians.1–3 Fields involving image interpretation—radiology, pathology and dermatology—have led the charge, owing to the power of convolutional neural networks and the availability of standard data formats and large data repositories. We have also seen powerful diagnostic and predictive algorithms built using a range of other data, including electronic health records (EHR), -omics, monitoring signals, insurance claims and patient-generated data.4 The looming extinction of doctors has captured the public imagination, with editorials such as ‘The AI Doctor Will See You Now’.5 The prevailing view among experts is more balanced: that doctors who use artificial intelligence (AI) will replace those who do not.6

Amid such inflated expectations, the elephant in the room is the implementation gap of machine learning in healthcare.7 8 Very few of these algorithms ever make it to the bedside, and even the most technology-literate academic medical centres are not routinely using AI in clinical workflows. A recent systematic review of deep learning applications using EHR data highlighted the need to focus on the last mile of implementation: ‘for direct clinical impact, deployment and automation of deep learning models must be considered’.9 The typical life-cycle of an algorithm remains: train on historical data, publish a good receiver operating characteristic (ROC) curve and then collect dust in the ‘model graveyard’.
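To make that life-cycle concrete, the sketch below trains a classifier on a synthetic ‘historical’ cohort and reports a held-out area under the ROC curve, which is the point at which many published models stop. This is an illustration only: it uses scikit-learn on fabricated data, and every dataset and variable is a hypothetical stand-in for EHR-derived features, not anything described in this article.

```python
# Illustrative sketch of the "train on historical data, report a good AUROC" life-cycle.
# All data are synthetic; the features merely stand in for EHR-derived variables.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic "historical" cohort: 5,000 patients, 20 features, a rare positive outcome.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# Train on the historical data...
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ...publish a good ROC curve...
auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUROC on held-out historical data: {auroc:.2f}")

# ...and here, too often, the life-cycle ends: the model is never
# prospectively validated or wired into a clinical workflow.
```

Note what is absent from this retrospective loop: prospective validation, workflow integration and ongoing monitoring, which is precisely the last mile the implementation gap describes.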

This raises the question: if model performance is so …


Footnotes

  • MGS and NHS are joint first authors.

  • Twitter @martin_sen

  • Contributors MGS, NHS and LC all participated in the drafting of the manuscript. MGS and NHS are joint first authors.

  • Funding MGS was supported by the John Monash Scholarship.

  • Competing interests MGS is presently an employee of DeepMind Health. This paper was drafted prior to employment and represents personal views only.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.
