Table 1

Summary of AI approaches against the COVID-19 pandemic

| Authors | Clinical problem/study goal | Nature of input data | AI approach used | Model performance/results | Effect on outcomes/discussions |
| --- | --- | --- | --- | --- | --- |
| **Outbreak detection and digital contact tracing** | | | | | |
| Edo-Osagie et al 11 | Using Twitter data to deliver signals for syndromic surveillance. | Twitter data covering different seasons (2015–2017). | Supervised algorithms (naive Bayes, decision trees, logistic regression, support vector machines and multilayer perceptron (MLP) neural networks) versus semisupervised learning, which combines information from labelled and unlabelled data. | Semisupervised learning achieved an accuracy of 95.5% with an F1 score of 0.910. Among supervised algorithms, MLP performed best, with an accuracy of 95.5% and an F1 score of 0.93. | Semisupervised classification enabled the authors to use more of the collected Twitter data with only minimal labelling. The approach allowed the authors to use 8000 previously unlabelled tweets before performance began to deteriorate. Thus, the model can analyse data more efficiently. |
| Bogoch II et al 18 | In early 2020, outbreak of pneumonia of unknown aetiology in Wuhan, China. | Travel data generated from IATA to quantify passenger volumes originating from the international airport in Wuhan, China. | Natural language processing; proprietary. | Report of Infectious Disease Vulnerability Index (IDVI) scores for countries receiving significant numbers of travellers from Wuhan. | Countries receiving the largest numbers of passengers from Wuhan, China, appeared to have high IDVI scores. The signal alerted public health authorities to a possible outbreak. |
| Choi et al 21 | A large outbreak of MERS in Korea caused public fear, affecting the economy and civil life. | Public news media and commentaries in Korea from May to July 2015. | Natural language processing with a generative probabilistic model for text mining. A translated and expanded emotion lexicon was used to reflect public emotions (sorrow, anger, fear and hate). | Reports of lethal MERS cases strongly affected public fear. | An ML-based computational method for monitoring the public's emotional response to an outbreak of MERS. The analysis may help governments alleviate unnecessary public fear in future outbreaks. |
| **Forecasting on outbreaks** | | | | | |
| Ribeiro et al 32 | Forecasting cumulative COVID-19 cases in Brazil 3 and 6 days ahead. | Daily information about cases from Brazilian State Health Officers from the initial outbreak to April 2020. | Comparison of the predictive capacity of machine learning (ML) regression and statistical models, including cubist regression, random forest (RF), ridge regression, support vector regression (SVR) and stacking-ensemble learning. | SVR and stacking-ensemble learning achieved better performance than the other models. Models achieved errors in the ranges of 1.02%–5.63% and 0.95%–6.90% for the 3 and 6 days ahead forecasts, respectively. | ML models may help predict cumulative COVID-19 cases in Brazil. The model may not be generalisable to other countries. |
| **Detection of COVID-19 on medical imaging** | | | | | |
| Minaee et al 46 | Transfer learning can help investigators overcome limited dataset size. | Dataset of 5000 CXR from publicly available datasets. Positive COVID-19 (84 cases in training and 100 cases in testing sets). | Application of transfer learning on a subset of CXR used to train four pretrained CNNs (ResNet18, ResNet50, SqueezeNet and DenseNet-121). | In the validated dataset, most networks achieved a sensitivity of 98% (±3%) with a specificity of around 90%. | In the early phase of the pandemic, few COVID-19 images were available to train deep learning models. Transfer learning with pretrained CNNs achieved high sensitivity and specificity in the diagnosis of COVID-19 from CXR. |
| Wehbe et al 47 | Using an AI algorithm for detecting COVID-19 on chest radiographs. | 5853 patients in a dataset from Northwestern Memorial Healthcare System. Training data with RT-PCR-confirmed cases (COVID-19 positive: 1142 cases in training set and 324 cases in testing set). | DeepCOVID-XR is a weighted ensemble of deep neural networks. Preprocessed images are fed into six previously validated CNN architectures. The final binary prediction (positive or negative) was a weighted average of the predictions of these individual CNNs. | In the validated dataset, model accuracy was 82% (AUC, 0.88), as compared with the consensus opinion of five radiologists (accuracy, 81%; AUC, 0.85). | The model can be used as an automated tool to rapidly flag patients with suspicious chest imaging findings for isolation and further testing and to mitigate unnecessary exposure. |
| **Prognostication** | | | | | |
| Yan et al 50 | ML-based model devised to identify the most discriminative biomarkers of patient mortality. | Blood samples and medical records of 485 patients from the region of Wuhan, China, from January to February 2020. | Supervised XGBoost classifier. | Serum LDH, lymphocyte and hs-CRP demonstrated an AUC score of 97.84%±0.37% for training sets and 95.06%±2.21% for validation sets. | A simple triage tool may be used to identify high-risk patients and allocate healthcare resources. Limitations are the retrospective cohort and short observation period. |
| Singh et al 51 | The proprietary predictive model identified subgroups of patients with COVID-19 at high and low risk for adverse outcomes. | 369 patients with COVID-19 from the University of Michigan (Ann Arbor) between March and May 2020 from the ED, outpatient clinics and outside hospital transfers. | The Epic Deterioration Index (EDI). The score ranges from 0 to 100, where higher numbers indicate a higher risk of developing a composite adverse outcome; proprietary. | The model showed fair discrimination (AUC 0.76; 95% CI 0.68 to 0.84) in predicting the probability of hospitalised patients requiring intensive care. | The EDI can be used to identify high-risk patients who may benefit from higher-level care and another limited subset of low-risk patients who may be cared for safely in lower-acuity settings. Limitations are the retrospective cohort and single-institution dataset. |
| Arvind et al 53 | The model provided a quick and accurate method of triaging patients at risk for respiratory failure or ventilator support. | 4087 patients with COVID-19 admitted to hospitals in New York City from February to April 2020. | Supervised ML classification to predict intubation 72 hours from the end of the 24-hour sampling window; RF classifier. | The ML algorithm outperformed the ROX Index, demonstrating an AUC of 0.84 for the model and 0.64 for the ROX Index. | The RF model performed with a similar AUC across all ages except for patients <40 years, owing to the rarity of patients below 40 years of age in the training set. The major limitation of this model is that it requires a 24-hour sampling window to generate a prediction. |
| **Drug and vaccine development** | | | | | |
| Hu et al 58 | SARS-CoV-2 appears to have eight viral proteins which can be used as potential targets for drug repurposing. | Amino acid sequences extracted from NCBI; virus-specific dataset from the GHDDI. | Multitask deep learning model. The model predicts possible binding between commercially available drugs and protein targets. | Abacavir and darunavir showed high binding affinity with multiple proteins of SARS-CoV-2. | Only darunavir has an ongoing clinical trial against COVID-19 infection in China. This article is in preprint format and has not passed peer review. |
| Stebbing et al 62 | ARDS from COVID-19 is characterised by an overexpression of the inflammatory response. AP2-associated protein kinase 1 (AAK-1) inhibitors exhibit both antiviral and anti-inflammatory effects. | There are 378 AAK-1 inhibitors, of which 47 have been approved for medical use and 6 inhibited AAK-1 with high affinity. | Monte Carlo tree search to discover knowledge graphs and identify AAK-1 inhibitors. | Baricitinib is an AAK-1-binding drug which is also a Janus kinase inhibitor, another regulator of endocytosis. | In a randomised controlled trial, a combination of baricitinib and remdesivir was shown to accelerate clinical improvement in patients with COVID-19 receiving non-invasive ventilation or high-flow oxygen. |
| Ong et al 69 | AI helps scientists better understand the proteins involved in SARS-CoV-2 and search for potential targets for vaccine development. | The SARS-CoV-2 sequence was obtained from NCBI. All proteins of six known human coronavirus strains were extracted from UniProt proteomes. | Vaxign-ML is a supervised ML model (eXtreme Gradient Boosting) designed to predict the protegenicity score of all proteins of SARS-CoV-2 isolate Wuhan-Hu-1. | This model identified six proteins, including spike (S) protein and five non-structural proteins. These protein candidates were predicted to be adhesins crucial to viral adherence and host invasion. | Vaxign-ML predicted that S protein had a high protective antigenicity score. Among non-structural proteins, the multidomain nsp3 protein had the second-highest protective antigenicity score. The authors proposed developing a cocktail vaccine strategy. |

AI, artificial intelligence; ARDS, acute respiratory distress syndrome; AUC, area under the curve; CNN, convolutional neural network; CXR, chest X-ray; GHDDI, Global Health Drug Discovery Institute; hs-CRP, high-sensitivity C reactive protein; IATA, International Air Transport Association; LDH, lactate dehydrogenase; MERS, Middle East respiratory syndrome; NCBI, National Center for Biotechnology Information; RT-PCR, real-time polymerase chain reaction.
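
The table describes the DeepCOVID-XR output as a weighted average of the predictions of six individual CNNs, reduced to a binary call (positive or negative). As a purely illustrative aid, that combination step can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' implementation: the function name `weighted_ensemble_predict`, the 0.5 decision threshold, the equal weights and the example probabilities are all hypothetical and do not come from the study.

```python
from typing import Sequence, Tuple


def weighted_ensemble_predict(
    member_probs: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the study
) -> Tuple[float, bool]:
    """Combine per-model probabilities into one binary call by weighted averaging."""
    if len(member_probs) != len(weights):
        raise ValueError("need exactly one weight per member model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean of the member models' predicted probabilities.
    score = sum(p * w for p, w in zip(member_probs, weights)) / total
    # Binary decision: positive when the ensemble score reaches the threshold.
    return score, score >= threshold


# Six made-up probabilities standing in for six CNN outputs on one radiograph.
example_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
equal_weights = [1.0] * 6  # assumed equal here; a real ensemble would tune these
score, is_positive = weighted_ensemble_predict(example_probs, equal_weights)
```

With equal weights the score is simply the mean of the six probabilities. Unequal weights let better-performing member models contribute more to the final call, which is the usual motivation for weighting an ensemble.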