Table 1

Summary of the 63 included studies on artificial intelligence and opioid use that were published in academic journals, ordered alphabetically by the four domains to which AI was applied, including surveillance and monitoring, risk prediction, pain management and patient support

Study ID (country)(ref)	Study design	Study population and data source	Sample (n=)	AI technology	Application	Outcomes	Stage of development
Risk prediction
Ahn et al 2016 (Bulgaria).38	Cross-sectional	Amphetamine or heroin-dependent or polysubstance-dependent adults. Data collected from two 4-hour study sessions using a battery of self-reported and administered assessments.	222	Machine learning (elastic net)	To identify substance-specific behavioural markers for heroin and amphetamine dependence.	The machine-learning approach revealed substance-specific multivariate profiles that classified heroin and amphetamine dependence. Psychopathy was uniquely associated with heroin dependence.	Preliminary research
Anderson et al 2020 (USA).39	Prognostic	Military patients (males) undergoing a specific orthopaedic procedure. Data from the Military Health System Data Repository.	10 919	Logistic regression, random forest, Bayesian belief network and gradient boosting machine models	Risk prediction of prolonged opioid use after a specific orthopaedic procedure.	The gradient boosting machine can be used to understand factors contributing to opiate misuse after anterior cruciate ligament reconstruction. The most influential features with a positive association for prolonged opioid use are preoperative morphine equivalents, pharmacy ordering site locations, shorter deployment time and younger age.	Local implementation and undergoing external validation.
Ben-Ari et al 2017 (USA).40	Retrospective cohort	Male patients in the Veterans Affairs system who had TKA. Data from EHRs.	32 636	Natural language processing-based machine learning classifier	Assessment of the association of long-term opioid use on adverse outcomes after TKA.	The accuracy of the text classifier was 0.94 with an AUROC of 0.99. Long-term opioid use prior to TKA was associated with an increased risk of knee revision during the first year after TKA among predominantly male patients.	Preliminary research.
Boslett et al 2020 (USA).41	Retrospective cohort	People with an unclassified drug overdose recorded in death records. Data from the National Centre for Health Statistics Detailed Multiple Cause of Death records.	632 331	Random forest ensemble	Comparison of methodologies to predict the involvement of opioids in unclassified drug overdose deaths to estimate the number of fatal opioid overdoses.	Random forest models performed similarly to logistic regression. Using a superior prediction model, the study found that 71.8% of unclassified drug overdoses in 1999–2016 involved opioids, approximately 28% more than reported. There was geographic variation in undercounting of opioid overdoses.	Preliminary research.
Calcaterra et al 2018 (USA).42	Retrospective cohort	All patients discharged from Denver Health Medical Centre, an integrated safety net health system. Data from EHRs.	27 705	Random forest, least absolute shrinkage and selection operator (LASSO), stepwise logistic regression	Prediction of COT in hospitalised patients not on opioids before hospitalisation.	The multiple logistic regression model correctly predicted 79% of the COT patients and 78% of the no COT patients. Accuracy was 78%, and the AUROC was 0.86. Risk factors for COT included more than 10 mg of morphine equivalents prescribed per day during hospitalisation, two or more opioid prescriptions filled in the year preceding hospitalisation, past year receipt of non-analgesic pain medications and past year receipt of benzodiazepines.	External validation required.
Che et al 2017 (USA).43	Retrospective cohort	Patients who received at least one opioid prescription. Data from EHRs.	102 166	Deep feed-forward neural network, recurrent neural network, logistic regression, support vector machine, random forest	Classification of opioid users.	The deep learning models were able to achieve superior classification performance and identify useful feature indicators for opioid-dependent and long-term users. Several diagnoses, such as ‘substance-related disorders’, ‘anxiety disorders’ and ‘other mental health disorders’, are all highly related to opioid dependence.	Model development planned.
Dong et al 2020 (USA).44	Retrospective cohort	Patients with at least one historic encounter before first opioid poisoning-related diagnosis. Data from EHRs.	555 000	Deep neural networks, random forest, decision tree, logistic regression	Risk prediction of future opioid poisoning and the most important features for such predictions.	EHR-based prediction achieved the best recall with the random forest method (85.7%) and the best precision with deep learning (precision). The top predictive feature with regard to diagnosis is ‘sedative, hypnotic or anxiolytic dependence, continuous’. The top predictive factors in the health facts dataset are pulse and heart rate.	Model development planned.
Ellis et al 2019 (USA).45	Case–control	Patients diagnosed with substance misuse. Data from EHRs.	7797	Random forest	Prediction of substance dependence based on lab tests and vital signs	The top machine learning classifier using all features achieved a mean AUROC of ~92%. The study found opioid-dependent patients have significantly higher white blood cell (WBC) count and respiratory disturbances. Opioid-dependent patients are also commonly malnourished, which is characterised by low red cell distribution width, and blood albumin compared with controls.	Preliminary research
Green et al 2019 (USA).46	Retrospective cohort	Patients who had overdosed on an opioid medication. Data from EHRs.	977	NLP	Identification and classification of opioid‐related overdoses.	Code-based algorithms developed to detect opioid‐related overdoses and classify them according to heroin involvement performed well. The NLP‐enhanced algorithms for suicides/suicide attempts and abuse‐related overdoses perform significantly better than code‐based algorithms and are appropriate for use in settings that have data and capacity to use NLP.	Model development required.
Haller et al 2017 (USA).47	Retrospective cohort	Patients with chronic non-cancer pain. Data from EHRs.	3668	NLP	Prediction of opioid abuse by use of automated risk assessments.	Confirmed through manual review, the NLP algorithm had 96.1% sensitivity, 92.8% specificity and 92.6% positive predictive value in identifying opioid agreement violation. Patients classified as high risk were three times more likely to violate opioid agreements compared with those with low/moderate risk.	External validation required.
Han et al 2020 (USA).48	Cross-sectional	Adolescents. Self-reported data collected from a survey.	41 579	Neural networks, distributed random forest, gradient boosting machine model	Prediction of opioid misuse.	The overall rate of opioid misuse among adolescents was 3.7% (n=1521). Prediction performance was similar across the four models, with AUROC values ranging from 0.809 to 0.815. In terms of the area under the precision-recall curve, the distributed random forest showed the best performance in prediction (0.172).	Preliminary research.
Hastings et al 2020 (USA).49	Retrospective cohort	Patients prescribed opioid medication. Data from EHRs.	80 768	Regularised regression, neural network	Prediction of future opioid dependence, abuse or poisoning in advance of an initial opioid prescription.	All models achieve an AUC near 0.800, indicating they have strong predictive power but could still be improved. The two variables with the largest ORs (indicating increased risk) are related to crime: release from prison and an indicator for an arrest. Individuals released from prison in the prior year are estimated as 119% more likely to develop an adverse outcome if given an initial prescription, all else equal, and those with an arrest in the prior year are 76% more likely to do so.	Preliminary research.
Hylan et al 2015 (USA).50	Retrospective cohort	Patients with chronic non-cancer pain initiating opioid therapy. Data from EHRs.	2752	NLP	Prediction of risk for problem opioid use in a primary care setting.	The AUROC (c-statistic) for problem opioid use was 0.739. As predictive models for problem opioid use are only moderately accurate, at best, there is always a need for the clinician’s vigilance to ensure safe and appropriate opioid use for long-term management of chronic musculoskeletal pain.	Preliminary research.
Karhade et al 2019A (USA).51	Case–control	Patients undergoing anterior cervical discectomy and fusion for degenerative disorders. Data from EHRs.	2737	Random forest, stochastic gradient boosting, neural network, support vector machine, elastic net penalised logistic regression	Prediction of sustained opioid prescription after anterior cervical discectomy and fusion.	The stochastic gradient boosting algorithm achieved the best performance (c-statistic 0.81). Global explanations of the model demonstrated that preoperative opioid duration, antidepressant use, tobacco use and Medicaid insurance were the most important predictors of sustained postoperative opioid prescription.	External validation required.
Karhade et al 2019B (USA).52	Retrospective cohort	Patients undergoing total hip arthroplasty for osteoarthritis. Data from EHRs.	5507	Stochastic gradient boosting, random forest, support vector machine, neural network, elastic net penalised logistic regression	Prediction of sustained postoperative opioid prescriptions after total hip arthroplasty	The elastic net penalised logistic regression model achieved the best performance (c-statistic 0.77). 6.3% of patients had prolonged postoperative opioid prescriptions. The factors determined for prediction of prolonged postoperative opioid prescriptions were age, duration of opioid exposure, preoperative haemoglobin and preoperative medications (antidepressants, benzodiazepines, nonsteroidal anti-inflammatory drugs and beta-2-agonists).	External validation required.
Karhade et al 2019C (USA).53	Case–control	Patients undergoing surgery for lumbar disc herniation. Data from EHRs.	5413	Random forest, stochastic gradient boosting, neural network, support vector machine, elastic-net penalised logistic regression.	Prediction of prolonged opioid prescription after surgery for lumbar disc herniation.	7.7% of patients were identified with sustained postoperative opioid prescription after surgery. The elastic-net penalised logistic regression model had the best overall performance (c-statistic 0.81). The three most important predictors were: instrumentation, duration of preoperative opioid prescription and comorbidity of depression.	Available online as open access.
Karhade et al 2020A (USA).54	Retrospective cohort	Opioid-naïve adults who underwent lumbar spine surgery. Data from EHRs.	8435	Random forest, stochastic gradient boosting, neural network, support vector machine, elastic-net penalised logistic regression.	Prediction of prolonged opioid prescriptions in opioid-naïve lumbar spine patients.	4.3% of patients were found to have prolonged postoperative opioid prescriptions. The elastic-net penalised logistic regression achieved the best performance (c-statistic=0.70). The five most important factors for prolonged opioid prescriptions were use of instrumented spinal fusion, preoperative benzodiazepine use, preoperative antidepressant use, preoperative gabapentin use and uninsured status.	Available online as open access.
Katakam et al 2020B (USA).55	Retrospective cohort	Patients undergoing surgery for total knee replacement. Data from EHRs.	12 542	Random forest, stochastic gradient boosting, neural network, support vector machine, elastic-net penalised logistic regression.	Preoperative prediction of prolonged opioid prescriptions after total knee replacement.	The stochastic gradient boosting model had the best performance. Age, history of preoperative opioid use, marital status, diagnosis of diabetes and several preoperative medications were predictive of prolonged postoperative opioid prescriptions.	External validation required.
Lo-Ciganic et al 2019 (USA).56	Prognostic	Patients without cancer receiving one or more opioid prescription. Data from prescription drug and medical claims.	560 057	Multivariate logistic regression, least absolute shrinkage and selection operator–type regression, random forest, gradient boosting machine, deep neural network.	Prediction of opioid overdose risk	The deep neural network (c-statistic=0.91) and gradient boosting machine (c-statistic=0.90) algorithms outperformed the other methods for predicting opioid overdose. The deep neural network classified patients into low-risk (76.2% of the cohort), medium-risk (18.6% of the cohort) and high-risk (5.2% of the cohort) subgroups, with only 1 in 10 000 in the low-risk subgroup having an overdose episode. More than 90% of overdose episodes occurred in the high-risk and medium-risk subgroups.	Preliminary research.
Lo-Ciganic et al 2020A (USA).57	Prognostic	Patients without cancer receiving one or more opioid prescription. Data from prescription drug and medical claims.	361 527	Elastic net, random forests, gradient boosting machine, deep neural network	Prediction of incident OUD diagnosis.	All approaches had similar prediction performances (c-statistic ranged from 0.874 to 0.882); elastic net required the fewest predictors. This algorithm was able to segment the population into different risk groups based on predicted risk scores, with 70% of the sample having minimal OUD risk, and half of the individuals with OUD captured in the top decile group.	Preliminary research.
McCann-Pineo et al 2020 (USA).58	Retrospective cohort	Patients ≥18 years who had an Emergency Department (ED) visit. Data from survey.	44 227	LASSO regularisation, elastic net regularisation, conditional inference random forest, gradient boosted machine, Naïve Bayes.	Prediction of opioid administration during an ED visit and prescribing on discharge.	The strongest predictors of ED opioid prescription were CT scan ordered, abdominal pain and back pain. Tooth pain and fracture injury diagnoses were the strongest predictors of a discharge opioid prescription.	Preliminary research.
Segal et al 2020 (USA).59	Retrospective cohort	Patients who had made medical insurance claims. Data from commercial claims database.	550 000	NLP, gradient boosting machine.	Prediction of early diagnosis of OUD.	The c-statistic for the model was 0.959. Significant differences between positive OUD- and negative OUD- controls were in the mean annual amount of opioid use days, number of overlaps in opioid prescriptions per year, mean annual opioid prescriptions and annual benzodiazepine and muscle relaxant prescriptions. The new algorithm offers a mean 14.4-month reduction in time to diagnosis of OUD.	Preliminary research.
Wadekar et al 2020 (USA).60	Retrospective cohort	Adults responding to the National Survey on Drug Use and Health (2016).	42 324	Random forest	Prediction of risk for opioid use disorder and identification of interactions between various characteristics that increase this risk.	Random forest predicted adults at risk for OUD with an average AUROC over 0.89. Initiation of marijuana before 18 years emerged as the dominant predictor. Early marijuana initiation increased the risk if individuals were between 18 and 34 years, or had incomes less than $49 000, or were of Hispanic and white heritage, or were on probation, or lived in neighbourhoods with easy access to drugs.	Preliminary research.
Surveillance and monitoring
Afshar et al 2019 (USA).61	Prognostic	Patients with opioid misuse who had an inpatient hospitalisation. Data from EHRs.	6224	NLP, topic modelling (Latent Dirichlet Allocation).	Identification of subtypes of patients with opioid misuse and examining the distinctions between the subtypes.	Distinct subtypes were identified after examining data and applying methods in artificial intelligence. Class 1 patients had high hospital utilisation with known opioid-related conditions (36.5%); class 2 included patients with illicit use, low socioeconomic status and psychoses (12.8%); class 3 contained patients with alcohol use disorders with complications (39.2%); and class 4 consisted of those with low hospital utilisation and incidental opioid misuse (11.5%).	External validation required.
Anwar et al 2020 (USA).62	Retrospective observational	Opioid-related Twitter posts relating to prescription opioids, heroin and synthetic opioids.	10 000 posts	NLP	Investigation of the extent to which the content of opioid-related tweets corresponds with the triphasic nature (shift from prescription opioids for pain to heroin and then to synthetic opioids) of the opioid crisis and correlates with opioid overdose deaths.	Tweets were classified as relating to prescription opioids, heroin and synthetic opioids using NLP. The pattern of opioid-related Twitter posts resembled the triphasic nature of the opioid crisis. Tweets mentioning heroin and synthetic opioids were significantly associated with opioid overdose deaths.	Preliminary research.
Badger et al 2019 (USA).63	Retrospective cohort	Patients with an International Classification of Diseases-9 and International Classification of Diseases-10 code related to opioid overdose and poisoning. Data from EHRs.	278	NLP, Naïve Bayes, support vector machine, LASSO logistic regression, random forest.	Development of machine learning models for classifying the severity of opioid overdose events.	Random forest models using features derived from a common data model and free text can be effective for classifying opioid overdose events. Key word features extracted using NLP such as ‘Narcan’ and ‘Endotracheal Tube’ are important for classifying overdose event severity.	External validation required.
Black et al 2020 (USA).64	Retrospective cohort	People who died of a drug poisoning. Data from surveillance system programme.	4008	NLP	Assessment of changes in mortality rates in ER/LA opioid analgesics after the implementation of the Risk Evaluation and Mitigation Strategy (REMS).	The NLP model correctly identified all active pharmaceutical ingredients with 100% sensitivity and specificity relative to what was printed on the death certificate. The population-adjusted mortality rate of ER/LA opioid analgesics has decreased after the implementation of the REMS in three states.	Preliminary research.
Cai et al 2020 (USA).65	Retrospective infoveillance	Indiana geolocated tweets filtered for geocoded messages in the immediate pre and post period of the HIV outbreak.	5112 posts	Unsupervised machine learning approach using NLP called the Biterm Topic Model.	Identification and characterisation of tweets related to the 2015 Indiana HIV outbreak.	The Biterm Topic Model identified 1350 tweets thought to be relevant to the outbreak and then confirmed 358 tweets using human annotation. The most prevalent themes identified were tweets related to self-reported abuse of illicit and prescription drugs, OUD, self-reported HIV status and public sentiment regarding the outbreak. Geospatial analysis found that these messages clustered in population dense areas outside of the outbreak.	Preliminary research.
Carrell et al 2015 (USA).66	Retrospective cohort	Patients receiving chronic opioid therapy, a mixed-model health plan. Data from EHRs.	22 142	NLP	Identification of problem opioid use from clinical notes of patients receiving chronic opioid therapy.	Traditional diagnostic codes for problem opioid use were found for 2240 (10.1%) patients. NLP-assisted manual review identified an additional 728 (3.1%) patients with evidence of clinically diagnosed problem opioid use in clinical notes.	Model development required.
Chary et al 2017 (USA).67	Retrospective observational	Tweets that contained at least one keyword related to prescription opioid use.	3 611 528 posts	NLP	Correlation of geographic variation of social media posts mentioning prescription opioid misuse with government estimates of misuse.	Natural language processing can be used to analyse social media to provide insights for syndromic toxicosurveillance. Mentions of misuse of prescription opioids (MUPO) on Twitter correlate strongly with state-by-state National Survey on Drug Use and Health (NSDUH) estimates of MUPO. The strongest correlation occurred between data from Twitter and NSDUH data from those aged 18–25 years.	Model development required.
Cuomo et al 2020 (USA).68	Retrospective observational	Tweets related to opioid, heroin/injection and HIV behaviour.	1350 posts	Unsupervised machine learning approach using NLP (Biterm Topic Model).	Identification and characterisation of HIV outbreak triggered by opioid abuse and transition to injection drug use.	Prevalent themes identified were tweets related to self-reported abuse of illicit and prescription drugs, opioid use disorder, self-reported HIV status and public sentiment regarding the outbreak. Geospatial analysis found messages clustered in population dense areas outside of the outbreak.	Preliminary research.
Fodeh et al 2021 (USA).69	Retrospective observational	Tweets containing key opioid-related keywords.	1677	NLP, recurrent neural networks, random forest, support vector machines.	Categorisation of twitter chatter based on the motive of opioid misuse.	A recurrent neural network classifier (XLNet) achieved the best performance. The model identified three groups of tweets: tweets of users with no OM, tweets of users with pain-related OM and tweets of users with recreational-related OM. Clinically, individuals who misuse opioids because of pain have different motivations and patterns of use.	Preliminary research.
Hazelhurst et al 2019 (USA).70	Retrospective cohort	Patients who had overdosed on opioid medication. Data from EHRs.	305	NLP	Identification and classification of opioid‐related overdoses.	The method performed well in identifying overdose, intentional overdose and involvement of opioids (excluding heroin) and heroin. The method performed poorly at identifying adverse drug reactions and overdose due to patient error and fairly at identifying substance abuse in opioid‐related unintentional overdose.	Preliminary research.
Jha and Singh 2019 (USA).71	Retrospective observational	Recreational drug use reddits, and drug addiction recovery reddits.	170 097	NLP and machine learning (SMARTS software)	Identification of individuals open to addiction recovery interventions using SMARTS. SMARTS is a public, open source, web-based application for addiction information extraction, analysis and modelling.	SMARTS generalises well across the different kinds of addiction posts and can identify individuals open to recovery interventions for intoxicants, such as opioids with an accuracy of 96%. The SMARTS web server and source code are available at: http://haddock9.sfsu.edu/.	Available online as open access.
Jungquist et al 2019 (USA).72	Retrospective observational	Patients undergoing back, neck, hip or knee surgery. Data from enrolled patients at a community hospital.	60	Support vector machine	Development of machine learning to aid in earlier detection of respiratory depression.	The model provides a high detection rate (>0.9) when the detection horizon is short. However, even for longer time horizons (eg, 10 min before the actual event), the model that uses all three electronic measurements is able to correctly predict nearly 80% of opioid-induced respiratory depression (OIRD) events. Nurses can use electronic monitoring data to identify patients experiencing OIRD to influence opioid-sparing pain management.	Preliminary research.
Kalyanam et al 2017 (USA).73	Retrospective observational	Tweets filtered for commonly abused prescription opioid drugs.	11 million posts	Biterm topic model	Identification of macro non-medical use of prescription opioid medication.	The cluster purity for each drug was up to three times better than that of a random set of tweets. Twitter content was associated with a high degree of discussion (approximately 80%) about polydrug abuse involving multiple types of substances.	Preliminary research.
Khemani et al 2017 (USA).74	Retrospective cohort	Patients presenting with abdominal pain to the emergency department. Data from EHRs.	16 121	NLP	Characterisation of opioid use, constipation and risk factors for surgical diagnoses among non-cancer patients presenting with acute abdominal pain (AAP).	Approximately 19% of adults presenting with AAP were opioid users; constipation is almost three times as likely in opioid users compared with non-opioid users presenting with AAP. Age and neutrophil count independently predicted increased risk, and chronic opioid use decreased risk of surgical diagnosis.	Preliminary research.
Li et al 2019 (USA).75	Retrospective observational	Instagram posts based on opioid keywords.	12 857	Recurrent neural network, random forest, decision tree, support vector machine	Identification of illegal internet drug dealing.	1228 drug dealer posts comprising 267 unique users were detected. The deep learning model reached an F1 score of 95% and performed better than the other three models. Removing the hashtags from the text improved model performance.	Model development required.
Lingeman et al 2017 (USA).76	Retrospective cohort	Primary care outpatients taking a prescribed opioid. Data from EHRs.	112	NLP, support vector machine	Surveillance of drug-related aberrant behaviour.	The model could differentiate clinical encounter notes that contain opioid-related aberrant behaviour from those that do not with relatively high accuracy (81%). Mentions of illicit drug use and patient anxiety were strong predictors of documented aberrant behaviour.	Model development planned.
Mackey et al 2017 (USA).77	Retrospective observational	Tweets filtered for prescription opioid keywords.	619 937 posts	Biterm topic model	Identification of the marketing of illegal online sales of controlled substances.	The biterm topic model enabled identification of 1778 tweets (0.29%) containing content associated with illicit online drug sales. These tweets represent a potential patient safety hazard and substance abuse risk.	Preliminary research.
Mackey et al 2018 (USA).78	Retrospective observational	Tweets using common opioid keywords.	213 041	Biterm topic model	Detection and reporting illicit online pharmacy selling of controlled substances.	Using the biterm topic model, 0.32% (692/213 041) tweets were identified as being associated with illegal online marketing and sale of prescription opioids. After removing duplicates and dead links, we identified 34 unique ‘live’ tweets, with 44% directing consumers to illicit online pharmacies, 32% linked to individual drug sellers and 21% used by marketing affiliates. In addition to offering the ‘no prescription’ sale of opioids, many of these vendors also sold other controlled substances and illicit drugs.	Prototype for potential scale-up developed.
Masters et al 2018 (USA).79	Case–control	Patients who received COT. Data from EHRs.	11 253	NLP	Associated healthcare costs of COT patients with POU.	COT patients with NLP-identified, manually validated POU experienced significantly higher costs than COT patients without POU in the first year following their index date. The greatest difference in costs was observed around the time of identification of POU. The difference was driven by large differences in resource utilisation in the 30 days following clinician labelling of POU.	Preliminary research.
Mojtabai et al 2019 (USA).80	Retrospective cohort	Adult participants using opioids in the past year. Data were self-reported from a national survey on drug use.	30 813	Boosted regression	Assessment of the prevalence and correlates of self-reported misuse of prescribed opioids.	In boosted regression analysis, misuse of prescription opioids without a prescription, misuse of prescribed benzodiazepines, other substance use disorders, illegal activities and psychological distress were the most influential factors associated with prescribed opioid misuse.	Preliminary research.
Palmer et al 2015 (USA).81	Retrospective cohort	Patients receiving COT. Data from EHRs.	22 142	NLP	Prevalence of POU.	Agreement between the NLP methods and ICD-9 coding was moderate (kappa 0.61). 9.4% of COT patients had current problem opioid use, with higher rates observed among young COT patients, patients who sustained opioid use for more than four quarters and patients who received higher opioid doses.	Preliminary research.
Panlilio et al 2020 (USA).82	Retrospective cohort	Methadone-maintained participants undergoing contingency-management treatment. Retrospective data from three randomised clinical trials.	309	Unsupervised machine learning	Patterns of opioid and cocaine use in contingency management, methadone-treated participants.	Four clusters of use patterns were identified, which can be described as opioid use, cocaine use, dual use (opioid and cocaine) and partial/complete abstinence. Contingency management increased membership in clusters with lower levels of drug use and fewer symptoms of substance use disorder.	Preliminary research.
Paulose et al 2018 (India).83	Retrospective observational	Tweets containing the keyword fentanyl.	4604	NLP	Identification of fentanyl misuse using social media posts.	The sentiment analysis algorithm labelled 610 (13.25%) tweets as positive. Crisis, dead, death, die, dose, drug, heroin, kill, lethal, opioid, overdose and police were some of the words frequently associated with fentanyl. There was a high correlation and association of fentanyl with these negative terms that demonstrated fentanyl abuse in the real world.	Preliminary research.
Prieto et al 2020 (USA).84	Retrospective cohort	Patients treated for opioid misuse (OM) by paramedics. Data from Denver Health paramedic trip reports.	54 359	Random forest, k-nearest neighbours, support vector machines, L1-regularised logistic regression.	Identification of potential OM from paramedic documentation.	L1-regularised logistic regression was the highest performing algorithm (AUROC=0.94) in identifying OM. Among trip reports with reviewer agreement, 77.79% (907/1166) were considered to include information consistent with OM.	Preliminary research.
Sarker et al 2019A (USA).85	Retrospective observational	Tweets that mentioned prescription and illicit opioids.	9006	Support vector machines, random forest, deep convolutional neural network	Monitoring of population-level opioid abuse.	Deep convolutional neural networks marginally outperformed support vector machines and random forests, with an accuracy of 70.4%. Geolocation data were able to identify the origins of tweets at the state level; it may be possible to further narrow down to the county or city level in the future.	Model development planned.
Sarker et al 2019B (USA).86	Retrospective observational	Tweets that mentioned prescription and illicit opioids.	9006	NLP, naïve Bayes, decision tree, k-nearest neighbours, random forest, support vector machine, deep convolutional neural network.	Development of an automatic text-processing pipeline for geospatial and temporal analysis of opioid-mentioning social media chat.	An ensemble of four classifiers (Ensemble_1) produced the best F1 score (0.726). 19.4% of tweets were related to abuse, 22.2% were related to information, 53.6% were unrelated and 4.7% were not in English. Yearly rates of abuse-indicating social media posts showed a statistically significant correlation with county-level opioid-related overdose death rates for 3 years.	Model development planned.
Sharma et al 2020 (USA).87	Retrospective cohort	Adult inpatients at a hospital and tertiary academic centre. Data from EHRs.	1000	Logistic regression, convolutional neural network	Development of a PHI free model for text classification of opioid misuse.	The top performing models with AUROCs >0.90 included concept unique identifier codes as inputs to a convolutional neural network, max pooling network and logistic regression model. The model demonstrates good test characteristics for an opioid misuse computable phenotype that is void of any PHI and performs similarly to models that use PHI.	Available via Github.
Singh et al 2019 (USA).88	Case series	Patients who were brought to the ED while undergoing naloxone treatment following an opioid overdose. Data from a wearable biosensor.	11	Random forest, AdaBoost, support vector machine, logistic regression.	Development of a wearable biosensor for detecting collaborative non-adherence during opioid abuse surveillance.	The best performing algorithm was the random forest, with an AUROC of 0.93. Average detection accuracy was 90.96% when the collaborator was one of the patients in the dataset, and 86.78% when the collaborator was from a set of users unknown to the classifier.	Model development planned.
Sinha et al 2017 (USA).89	Retrospective cohort	Patients treated with chronic pain medication. Data from EHRs.	212 343	NLP	Demographic patterns of opioid-dependent patients in New York.	Trends in the clinic population indicate that opioid dependence is more prevalent in a particular section of the population, predominantly non-Hispanic white patients aged 19–38 years.	Preliminary research.
Yao et al 2019 (USA).90	Case–control	Posts of suicidality among opioid users on Reddit.	45 459 posts	Convolutional neural network, logistic regression, random forest, support vector machines	Detection of suicidality among opioid users on Reddit	Best classifier was convolutional neural network, which obtained an F1 score of 96.6% and AUC of 0.932. When predicting out-of-sample data for posts containing both suicidal ideation and signs of opioid addiction, neural network classifiers produced more false positives and traditional methods produced more false negatives.	Preliminary research.
Pain management
Goyal et al 2020 (USA).91	Cross-sectional	Children (<18 years) who presented to the ED with a long bone fracture. Data from EHRs of 7 paediatric EDs.	8533	NLP	To identify if minority children with long-bone fractures are less likely to receive analgesics; receive opioid analgesics and achieve pain reduction.	NLP identified patients with radiology reports indicating long bone fractures. Minority children are more likely to receive analgesics and achieve two-point reduction in pain; however, they are less likely to receive opioids and achieve optimal pain reduction.	Preliminary research.
Gram et al 2017 (Germany).92	Retrospective cohort	Patients admitted to hospital for a total hip replacement. Data were clinical parameters recorded from patients the day prior to surgery.	81	Support vector machine	Preoperative identification of risk factors for analgesic inefficacy of postoperative opioid treatment.	The accuracy of the model was 65%. Severity of the presurgical chronic pain condition was a factor associated with postsurgical insufficiency of analgesic treatment. It was possible to predict analgesic efficacy based on the preoperative EEG recordings using machine learning, with similar accuracy to the chronic pain grade.	Preliminary research.
Graves et al 2018 (USA).93	Retrospective observational	Patients’ and caregivers’ reviews that contained the brand, generic or colloquial name of an opioid. Data from all Yelp reviews of hospitals.	836	NLP	Characterisation of patients’ and caregivers’ reviews about pain management and opioids.	Themes identified by natural language processing of opioid reviews with five-star and one-star ratings reflected pain management and opioid-related themes similar to those identified by manual coding. Yelp reviews describing experiences with pain management and opioids had lower ratings compared with other reviews. Negative descriptions of pain management and opioid-related experiences were more common than positive ones, and the themes they reflected were more diverse.	Preliminary research.
Gudin et al 2020 (USA).94	Cohort	Chronic pain patients. Data were self-reported in questionnaires.	127	Hybrid combining multi-objective optimisation and support vector regression.	Reducing opioid prescriptions by identifying responders to topical analgesic treatment.	The model predicted outcomes with an AUROC between 73.8% and 87.2%, which allowed its incorporation in a decision support system for selecting treatment for chronic pain patients.	External validation required.
Lee et al 2021 (USA).95	Retrospective cohort	Patients undergoing total joint replacement surgery (TJR). Data from EHRs and a patient survey conducted at a non-profit community hospital.	285	Random forest, XGBoost, logistic regression, support vector machine, k-nearest neighbours (K-NN), neural network models.	Identification of patients who may need fewer or more opioids after being discharged from TJR surgeries.	XGBoost and random forest models achieved the best test accuracy of 83% (AUROC 0.72;0.65). A machine learning classification model was developed that can identify patients expected to use fewer opioids and detect opioid overusers within 2 weeks after undergoing total joint replacement surgeries.	Preliminary research.
Nair et al 2020 (USA).96	Retrospective cohort	Adults (≥18 years) undergoing ambulatory surgery. Data from institution’s information management system data warehouse.	13 700	Multinomial regression, naïve Bayesian, neural network, random forest, extreme gradient boosting.	Prediction of postoperative opioid requirements for pain management of ambulatory surgery patients.	The best performing model, the random forest, predicted lower opioid requirements with better accuracy (89%) than higher opioid requirements (43%). The type of procedure, medical history and procedure duration were the top features contributing to model predictions. Overall, the contributions of patient and procedure features towards model predictions were 65% and 35%, respectively.	External validation required.
Pantano et al 2020 (Italy).97	Retrospective cohort	Adults diagnosed with cancer with stable background pain in the last week. Data from the Italian Oncologic Pain Survey.	4016	Unsupervised machine learning	Identification of clinical features for breakthrough cancer pain (BTcP) and differential opioid response.	The algorithm identified 12 distinct BTcP clusters. Optimal BTcP opioids-to-basal pain opioids ratios differed across the clusters, ranging from 15% to 50%. The optimal dose of BTcP opioids depended on the dose of basal opioids.	Available online as open access.
Parthipan et al 2019 (USA).98	Retrospective cohort	Patients receiving surgery at a tertiary care academic medical centre with symptoms of depression. Data from EHRs.	430	NLP, elastic net regularised regression	Postoperative pain management in depressed patients.	The NLP algorithm identified depression with an F1 score of 0.95. Patients receiving selective serotonin reuptake inhibitors (SSRIs) and opioid prodrug had significantly worse pain control at discharge, 3-week and 8-week follow-up. Preoperative pain, surgery type and opioid tolerance were the strongest predictors of pain control. The study results imply direct acting opioids (eg, oxycodone or morphine) may be better choices for depressed patients on SSRIs for pain management.	Preliminary research.
Patient support
Epstein et al 2020 (USA).99	Retrospective cohort	Participants seeking treatment for OUD at a treatment research clinic. Data were from self-reported questionnaires, physical examinations and psychological testing.	189	Random forest	Individual patient-level prediction of stress and drug craving 90 min in the future with passively collected Global Positioning System data.	Models achieved overall accuracy of 0.93 by the end of 16 weeks of tailoring. This was driven mostly by correct predictions of absence. For predictions of presence, ‘believability’ (positive predictive value) usually peaked in the high 0.70s towards the end of the 16 weeks. When target events are comparatively subtle, like stress or drug craving, accurate detection or prediction probably needs effortful input from users, not passive monitoring alone.	Model development planned.
Scherzer et al 2020 (USA).100	Pilot study using a prospective cohort design and qualitative interview.	Patients receiving buprenorphine treatment for OUD at an outpatient clinic. Data collected through the mobile app and interviews.	40	NLP	Development of mobile peer support for patients with OUD.	The Marigold App will allow patients to access a tailored support group 24/7 and is augmented with AI tools capable of understanding the emotional sentiment in messages, automatically ‘flagging’ critical or clinically relevant content.	Model development planned.

AI, artificial intelligence; AUROC, area under the receiver operating characteristic curve; COT, chronic opioid therapy; EHRs, electronic health records; ER, extended-release; LA, long-acting; LASSO, Least Absolute Shrinkage and Selection Operator; NLP, natural language processing; OM, opioid misuse; OUD, opioid use disorder; PHI, protected health information; POU, problem opioid use; TKA, total knee arthroplasty.