PT - JOURNAL ARTICLE
AU - Kulkarni, Anoop R
AU - Patel, Ashwini A
AU - Pipal, Kanchan V
AU - Jaiswal, Sujeet G
AU - Jaisinghani, Manisha T
AU - Thulkar, Vidya
AU - Gajbhiye, Lumbini
AU - Gondane, Preeti
AU - Patel, Archana B
AU - Mamtani, Manju
AU - Kulkarni, Hemant
TI - Machine-learning algorithm to non-invasively detect diabetes and pre-diabetes from electrocardiogram
AID - 10.1136/bmjinnov-2021-000759
DP - 2023 Jan 01
TA - BMJ Innovations
PG - 32--42
VI - 9
IP - 1
4099 - http://innovations.bmj.com/content/9/1/32.short
4100 - http://innovations.bmj.com/content/9/1/32.full
SO - BMJ Innov 2023 Jan 01; 9
AB - Objectives Early detection is of crucial importance for prevention of type 2 diabetes and pre-diabetes. Diagnosis of these conditions relies on the oral glucose tolerance test and haemoglobin A1c estimation which are invasive and challenging for large-scale screening. We aimed to combine the non-invasive nature of ECG with the power of machine learning to detect diabetes and pre-diabetes. Methods Data for this study come from Diabetes in Sindhi Families in Nagpur study of ethnically endogenous Sindhi population from central India. Final dataset included clinical data from 1262 individuals and 10 461 time-aligned heartbeats recorded digitally. The dataset was split into a training set, a validation set and independent test set (8892, 523 and 1046 beats, respectively). The ECG recordings were processed with median filtering, band-pass filtering and standard scaling. Minority oversampling was undertaken to balance the training dataset before initiation of training. Extreme gradient boosting (XGBoost) was used to train the classifier that used the signal-processed ECG as input and predicted the membership to ‘no diabetes’, pre-diabetes or type 2 diabetes classes (defined according to American Diabetes Association criteria). Results Prevalence of type 2 diabetes and pre-diabetes was ~30% and ~14%, respectively.
Training was smooth and quick (convergence achieved within 40 epochs). In the independent test set, the DiaBeats algorithm predicted the classes with 97.1% precision, 96.2% recall, 96.8% accuracy and 96.6% F1 score. The calibrated model had a low calibration error (0.06). The feature importance maps indicated that leads III, augmented Vector Left (aVL), V4, V5 and V6 were most contributory to the classification performance. The predictions matched the clinical expectations based on the biological mechanisms of cardiac involvement in diabetes. Conclusions Machine-learning-based DiaBeats algorithm using ECG signal data accurately predicted diabetes-related classes. This algorithm can help in early detection of diabetes and pre-diabetes after robust validation in external datasets. Data are available on reasonable request. The data are confidential and not publicly available. The codes and notebooks are available on reasonable request to the authors.
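
The abstract reports precision, recall, accuracy and F1 score for a three-class problem but does not say how the per-class values were averaged. As an illustration only, the sketch below computes these metrics with macro averaging (an unweighted mean over the classes), which is an assumption and not necessarily the authors' choice. The function name `macro_metrics`, the class labels and the toy labels are all hypothetical; none of this is the DiaBeats code or its data.

```python
from collections import Counter

# Hypothetical label names for the three classes described in the abstract.
CLASSES = ("no diabetes", "pre-diabetes", "type 2 diabetes")


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count each (true label, predicted label) pair: a sparse confusion matrix.
    pairs = Counter(zip(y_true, y_pred))

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = pairs[(c, c)]
        predicted = sum(n for (_, p), n in pairs.items() if p == c)
        actual = sum(n for (t, _), n in pairs.items() if t == c)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    correct = sum(pairs[(c, c)] for c in classes)
    k = len(classes)
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,  # mean of the per-class F1 scores
    }


# Toy labels for eight beats (invented for illustration, not study data).
nd, pre, t2 = CLASSES
y_true = [nd, nd, nd, pre, pre, t2, t2, t2]
y_pred = [nd, nd, pre, pre, pre, t2, t2, nd]
metrics = macro_metrics(y_true, y_pred)
```

With class prevalences as unequal as those reported (~30% and ~14%), macro and prevalence-weighted averages can differ noticeably, so the averaging scheme matters when comparing against the published figures.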