RT Journal Article
SR Electronic
T1 Machine-learning algorithm to non-invasively detect diabetes and pre-diabetes from electrocardiogram
JF BMJ Innovations
JO BMJ Innov
FD All India Institute of Medical Sciences
SP 32
OP 42
DO 10.1136/bmjinnov-2021-000759
VO 9
IS 1
A1 Anoop R Kulkarni
A1 Ashwini A Patel
A1 Kanchan V Pipal
A1 Sujeet G Jaiswal
A1 Manisha T Jaisinghani
A1 Vidya Thulkar
A1 Lumbini Gajbhiye
A1 Preeti Gondane
A1 Archana B Patel
A1 Manju Mamtani
A1 Hemant Kulkarni
YR 2023
UL http://innovations.bmj.com/content/9/1/32.abstract
AB Objectives Early detection is of crucial importance for prevention of type 2 diabetes and pre-diabetes. Diagnosis of these conditions relies on the oral glucose tolerance test and haemoglobin A1c estimation, which are invasive and challenging for large-scale screening. We aimed to combine the non-invasive nature of ECG with the power of machine learning to detect diabetes and pre-diabetes. Methods Data for this study come from the Diabetes in Sindhi Families in Nagpur study of an ethnically endogamous Sindhi population from central India. The final dataset included clinical data from 1262 individuals and 10 461 time-aligned heartbeats recorded digitally. The dataset was split into a training set, a validation set and an independent test set (8892, 523 and 1046 beats, respectively). The ECG recordings were processed with median filtering, band-pass filtering and standard scaling. Minority oversampling was undertaken to balance the training dataset before initiation of training. Extreme gradient boosting (XGBoost) was used to train the classifier, which used the signal-processed ECG as input and predicted membership of the ‘no diabetes’, pre-diabetes or type 2 diabetes classes (defined according to American Diabetes Association criteria). Results Prevalence of type 2 diabetes and pre-diabetes was ~30% and ~14%, respectively. Training was smooth and quick (convergence achieved within 40 epochs).
In the independent test set, the DiaBeats algorithm predicted the classes with 97.1% precision, 96.2% recall, 96.8% accuracy and 96.6% F1 score. The calibrated model had a low calibration error (0.06). The feature importance maps indicated that leads III, augmented Vector Left (aVL), V4, V5 and V6 contributed most to the classification performance. The predictions matched the clinical expectations based on the biological mechanisms of cardiac involvement in diabetes. Conclusions The machine-learning-based DiaBeats algorithm using ECG signal data accurately predicted diabetes-related classes. This algorithm can help in early detection of diabetes and pre-diabetes after robust validation in external datasets. Data are available on reasonable request. The data are confidential and not publicly available. The code and notebooks are available on reasonable request to the authors.
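
Note: the preprocessing named in the abstract (median filtering, standard scaling, minority oversampling) can be illustrated with a minimal pure-Python sketch. This is not the authors' code, which is available only on request. The function names, the kernel size of 3, the clamped edge handling and the use of plain random duplication for oversampling are all assumptions made for illustration. The band-pass filtering step and the XGBoost training are deliberately left out.

```python
import random
import statistics
from collections import Counter


def median_filter(signal, kernel_size=3):
    """Replace each sample with the median of a window centred on it.

    Edges are handled by clamping indices (an assumption; the abstract
    does not describe edge handling or the kernel size).
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    half = kernel_size // 2
    last = len(signal) - 1
    return [
        statistics.median(
            signal[min(max(j, 0), last)] for j in range(i - half, i + half + 1)
        )
        for i in range(len(signal))
    ]


def standard_scale(signal):
    """Rescale a signal to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    if std == 0:
        return [0.0 for _ in signal]
    return [(x - mean) / std for x in signal]


def oversample_minority(beats, labels, seed=0):
    """Randomly duplicate beats until every class is as large as the largest.

    Plain random duplication is an assumption; the abstract says only
    'minority oversampling' and does not name the method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_beats, out_labels = list(beats), list(labels)
    for cls, n in counts.items():
        pool = [b for b, y in zip(beats, labels) if y == cls]
        for _ in range(target - n):
            out_beats.append(rng.choice(pool))
            out_labels.append(cls)
    return out_beats, out_labels


# Toy usage: a five-sample "beat" with a spike at index 2 (made-up values).
beat = [0.0, 0.1, 5.0, 0.2, 0.1]
processed = standard_scale(median_filter(beat))

# Toy class imbalance; in practice only the training split would be balanced.
toy_labels = ["none", "none", "none", "pre", "t2d", "t2d"]
toy_beats = [processed] * len(toy_labels)
balanced_beats, balanced_labels = oversample_minority(toy_beats, toy_labels)
```

In this sketch the oversampling is applied after the split and only to the training portion, so that duplicated beats cannot leak into the validation or test sets. This matches the abstract's statement that balancing was applied to the training dataset.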