Table 2

Results of the models on tasks 1–4 for threefold optimisation of the number of training epochs based on the rotated development sets, using the frozen optimal model parameters from table 1. Train+development/test sample counts are displayed alongside each task. In each case, testing is performed on the held-out test fold. The mean area under the receiver operating characteristic curve (AUC-ROC) and the mean unweighted average recall (UAR) are displayed. 95% CIs are also shown, computed following ref 33 for AUC-ROC and with the normal approximation method for UAR. Scores in bold indicate significant results at α=0.05 using a two-sample t-test for no difference in means between the baseline and CIdeR, based on the SD from the threefold cross-optimisation.

Task (train+development/test)   CIdeR AUC     CIdeR UAR     Baseline AUC   Baseline UAR
1 (688/238)                     0.827±0.051   0.770±0.053   0.697±0.066    0.677±0.059
2 (146/28)*                     0.570±0.216   0.535±0.185   0.677±0.059    0.583±0.183
3 (118/32)*                     0.909±0.130   0.774±0.145   0.559±0.220    0.506±0.173
4 (684/350)                     0.846±0.040   0.765±0.044   0.721±0.053    0.654±0.050
  • *It is questionable whether the normality assumption holds at these small sample sizes. The CI estimates should therefore be interpreted with caution.

  • CIdeR, COVID-19 Identification ResNet.
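
The normal approximation used for the UAR intervals can be illustrated with a minimal sketch. It assumes that the interval half-width is z·sqrt(p(1−p)/n), with z=1.96 for a 95% CI, p the reported UAR and n the test sample count; the function name `normal_approx_half_width` is hypothetical and is not taken from the authors' code. Under these assumptions, the half-widths agree with the CIdeR UAR margins reported in the table to three decimal places.

```python
import math


def normal_approx_half_width(score: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation (Wald) interval.

    Assumption: the score is treated as a proportion estimated from n
    independent test samples; z = 1.96 corresponds to a 95% interval.
    """
    return z * math.sqrt(score * (1.0 - score) / n)


# (task, CIdeR UAR, test sample count), copied from the table above
rows = [(1, 0.770, 238), (2, 0.535, 28), (3, 0.774, 32), (4, 0.765, 350)]

for task, uar, n_test in rows:
    half_width = normal_approx_half_width(uar, n_test)
    print(f"Task {task}: UAR {uar:.3f} ± {half_width:.3f}")
```

The wide intervals for tasks 2 and 3 follow directly from their test sets of 28 and 32 samples, which is also where the normality assumption is weakest.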