In-hospital fall incidence is a critical indicator of healthcare outcome. Predictive models for fall incidents could facilitate optimal resource planning and allocation for healthcare providers. In this paper, we proposed a tensor factorisation-based framework to capture the latent features for fall incident prediction over time. Experiments with real-world data from local hospitals in Hong Kong demonstrated that the proposed method could predict fall incidents reasonably well (with an area under the curve score of around 0.9). Compared with the baseline time series models, the proposed tensor-based models were able to identify high-risk locations that had no recorded fall incidents during the past few months.
The incidence of falls is a commonly used indicator of healthcare outcome. Injuries related to falls are the most common cause of accidental death for individuals over the age of 65 years in the USA and many other developed countries.
Researchers have categorised risk factors for inpatient falls into patient-related factors (such as age and cognitive function) and treatment-related factors (such as admission department functions).
There are a number of data mining studies focusing on predicting the fall incidence of different cohorts. Hill
Because of the small sample size of longitudinal data and the complexity of the data of hospital operations and individuals’ health conditions, it is very difficult to identify representative data and proper methods for reliable fall prediction. In fact, existing studies may not be able to achieve better performance than a simple clinical judgement of risk of falls.
The department and hospital of patients are related to the falls at different locations because of the characteristics of patients and healthcare services. For example, patients in
Unlike previous research, our study explored a novel tensor factorisation-based approach to predicting fall incidents across different departments and locations. In particular, the advantage of tensor factorisation-based models is their capability to capture the inherent relations between patients, locations and other information without explicitly incorporating patient characteristics.
In this research, we proposed a fourth-order tensor factorisation-based method to capture the latent relations between the four dimensions of attributes in the fall incident data for reliable fall prediction over time. The record of fall incidents maintained by hospitals contains important information on the identity and characteristics of patients, the time of the incidents, and the locations and departments of falls. This location-based information on in-hospital fall incidents has not been fully used for fall prediction. In particular, we evaluated the model using the fall incident data from seven public hospitals in Hong Kong, from January 2014 to September 2014. The data contain the following information: (A) time and date of the fall incidents, (B) the location of falls (eg, bedside and toilet), (C) department (eg, surgery and medicine) and (D) hospital.
In this research, we used the fall incident data from seven public hospitals in Hong Kong from 1 January 2014 to 30 September 2014. The data contain the occurrence date and time, department and specific location (eg, bedside and toilet) of each fall incident. The identity-related information of patients was excluded from the dataset.
One incident in the dataset (hypothetical data).
Tensors are multidimensional arrays describing the linear relations between objects. Tensors provide a natural mathematical framework for representing and solving problems in a wide range of areas. Tensor factorisation is the higher order extension of matrix factorisation, which is able to capture the latent patterns in multiway datasets.
CANDECOMP/PARAFAC (CP) decomposition is a highly interpretable factorisation that can be used to address temporal prediction problems. In addition, CP is unique under very mild assumptions,
To illustrate tensor factorisation, we present the decompositions of a matrix, a third-order tensor and a fourth-order tensor in
An illustration of the decomposition of (A) a matrix, (B) a third-order tensor and (C) a fourth-order tensor.
Therefore, a fourth-order tensor can be expressed as follows,
where
With a proper estimation of these latent feature vectors
Simple moving average (SMA) is a useful method for forecasting time series, particularly when there is no observed seasonality.
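As a minimal sketch of the SMA forecast described above, the snippet below averages the most recent observations of a series. The function name, the `window` parameter and the monthly counts are hypothetical illustrations; they are not taken from the study's data or code.

```python
def simple_moving_average(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical monthly fall counts for one location (not real data).
monthly_falls = [2, 0, 1, 3, 2]
sma_forecast = simple_moving_average(monthly_falls, window=3)
```

Here the forecast uses only the last three values (1, 3 and 2), so every observation inside the window carries the same weight.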
Exponentially weighted moving average (EWMA) is a method that uses a weighted average of the past observations.
where
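For illustration only, the sketch below uses the textbook EWMA recursion, in which each new observation is blended with the previous smoothed value. The smoothing factor `alpha` and the input series are assumptions made for this example; the exact weighting used in the study is not reproduced here.

```python
def ewma_forecast(series, alpha):
    """Return the exponentially weighted moving average of `series`.

    Uses the standard recursion s_t = alpha * x_t + (1 - alpha) * s_(t-1),
    initialised with the first observation. `alpha` is a hypothetical
    smoothing factor in (0, 1]; larger values favour recent observations.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = series[0]
    for x in series[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


# Hypothetical counts within one sliding window (not real data).
ewma_value = ewma_forecast([2, 0, 4], alpha=0.5)
```

Unlike SMA, the weights decay geometrically with the age of an observation, so the most recent month has the largest influence on the forecast.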
In this study, we developed models that combine the CP factorisation results with the SMA and EWMA methods to predict future fall incidence. Because we focused on location-specific fall prediction and did not have information on patients' identity, we did not include personal information of patients in our models. Instead, we aimed to model the temporal patterns of fall incidence at specific locations. As discussed in the introduction section, the department and hospital that patients belong to are related to the falls at different locations because such information could imply the different characteristics of patients and healthcare services. Therefore, the location-specific data used in this method include location, department and hospital. The multiway relations between locations, departments and hospitals could be naturally represented as a third-order tensor as shown in
Tensor representation of fall incidents data.
Then, the temporal patterns could be represented using a fourth dimension. In this case, we can define a tensor
where
Illustration of the fourth-order tensor.
As introduced earlier in this section, the fourth-order tensor could be factorised as the summation of
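The construction of such a count tensor can be sketched as follows. This is an assumed representation (nested lists indexed as hospital, department, location, period); the hospital, department and location names are hypothetical and the study's own data layout is not known.

```python
from collections import Counter


def build_count_tensor(incidents, hospitals, departments, locations, n_periods):
    """Return X[h][d][loc][t]: the number of falls per cell of the 4-way tensor.

    `incidents` is an iterable of (hospital, department, location, period)
    tuples, where `period` is a 0-based month index.
    """
    h_idx = {name: i for i, name in enumerate(hospitals)}
    d_idx = {name: i for i, name in enumerate(departments)}
    l_idx = {name: i for i, name in enumerate(locations)}
    counts = Counter(
        (h_idx[h], d_idx[d], l_idx[loc], t) for h, d, loc, t in incidents
    )
    # Counter returns 0 for cells with no recorded incident.
    return [[[[counts[(h, d, loc, t)] for t in range(n_periods)]
              for loc in range(len(locations))]
             for d in range(len(departments))]
            for h in range(len(hospitals))]


# Hypothetical incident records (not real data).
incidents = [
    ("H1", "Medicine", "Bedside", 0),
    ("H1", "Medicine", "Bedside", 0),
    ("H2", "Surgery", "Toilet", 2),
]
tensor = build_count_tensor(
    incidents, ["H1", "H2"], ["Medicine", "Surgery"], ["Bedside", "Toilet"], 3
)
```

Most cells of such a tensor are zero, which is one reason count-oriented factorisations are a natural fit for this kind of data.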
And each entry in the original tensor is approximated as below
CP decomposition has the advantage of interpretability to extract the latent factors to capture the clustering information of certain hospitals, departments, locations and, particularly, the temporal information in the latent feature vector
Illustration of temporal prediction based on fourth-order tensors.
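To make the entry-wise approximation concrete, the sketch below evaluates one entry of a rank-R CP model as a weighted sum over components of the products of the four factor entries. The symbols (`weights`, `H`, `D`, `L`, `T`) and the tiny rank-2 factors are assumptions chosen for this example and are not fitted to any data.

```python
def cp_entry(weights, H, D, L, T, h, d, loc, t):
    """Approximate X[h][d][loc][t] from CP factors.

    Each factor matrix is a list of rows with one column per component, so
    the entry is sum over r of weights[r] * H[h][r] * D[d][r] * L[loc][r] * T[t][r].
    """
    return sum(
        w * H[h][r] * D[d][r] * L[loc][r] * T[t][r]
        for r, w in enumerate(weights)
    )


# Hypothetical rank-2 factors: 2 hospitals, 1 department, 2 locations, 2 periods.
weights = [1.0, 2.0]
H = [[1.0, 0.0], [0.5, 1.0]]
D = [[1.0, 1.0]]
L = [[2.0, 0.0], [0.0, 1.0]]
T = [[1.0, 1.0], [0.0, 2.0]]

entry_a = cp_entry(weights, H, D, L, T, h=0, d=0, loc=0, t=0)
entry_b = cp_entry(weights, H, D, L, T, h=1, d=0, loc=1, t=1)
```

Because every component couples all four modes, an entry can receive a non-zero estimate even when its own observed count is zero, as long as related hospitals, departments or locations share a component with it.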
The values in the tensor are the number of fall incidents at each location during a time period. We therefore modelled the count data with the Poisson distribution, a common choice for counts.
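As a hedged illustration of what a Poisson model of the counts implies, the function below computes the Poisson negative log-likelihood of observed counts given model rates. It is a generic sketch of the kind of objective a Poisson factorisation minimises, not the fitting routine used in the study; the function name and inputs are hypothetical.

```python
import math


def poisson_neg_log_likelihood(counts, rates):
    """Sum of -log P(x | m) for Poisson-distributed counts x with rates m > 0."""
    total = 0.0
    for x, m in zip(counts, rates):
        if m <= 0:
            raise ValueError("rates must be strictly positive")
        # -log( m^x * exp(-m) / x! ) = m - x*log(m) + log(x!)
        total += m - x * math.log(m) + math.lgamma(x + 1)
    return total


# A rate that matches the observed count fits better than one far from it.
good_fit = poisson_neg_log_likelihood([2], [2.0])
poor_fit = poisson_neg_log_likelihood([2], [0.5])
```

A lower value indicates that the modelled rates explain the observed counts better, which is the sense in which a Poisson-based factorisation suits sparse, non-negative count data.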
After we factorised the tensor, we applied an SMA-based heuristic approach proposed in Dunlavy
In addition, we also proposed another model using EWMA instead of SMA. The prediction of the temporal dimension could be calculated recursively in the selected sliding window.
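The idea of forecasting through the temporal factor can be sketched as follows: the last rows of the temporal factor matrix are collapsed into one value per component, and that value replaces the temporal entry when reconstructing a count. This is a simplified illustration under assumed names (`T`, `gamma`, `weights`) and made-up factor values, not a reproduction of the MATLAB implementation used in the study. An EWMA variant would replace the plain mean with a recursively weighted average over the same rows.

```python
def temporal_weights_sma(T, window):
    """Average the last `window` rows of the temporal factor, per component."""
    if window <= 0 or window > len(T):
        raise ValueError("window must be between 1 and the number of periods")
    recent = T[-window:]
    rank = len(T[0])
    return [sum(row[r] for row in recent) / window for r in range(rank)]


def predicted_count(weights, H, D, L, gamma, h, d, loc):
    """Predicted falls for (hospital h, department d, location loc) next period."""
    return sum(
        w * H[h][r] * D[d][r] * L[loc][r] * gamma[r]
        for r, w in enumerate(weights)
    )


# Hypothetical rank-2 factors (not fitted to real data).
T = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]  # three past periods
gamma = temporal_weights_sma(T, window=2)
score = predicted_count([1.0, 1.0], [[1.0, 0.0]], [[1.0, 1.0]], [[0.2, 0.5]],
                        gamma, h=0, d=0, loc=0)
```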
The risk score is set to be the predicted number of fall incidents for the corresponding location. We can rank the scores of all locations to identify those with a high risk of fall incidents.
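Ranking by risk score amounts to sorting locations by their predicted counts. The short sketch below assumes the scores are held in a dictionary keyed by (hospital, department, location); the keys and values are hypothetical.

```python
def rank_locations(risk_scores, top_k=None):
    """Return (location, score) pairs sorted from highest to lowest risk."""
    ranked = sorted(risk_scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical predicted counts (not real results).
risk_scores = {
    ("H1", "Medicine", "Bedside"): 1.4,
    ("H1", "Surgery", "Toilet"): 0.3,
    ("H2", "Medicine", "Corridor"): 0.9,
}
top_two = rank_locations(risk_scores, top_k=2)
```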
To validate the effectiveness of the proposed tensor-based method, we performed two experiments with the 9-month in-hospital fall incident data. In the first experiment, we used a 3-month sliding window (N=3): we predicted the fall incidents in a month using the data of the preceding 3 months. In the second experiment, we used a 5-month sliding window (N=5), predicting the fall incidents in a month using the data of the preceding 5 months. We chose these two sliding windows for two reasons: (A) if the sliding window were too narrow (eg, N=1 or N=2), there would not be sufficient data to forecast, and if it were too broad (eg, N=7 or N=8), there would not be sufficient data to test the performance; (B) using two sliding windows allowed us to test the sensitivity of the proposed model to the window size.
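The trade-off between window width and the number of testable months can be seen in a small sketch. It assumes months are indexed 0 to 8 and that every month preceded by a full window is forecast once; the study's exact evaluation protocol may differ.

```python
def sliding_window_splits(n_periods, window):
    """Return (training_months, target_month) pairs for a fixed-width window."""
    return [
        (list(range(t - window, t)), t)
        for t in range(window, n_periods)
    ]


splits_n3 = sliding_window_splits(n_periods=9, window=3)
splits_n5 = sliding_window_splits(n_periods=9, window=5)
```

Under this assumption, nine months of data give six forecastable months for N=3 and four for N=5, while N=8 would leave a single month for testing.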
We implemented the proposed method by using MATLAB with the tensor_toolbox for CPAPR decomposition-based models,
To evaluate the capability of the proposed models in predicting the risk of fall incidents at specific locations, we conducted two sets of experiments. First, we used the models to predict whether there would be fall incidents (true or false). Second, we used the models to predict the actual number of fall incidents. The first experiment is very practical for clinical decision making, as healthcare providers usually need to know which locations require attention. The second experiment, though still practical, is less conclusive because many sources of uncertainty are not covered by the dataset. In addition, the experiments reproduced some expected findings, such as the bedside being a high-risk location for fall incidents.
In the first experiment, we adopted the receiver operating characteristic (ROC) curve and the area under the curve (AUC) to evaluate the performance. ROC and AUC are widely used performance measures on continuous or ordinal scales.
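For reference, AUC can be computed directly from its probabilistic interpretation: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below is a generic implementation with hypothetical labels and scores, not the evaluation code used in the study.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = a fall occurred) and risk scores.
example_auc = auc_score([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.4])
```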
ROC curves and AUC score for fall prediction (n=3). We used the data in the last 3 months (T−1, T−2 and T−3) to predict the fall incidence of a specific month T. AUC, area under the curve; CP, CANDECOMP/PARAFAC; CPAPR, CP Alternating Poisson Regression; EWMA, Exponentially Weighted Moving Average; NMTF, Non-negative Multiple Tensor Factorisation; ROC, receiver operating characteristic; SMA, simple moving average.
ROC curves and AUC score for fall prediction (n=5). We used the data in the last 5 months (T−1, T−2, T−3, T−4 and T−5) to predict the fall incidence of a specific month T. AUC, area under the curve; CP, CANDECOMP/PARAFAC; CPAPR, CP Alternating Poisson Regression; EWMA, Exponentially Weighted Moving Average; NMTF, Non-negative Multiple Tensor Factorisation; ROC, receiver operating characteristic; SMA, simple moving average.
The average AUC score is shown in
Average AUC of fall incidents prediction
Model (n = number of historical periods)  n=3  n=5
SMA  0.835  0.900
EWMA  0.835  0.900
CPAPR-SMA (rank=10)  0.883  0.925
CPAPR-SMA (rank=20)  …  0.917
CPAPR-SMA (rank=30)  0.875  0.929
CPAPR-EWMA (rank=10)  0.883  0.923
CPAPR-EWMA (rank=20)  0.892  0.916
CPAPR-EWMA (rank=30)  0.875  0.928
NMTF-SMA (rank=10)  0.889  …
NMTF-SMA (rank=20)  …  0.932
NMTF-SMA (rank=30)  0.888  0.932
NMTF-EWMA (rank=10)  0.880  0.915
NMTF-EWMA (rank=20)  0.884  0.916
NMTF-EWMA (rank=30)  0.880  0.916
AUC, area under the curve; CP, CANDECOMP/PARAFAC; CPAPR, CP Alternating Poisson Regression; EWMA, Exponentially Weighted Moving Average; NMTF, Non-negative Multiple Tensor Factorisation; SMA, simple moving average.
The ROC curve provides the visualisation for the selection of the threshold for positive outcomes (occurrence of fall incidents). From experiments with both sliding windows, we observed similar results:
If a high threshold is used for prediction, both the true positive rate (TPR) and the false positive rate (FPR) are very low for all models. This is because, with a higher threshold, only those locations where fall incidents occurred in the past few months would be selected.
If a low threshold is used for prediction, the TPR is very high but so is the FPR for all models. This is not acceptable because it produces too many false predictions.
If a proper threshold in the middle is used for prediction, the performance of the proposed tensor-based models (both CPAPR and NMTF) is better than that of the baseline models (SMA and EWMA). The results are expected because pure time series models are not able to predict fall incidents at the ‘safe’ locations with no fall record in the past few months. Unlike SMA and EWMA, the proposed methods are able to predict such incidents by exploring the structure of the other dimensions.
From the comparison of ROC curves, we found that the proposed tensor-based models were able to achieve better performance than the baseline time series models in identifying high-risk locations, especially those without a record of fall incidents in the last few months.
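The threshold behaviour described above can be illustrated with a small sketch that computes one ROC point (TPR, FPR) per threshold. The labels and scores are hypothetical and mimic a pure time series baseline, which assigns a score of zero to any location with no falls in the window; both classes are assumed to be present.

```python
def roc_point(labels, scores, threshold):
    """Return (TPR, FPR) when locations with score >= threshold are flagged."""
    tp = sum(1 for y, s in zip(labels, scores) if y and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if not y and s >= threshold)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


# Hypothetical case: the second location had a fall but no history (score 0.0).
labels = [1, 1, 0, 0]
scores = [0.8, 0.0, 0.3, 0.0]

high = roc_point(labels, scores, threshold=0.5)
middle = roc_point(labels, scores, threshold=0.2)
low = roc_point(labels, scores, threshold=0.0)
```

In this toy case the baseline can only recover the location without history by lowering the threshold until every location is flagged, which is the limitation that a model sharing information across hospitals, departments and locations is intended to avoid.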
We adopted the commonly used root-mean-square error (RMSE) metric to evaluate the performance of the proposed models in predicting the actual number of fall incidents, as shown in
RMSE for fall prediction
Model (n = number of historical periods)  n=3  n=5
SMA  0.211  0.200
EWMA  0.212  0.197
CPAPR-SMA (rank=10)  0.207  0.196
CPAPR-SMA (rank=20)  0.207  0.196
CPAPR-SMA (rank=30)  0.211  0.197
CPAPR-EWMA (rank=10)  0.217  0.198
CPAPR-EWMA (rank=20)  0.212  0.203
CPAPR-EWMA (rank=30)  0.217  0.207
NMTF-SMA (rank=10)  0.282  0.350
NMTF-SMA (rank=20)  0.286  0.351
NMTF-SMA (rank=30)  0.286  0.350
NMTF-EWMA (rank=10)  0.219  0.198
NMTF-EWMA (rank=20)  0.220  0.217
NMTF-EWMA (rank=30)  0.219  0.217
CP, CANDECOMP/PARAFAC; CPAPR, CP Alternating Poisson Regression; EWMA, Exponentially Weighted Moving Average; NMTF, Non-negative Multiple Tensor Factorisation; RMSE, root-mean-square error; SMA, simple moving average.
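For completeness, RMSE is the square root of the mean squared difference between predicted and observed counts. The sketch below is a generic implementation with hypothetical values; it does not reproduce the figures in the table.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and predicted fall counts for four locations.
example_rmse = rmse([0, 0, 2, 0], [0, 0, 0, 0])
```

Because most locations record no falls in a given month, RMSE over all locations is dominated by the many near-zero cells, which may partly explain why the models differ less on this metric than on AUC.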
We proposed a tensor-based framework to exploit the multidimensional structure for temporal prediction of fall incidents in hospitals. We developed a set of tensor-based machine learning models to predict the occurrence of fall incidents. After evaluating the performance of the proposed models in two sets of experiments, we conclude that tensor-based models are useful tools for identifying high-risk locations. However, the advantage of using tensor-based models to predict the actual number of fall incidents at specific locations is not significant.
There are several limitations of this research. First, due to the highly sensitive nature of medical-related data, it was very difficult for us to obtain other datasets for cross-validation. The performance of the proposed tensor-based models was only evaluated using one dataset in this study. It is important to identify public benchmark data to further demonstrate the efficacy of the proposed models. Second, we only investigated the risk of fall incidents at specific locations (part of the treatment-related factors), instead of individual patients (patient-related factors). Our proposed models have the potential to be applied in individualised fall risk assessment for patients. In our ongoing research, we plan to work with nurses and hospital authorities to extract patient-level demographic information, including an anonymous identifier, so that we could predict the risk of falls for each patient.
The proposed fall prediction models could generate data-driven insights into patients’ fall incidence and could inform more effective and timely fall prevention programmes. This is of particular importance for Hong Kong because of its ageing population. It is also applicable to other ageing societies (such as Japan). In our follow-up research, we plan to work with nurses and health authorities to develop an automatic early warning system to predict fall incidents at different locations and for individual patients.
The authors would like to thank the editor and anonymous reviewers for their valuable and constructive comments.
HW and QZ designed the model and performed the experiments. ZSYW, AK and HYS collected the data. HW and ZSYW performed data analysis. All authors contributed to the writing of the paper.
HW and QZ are supported by the National Natural Science Foundation of China (NSFC) under grants 71402157 and 71672163 and in part by the Theme-Based Research Scheme of the Research Grants Council of Hong Kong under Grant T32102/14N.
None declared.
Not required.
Research committee of City University of Hong Kong.
Not commissioned; externally peer reviewed.
The raw hospital data cannot be shared. The authors are happy to share the code and hypothetical dummy data used in this research. Please email qingpeng.zhang@cityu.edu.hk.