Learning Phenotypes and Dynamic Patient Representations via RNN Regularized Collective Non-Negative Tensor Factorization
Non-negative Tensor Factorization (NTF) has been shown to be effective at discovering clinically relevant and interpretable phenotypes from Electronic Health Records (EHR). Existing NTF-based computational phenotyping models aggregate data over the observation window, so the learned phenotypes are mixtures of disease states that appear at different times. We argue that separating the clinical events that occur at different times in the input tensor makes it possible to model the temporal dynamics and disease progression within the observation window, and that the learned phenotypes then correspond to more specific disease states. Yet how to construct the tensor for data samples of different temporal lengths, and how to properly capture the temporal relationships specific to each individual sample, remain open challenges. In this paper, we propose a novel Collective Non-negative Tensor Factorization (CNTF) model in which each patient is represented by a temporal tensor and all of the temporal tensors are factorized collectively, with the phenotype definitions shared across all patients. The proposed CNTF model is also flexible enough to incorporate non-temporal data modalities and RNN-based temporal regularization. We validate the proposed model on the MIMIC-III dataset, and the empirical results show that the learned phenotypes are clinically interpretable. Moreover, the proposed CNTF model outperforms state-of-the-art computational phenotyping models on the mortality prediction task.
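To make the idea of collective factorization with shared phenotype definitions concrete, the following is a minimal sketch and not the authors' formulation. It assumes each patient tensor has modes (time, diagnosis, medication), a CP-style decomposition, a squared-error loss, and standard multiplicative updates; the function name `collective_ntf` and all shapes are hypothetical. It omits the RNN-based temporal regularization and the non-temporal modality entirely, and in this simplified form it is equivalent to stacking all patients along the time axis.

```python
import numpy as np

def collective_ntf(tensors, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a list of non-negative (T_p, D, M) tensors with shared A, B.

    Each patient p gets its own temporal factor W[p] of shape (T_p, rank);
    the phenotype factors A (D, rank) and B (M, rank) are shared by all patients.
    """
    rng = np.random.default_rng(seed)
    D, M = tensors[0].shape[1:]
    A = rng.random((D, rank))
    B = rng.random((M, rank))
    W = [rng.random((X.shape[0], rank)) for X in tensors]
    losses = []
    for _ in range(n_iter):
        # Patient-specific temporal factors: one update per patient.
        G = (A.T @ A) * (B.T @ B)
        for p, X in enumerate(tensors):
            num = np.einsum('tdm,dr,mr->tr', X, A, B)
            W[p] *= num / (W[p] @ G + eps)
        # Shared factors: numerators and Gram matrices are summed over patients.
        WtW = sum(Wp.T @ Wp for Wp in W)
        num = sum(np.einsum('tdm,tr,mr->dr', X, Wp, B)
                  for X, Wp in zip(tensors, W))
        A *= num / (A @ (WtW * (B.T @ B)) + eps)
        num = sum(np.einsum('tdm,tr,dr->mr', X, Wp, A)
                  for X, Wp in zip(tensors, W))
        B *= num / (B @ (WtW * (A.T @ A)) + eps)
        # Total squared reconstruction error across all patients.
        losses.append(sum(
            np.sum((X - np.einsum('tr,dr,mr->tdm', Wp, A, B)) ** 2)
            for X, Wp in zip(tensors, W)))
    return W, A, B, losses

# Synthetic example: three patients with different numbers of time steps.
rng = np.random.default_rng(1)
lengths = [3, 5, 4]
tensors = [rng.random((T, 6, 4)) for T in lengths]
W, A, B, losses = collective_ntf(tensors, rank=2)
```

In this sketch, column r of `A` and `B` together plays the role of one shared phenotype definition, while row t of `W[p]` gives patient p's non-negative loading on each phenotype at time step t. Because the temporal factors are kept per patient, tensors of different temporal lengths need no padding or alignment.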