Estimating the Causal Effect from Partially Observed Time Series
Many real-world systems involve interacting time series. The ability to detect causal dependencies between system components from observed time series of their outputs is essential for understanding system behavior. The quantification of causal influences between time series is based on the definition of some causality measure. Partial Canonical Correlation Analysis (Partial CCA) and its extensions are examples of methods used for robustly estimating the causal relationships between two multidimensional time series even when the time series are short. These methods assume that the input data are complete and have no missing values. However, real-world data often contain missing values. It is therefore crucial to estimate the causality measure robustly even when the input time series is incomplete. Treating this problem as a semi-supervised learning problem, we propose a novel semi-supervised extension of probabilistic Partial CCA called semi-Bayesian Partial CCA. Our method exploits the information in samples with missing values to prevent the overfitting of parameter estimation even when there are few complete samples. Experiments based on synthesized and real data demonstrate the ability of the proposed method to estimate causal relationships more correctly than existing methods when the data contain missing values, the dimensionality is large, and the number of samples is small.