Yang Zhou, Li Zheng, Xuerui Yang, Zhang Linxia, Shireesh Srivastava, Rong Jin, Christina Chan
Reconstructing gene networks from micro-array data can provide information on the mechanisms that govern cellular processes. Numerous studies have been devoted to addressing this problem. A popular method is to view the gene network as a Bayesian inference network, and to apply structure learning methods to determine the topology of the gene network. There are, however, several shortcomings with the Bayesian structure learning approach for reconstructing gene networks. They include high computational cost associated with analyzing a large number of genes and inefficiency in exploiting prior knowledge of co-regulation that could be derived from Gene Ontology (GO) information. In this paper, we present a knowledge driven matrix factorization (KMF) framework for reconstructing modular gene networks that addresses these shortcomings. In KMF, gene expression data is initially used to estimate the correlation matrix. The gene modules and the interactions among the modules are derived by factorizing the correlation matrix. The prior knowledge in GO is integrated into matrix factorization to help identify the gene modules. An alternating optimization algorithm is presented to efficiently find the solution. Experiments show that our algorithm performs significantly better in identifying gene modules than several state-of-the-art algorithms, and the interactions among the modules uncovered by our algorithm are proved to be biologically meaningful.
Subjects: 12. Machine Learning and Discovery; Please choose a second document classification
Submitted: Apr 9, 2008