Salim Khan, William Gillis, Carl Schmidt, and Keith Decker
As genomic and proteomic data are collected by high-throughput methods on a daily basis, subcellular components are identified and their in vitro behavior is characterized. However, much less is known about their in vivo activity because of the complex subcellular milieu in which they operate. A component's milieu is determined by the biological pathways in which it participates and, hence, by the mechanisms that regulate it. We believe AI planning technology provides a suitable modeling formalism for biological pathway discovery, one in which hypothetical pathways can be generated, queried, and qualitatively simulated. We recast signal transduction pathway discovery as a planning problem in which the initial and final states are known and cellular processes are captured as abstract operators that modify the cellular environment. A valid plan that transforms the initial state into a goal state is thus a hypothetical pathway: it prescribes the order of signaling events that must occur to bring about the goal state. The planner is driven by data stored in a knowledge base and retrieved from heterogeneous sources (including gene expression, protein-protein interaction, and literature mining) by a multi-agent information gathering system. We demonstrate the combined technology by translating the well-known EGF pathway into the planning formalism and deploying the Fast-Forward planner to reconstruct the pathway directly from the knowledge base.
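
As a minimal illustration of this recasting, the sketch below encodes a highly simplified EGF-to-ERK cascade as STRIPS-style operators (preconditions, add effects, delete effects) and searches for a plan. It is not the encoding used in this work: the operator names, the fact names, and the choice of which species are consumed are all assumptions made for illustration, and plain breadth-first search stands in for the heuristic search performed by the Fast-Forward planner.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Operator(NamedTuple):
    """A cellular process as an abstract STRIPS-style operator."""
    name: str
    pre: FrozenSet[str]      # facts that must hold for the process to occur
    add: FrozenSet[str]      # facts made true by the process
    delete: FrozenSet[str]   # facts made false by the process


def op(name: str, pre: Iterable[str], add: Iterable[str],
       delete: Iterable[str] = ()) -> Operator:
    return Operator(name, frozenset(pre), frozenset(add), frozenset(delete))


# Hypothetical, simplified operators for the EGF -> ERK cascade.
OPERATORS = [
    op("bind_EGF_EGFR", {"EGF", "EGFR"}, {"EGF:EGFR"}, {"EGF", "EGFR"}),
    op("activate_EGFR", {"EGF:EGFR"}, {"EGFR_active"}),
    op("recruit_Grb2_SOS", {"EGFR_active", "Grb2_SOS"}, {"SOS_at_membrane"}),
    op("activate_Ras", {"SOS_at_membrane", "Ras_GDP"}, {"Ras_GTP"}, {"Ras_GDP"}),
    op("activate_Raf", {"Ras_GTP", "Raf"}, {"Raf_active"}, {"Raf"}),
    op("phosphorylate_MEK", {"Raf_active", "MEK"}, {"MEK_P"}, {"MEK"}),
    op("phosphorylate_ERK", {"MEK_P", "ERK"}, {"ERK_P"}, {"ERK"}),
]


def plan(initial: Iterable[str], goal: Iterable[str],
         operators: List[Operator]) -> Optional[List[str]]:
    """Breadth-first forward search; returns a shortest plan or None."""
    start = frozenset(initial)
    goal_facts = frozenset(goal)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal_facts <= state:          # every goal fact holds
            return steps
        for o in operators:
            if o.pre <= state:           # operator is applicable
                nxt = (state - o.delete) | o.add
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [o.name]))
    return None                          # goal unreachable from this state


# Initial cellular state and goal state (illustrative assumptions).
INITIAL = {"EGF", "EGFR", "Grb2_SOS", "Ras_GDP", "Raf", "MEK", "ERK"}
GOAL = {"ERK_P"}

pathway = plan(INITIAL, GOAL, OPERATORS)
print(" -> ".join(pathway) if pathway else "no pathway found")
```

The returned plan is an ordered list of operator names, which is read as a hypothetical pathway. Removing a required species from the initial state (for example, Raf) makes the goal unreachable, and the search returns `None`; this corresponds to a qualitative query about whether a pathway exists under the given conditions.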