Rohit Joshi, Xiaoli Li, Sreeram Ramachandaran, and Tze Yun Leong
Bayesian Networks and Influence Diagrams are effective methods for structuring clinical problems. Constructing a relevant structure, even without the numerical probabilities, is itself a challenging task. The rapid rate of innovation and new findings in the biomedical domain makes constructing a relevant graphical model more challenging still. Building a model structure from text with minimal intervention from domain experts and minimal training examples has long been a challenge for researchers. Numerous advances in the biomedical domain may now make this goal achievable. We are building a general-purpose system that automatically extracts the model structure from scientific articles by combining ontological knowledge and data mining with natural language processing. This paper discusses our prototype system. Previous systems have used keyword features to extract knowledge from text. We, like Blake et al., argue that the choice of features used to represent a domain has a profound effect on the quality of the model produced. Our system uses concepts and semantic types rather than keywords. We map complete sentences in the medical text to a conceptual level and a semantic level. We then use Association Rule Mining (ARM) to extract relationships from the text. The resulting rules are filtered and verified to improve their precision. We present preliminary results for the colorectal cancer domain, which suggest the feasibility of our approach.
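To make the rule-mining step concrete, the following is a minimal sketch of association rule mining over sentences that have already been mapped to sets of concepts. It is an illustration under stated assumptions, not the prototype's implementation: the function name `mine_pair_rules`, the support and confidence thresholds, the restriction to rules between pairs of concepts, and the toy concept sets are all hypothetical and do not come from our data or results.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) tuples for
    one-to-one rules that meet both thresholds.

    Each transaction is the set of concepts found in one sentence.
    Hypothetical helper: names and thresholds are illustrative only.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of sentences containing both concepts
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Invented toy input: one concept set per sentence (not real extracted data).
sentences = [
    {"colorectal cancer", "APC gene", "polyp"},
    {"colorectal cancer", "APC gene"},
    {"colorectal cancer", "smoking"},
    {"APC gene", "polyp"},
]

rules = mine_pair_rules(sentences, min_support=0.5, min_confidence=0.8)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} "
          f"(support={support:.2f}, confidence={confidence:.2f})")
```

In this sketch the confidence threshold plays the role of a first filtering pass: a pair of concepts that co-occurs often enough only yields a rule in the direction where the antecedent reliably implies the consequent. Mining over concepts or semantic types rather than keywords only changes what the items in each set are.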