Predicting Chemical Carcinogenesis in Rodents with Artificial Neural Networks and Symbolic Rules Extracted from Trained Networks

Brian A. Stone, Dennis Bahler

We train artificial neural networks to predict the results of long-term rodent bioassays using data collected by the National Toxicology Program (NTP). The data set covers 744 individual experiments and consists of Salmonella mutagenicity assay results; subchronic pathology data; information on route, strain, and sex/species; physical-chemical parameters; and structural alerts. First, we devised an automated method to reduce the set of over 2800 possible attributes of these experiments to the 74 attributes that can be shown to be most relevant to this prediction task. Second, using these attributes, we generated a trained neural network model with a cross-validated accuracy of 89.23% on unseen data. Third, we extracted a list of 22 M-of-N rules that are readable by humans and that explain the knowledge learned by the trained network. Furthermore, the cross-validated accuracy of the rule set is within 2.5% of that of the full network model. These results contribute to the ongoing process of evaluating and interpreting the data collected from chemical toxicity studies.
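
For readers unfamiliar with the M-of-N rule form mentioned in the abstract, the following is a minimal sketch of how such a rule is evaluated: the rule fires when at least M of its N listed conditions hold for an example. The function name `m_of_n` and the attribute names (`attr_a`, `attr_b`, `attr_c`) are hypothetical placeholders chosen for illustration; they are not taken from the paper's rule set or attribute list.

```python
from typing import Mapping, Sequence


def m_of_n(m: int, conditions: Sequence[str], example: Mapping[str, bool]) -> bool:
    """Return True when at least m of the named conditions are true in example.

    Conditions missing from the example are treated as false.
    """
    satisfied = sum(1 for name in conditions if example.get(name, False))
    return satisfied >= m


# Hypothetical boolean attributes, for illustration only.
rule_conditions = ["attr_a", "attr_b", "attr_c"]
example = {"attr_a": True, "attr_b": False, "attr_c": True}

# A "2 of 3" rule fires here because two of the three conditions hold.
fires_2_of_3 = m_of_n(2, rule_conditions, example)

# A "3 of 3" rule (a plain conjunction) does not fire for the same example.
fires_3_of_3 = m_of_n(3, rule_conditions, example)
```

Setting M to 1 gives a disjunction and setting M to N gives a conjunction, which is one reason this rule form can summarize threshold-like behavior compactly.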
