Haiying Wang, Huiru Zheng, Francisco Azuaje
Their ability to learn from data and to improve performance through incremental learning makes self-adaptive neural networks (SANNs) a powerful tool to support knowledge discovery. However, the development of SANNs has traditionally focused on data domains that are assumed to be modeled by a Gaussian distribution. The analysis of data governed by other statistical models, such as the Poisson distribution, has received less attention from the data mining community. Based on special considerations of the statistical nature of data following a Poisson distribution, this paper introduces a SANN, the Poisson-based Self-Organizing Tree Algorithm (PSOTA), which implements novel similarity matching criteria and neuron weight adaptation schemes. It was tested on synthetic and real-world data (serial analysis of gene expression data). PSOTA-based data analysis supported the automated identification of more meaningful clusters. Visualizing the dendrograms generated by PSOTA also highlighted complex inter- and intra-cluster relationships encoded in the data and made them readily understandable. This study indicates that, in comparison to the traditional Self-Organizing Tree Algorithm (SOTA), PSOTA offers significant improvements in pattern discovery and visualization in data modeled by the Poisson distribution, such as serial analysis of gene expression data.
Subjects: 14. Neural Networks; 12. Machine Learning and Discovery
Submitted: Oct 6, 2006