Towards Semantics-Enabled Distributed Cyberinfrastructure for Knowledge Acquisition

Vasant Honavar, Doina Caragea

We present a sufficient-statistics-based framework for learning predictive models from semantically disparate, distributed data. The proposed approach yields provably exact algorithms (relative to their centralized counterparts) for learning classifiers from distributed data, and it lends itself to adaptation to settings where the data reside in databases with disparate schemas and data semantics. The resulting algorithms are being implemented as part of INDUS, an open-source suite of software for knowledge acquisition from large, distributed, semantically disparate data sources.
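To make the notion of exactness relative to a centralized counterpart concrete, the following minimal sketch uses Naive Bayes as an illustrative learner, since its sufficient statistics are simple counts. It assumes horizontally partitioned data (each site holds complete rows over the same attributes) and a shared vocabulary of attribute values. The function names `local_counts` and `merge_counts` and the toy data are hypothetical and are not taken from INDUS.

```python
from collections import Counter

def local_counts(rows):
    """Compute Naive Bayes sufficient statistics at one site.

    Each row is a tuple of attribute values followed by a class label.
    Returns class counts and (attribute index, value, class) counts.
    """
    class_counts = Counter()
    joint_counts = Counter()
    for *features, label in rows:
        class_counts[label] += 1
        for i, value in enumerate(features):
            joint_counts[(i, value, label)] += 1
    return class_counts, joint_counts

def merge_counts(per_site_stats):
    """Combine per-site statistics by summing counts; no raw rows are moved."""
    total_class, total_joint = Counter(), Counter()
    for class_counts, joint_counts in per_site_stats:
        total_class.update(class_counts)
        total_joint.update(joint_counts)
    return total_class, total_joint

# Hypothetical toy data held at two separate sites.
site_a = [("sunny", "hot", "no"), ("rainy", "mild", "yes")]
site_b = [("sunny", "mild", "yes"), ("rainy", "hot", "no"), ("sunny", "hot", "no")]

# Statistics gathered site by site, then merged.
distributed = merge_counts([local_counts(site_a), local_counts(site_b)])

# Statistics computed as if all rows had been pooled in one place.
centralized = local_counts(site_a + site_b)

assert distributed == centralized
```

Because the merged counts are identical to the pooled counts, any classifier computed from them matches the one learned centrally. Handling sites with differing schemas or semantics would additionally require mapping each site's values into a common vocabulary before counting, which this sketch does not attempt.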

Subjects: 12. Machine Learning and Discovery; 10. Knowledge Acquisition

Submitted: Feb 12, 2008
