Why the Data Train Needs Semantic Rails

  • Krzysztof Janowicz (University of California, Santa Barbara)
  • Frank van Harmelen (Vrije Universiteit Amsterdam)
  • James A. Hendler (Rensselaer Polytechnic Institute)
  • Pascal Hitzler (Wright State University)

Abstract

While catchphrases such as big data, smart data, data-intensive science, or smart dust highlight different aspects, they share a common theme: a shift towards a data-centric perspective in which the synthesis and analysis of data at an ever-increasing spatial, temporal, and thematic resolution promises new insights while, at the same time, reducing the need for strong domain theories as starting points. In terms of the envisioned methodologies, these catchphrases tend to emphasize the role of predictive analytics, that is, statistical techniques including data mining and machine learning, as well as supercomputing. Interestingly, however, while this perspective takes the availability of data as a given, it does not answer the questions of how one would discover the required data in today’s chaotic information universe, how one would understand which datasets can be meaningfully integrated, and how one would communicate the results to humans and machines alike. The semantic web addresses these questions. In the following, we argue why the data train needs semantic rails. We point out that making sense of data and gaining new insights works best if inductive and deductive techniques go hand in hand instead of competing over the prerogative of interpretation.
Published: 2015-03-25
Section: Articles