AAAI Publications, Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

SlimShot: Probabilistic Inference for Web-Scale Knowledge Bases
Eric Gribkoff, Dan Suciu

Last modified: 2016-03-29


Increasingly large knowledge bases (KBs) are being created by crawling the Web or other document corpora and extracting facts and relations using machine learning techniques. To manage the uncertainty in the data, these KBs rely on probabilistic engines based on Markov Logic Networks (MLNs), for which probabilistic inference remains a major challenge. Today's state-of-the-art systems reduce the task of inference to weighted model counting and use an MCMC algorithm wrapped around SampleSAT to generate approximately uniform samples. This approach offers no theoretical error guarantees and, as we show, suffers from poor performance in practice. In this paper we describe SlimShot (Scalable Lifted Inference and Monte Carlo Sampling Hybrid Optimization Technique), a probabilistic inference engine for web-scale knowledge bases. SlimShot converts the MLN to a tuple-independent probabilistic database, then uses simple Monte Carlo-based inference with three key enhancements: (1) it combines sampling with safe query evaluation, (2) it estimates a conditional probability by jointly computing the numerator and denominator, and (3) it adjusts the proposal distribution based on the sample cardinality. In combination, these three techniques allow us to give formal error guarantees, and we demonstrate empirically that SlimShot outperforms today's state-of-the-art probabilistic inference engines used in knowledge bases.
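
Enhancement (2) can be illustrated with a minimal sketch. This is not SlimShot's implementation: it is plain Monte Carlo over a toy tuple-independent database, and it leaves out safe query evaluation and the adjusted proposal distribution. The function name `estimate_conditional`, its parameters, and the toy tuples `a` and `b` are all hypothetical. The only point it shows is that the numerator P(Q and C) and the denominator P(C) of a conditional probability P(Q | C) can be estimated from the same sampled worlds.

```python
import random


def estimate_conditional(tuple_probs, query, constraint, n_samples=20000, seed=0):
    """Estimate P(query | constraint) over a tuple-independent database.

    tuple_probs maps each tuple to its independent marginal probability.
    query and constraint are predicates over a sampled world (a set of tuples).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    numerator = 0    # worlds satisfying both the constraint and the query
    denominator = 0  # worlds satisfying the constraint
    for _ in range(n_samples):
        # Sample one possible world: each tuple is included independently.
        world = {t for t, p in tuple_probs.items() if rng.random() < p}
        if constraint(world):
            denominator += 1
            if query(world):
                numerator += 1
    # Both counts come from the same samples, so their ratio estimates
    # the conditional probability directly.
    return numerator / denominator if denominator else float("nan")


# Toy example: two tuples, each present with probability 0.5.
probs = {"a": 0.5, "b": 0.5}
estimate = estimate_conditional(
    probs,
    query=lambda world: "a" in world,
    constraint=lambda world: "a" in world or "b" in world,
)
print(estimate)
```

In this toy example the exact answer is P(a) / P(a or b) = 0.5 / 0.75 = 2/3, so the printed estimate should be close to that value.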
