AAAI Publications, Second AAAI Conference on Human Computation and Crowdsourcing

Crowdsourcing the Extraction of Data Practices from Privacy Policies
Florian Schaub, Travis D Breaux, Norman Sadeh

Last modified: 2014-09-05


Website and mobile application privacy policies are intended to describe a system’s data practices. However, they are often written in non-standard formats and contain ambiguities that make them difficult for users to read and comprehend. We propose a crowdsourcing approach that extracts data practices from privacy policies, both to provide users with more concise and usable privacy notices and to support the analysis of stated data practices. To that end, we designed a hierarchical task workflow for the extraction. We discuss the workflow design and report preliminary results.


privacy; privacy policies; web privacy; crowdsourcing; extraction; data practices


