Prior research shows that only a tiny percentage of users actually read the online privacy policies they implicitly agree to when using a website. Prior research also suggests that users ignore privacy policies because they are lengthy and, on average, require two years of college education to comprehend. We propose a novel technique that tackles this problem by automatically extracting summaries of online privacy policies. We use data mining models to analyze the text of privacy policies and answer ten basic questions concerning the privacy and security of user data, what information is gathered from users, and how this information is used. To train the data mining models, we thoroughly study the privacy policies of 400 companies (10% of all listings on the NYSE, Nasdaq, and AMEX stock markets) across industries. Our free Chrome browser extension, PrivacyCheck, uses the data mining models to summarize any HTML page that contains a privacy policy. PrivacyCheck stands out from currently available counterparts because it is readily applicable to any online privacy policy. Cross-validation results show that PrivacyCheck summaries are accurate 40% to 73% of the time. Over 400 independent Chrome users are currently using PrivacyCheck.
Access Publication: PrivacyCheck- Automatic Summarization of Privacy Policies Using Data Mining.pdf