Center Project Reports

A Study of Web Privacy Policies Across Industries

Author(s): Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on Apr 13, 2018

Today, more than ever, companies collect their customers’ Personally Identifiable Information (PII) over the Internet. The alarming rate of PII misuse drives the need for improving companies’ privacy practices. We thoroughly study privacy policies of 600 companies (10% of all listings on NYSE, Nasdaq, and AMEX stock markets) across industries and investigate ten different privacy-pertinent factors in them. The study reveals interesting trends: for example, more than 30% of the companies still lack privacy policies, and the rest tend to collect users’ information but claim to use it only for the intended purpose. Furthermore, almost one out of every two companies provides the collected information to law enforcement without asking for a warrant or subpoena. We found that the majority of the companies do not collect children’s PII, one out of every three companies lets users correct their PII but does not allow complete deletion, and the majority post new policies online and expect the user to check the privacy policy frequently. The findings of this study can help companies improve their privacy policies, enable lawmakers to create better regulations and evaluate their effectiveness, and finally educate users with respect to the current state of privacy practices in an industry.

2017 ITAP Report

Author(s): The Center for Identity
Published on Apr 20, 2017

The Identity Threat Assessment and Prediction (ITAP) model provides unique, research-based insights into the habits and methods of identity threats and into the various factors associated with higher levels of risk for PII compromise and abuse. ITAP uncovers the identity attributes most vulnerable to theft, assesses their importance, and determines the personally identifiable information (PII) most frequently targeted by thieves and fraudsters.

An Empirical Study of the Level Of Agreement Between Social Media Users' Perceived and Actual Privacy Settings

Author(s): Randolph G Bias
Published on Nov 1, 2016

Motivated by assertions in the popular press and research literature about social media applications’ intentional or unintentional obfuscation of their privacy settings, we set out to investigate empirically how well users’ perceived privacy settings match their actual ones. In Study 1, a crowd-sourced survey asked 700 people about their use of five social media applications (Facebook, Twitter, Google+, Instagram, and Pinterest). Respondents claimed to adjust their privacy settings on most of these “occasionally.” Except for Pinterest, whose privacy settings people tended not to change, respondents were confident they knew where the privacy settings were (between 76% and 87% saying they were “confident” or “strongly confident”) and confident that their own settings matched their intentions (between 68% and 81% saying they were “confident” or “strongly confident”).

PrivacyCheck: Automatic Summarization of Privacy Policies Using Data Mining

Author(s): Razieh Nokhbeh Zaeem, Rachel German, K. Suzanne Barber
Published on Aug 14, 2016

Research shows that only a tiny percentage of users actually read the online privacy policies we all implicitly agree to when using a website. It also suggests that users ignore privacy policies because they are lengthy and, on average, require two years of college education to comprehend. We propose a novel technique that tackles this problem by automatically extracting graphical summaries of online privacy policies. We use data mining models to analyze the text of privacy policies and answer ten basic questions concerning the privacy and security of user data, what information is gathered from them, and how this information is used.

In order to train the data mining models, we thoroughly study privacy policies of 400 companies (7% of all listings on NYSE, Nasdaq, and AMEX stock markets) across industries. Our free Chrome browser extension, PrivacyCheck, utilizes the data mining models to summarize any HTML page that contains a privacy policy. PrivacyCheck stands out from currently available counterparts because it is readily applicable to any online privacy policy. Experimental results show that PrivacyCheck summaries are accurate 60% of the time. Over 350 independent Chrome users are currently using PrivacyCheck.

Risk Kit: Highlighting Vulnerable Identity Assets for Specific Age Groups

Author(s): Razieh Nokhbeh Zaeem, Monisha Manoharan, K. Suzanne Barber
Published on Aug 14, 2016

Identity theft is perhaps the defining crime of the information age. Identity theft threatens various demographics, but some age groups, e.g., senior citizens, are particularly vulnerable. In this paper, we study how identity theft puts different personally identifiable information (PII) assets at risk of exposure, and how this risk changes throughout one’s lifecycle. We categorize PII assets, introducing a novel fourth category, measure their exposure risk using the Identity Theft Assessment and Prediction (ITAP) repository of over 3,000 identity theft cases, and track the risk change throughout an individual’s lifecycle. We introduce the concept of PII Balance Sheets™, and finally, we present a free, publicly available Android app that demonstrates our research results. This app not only educates individuals and highlights their vulnerable identity assets, but is also useful when they decide whether or not to share their personally identifiable information.

Keystroke Analytics for Non-Invasive Diagnosis of Neurodegenerative Disease

Author(s): Andrew Ellington, Tim Riedel, Dan Winkler, Emily Knight
Published on Aug 14, 2016

We sought to use the temporal dynamics of keyboarding during natural computer typing as an indicator of identity and health status. We first developed novel keystroke logging software in Python. We then analyzed the hold times (time between keydown and keyup for each key) and flight times (time between subsequent key presses) for two healthy individuals and found that 1) hold times and flight times differ significantly (p < 0.001) between these individuals, and 2) hold times for an individual are consistent across different times of day and different days. We then acquired typing data from patients with Parkinson’s disease (PD) (n=16) and from elderly controls (n=15) who typed for 15 minutes each, copying the same passage of text. None of the features we tested correlated with PD motor symptom severity.  This study represents a proof-of-principle that keyboarding has promise for both identity and diagnostic purposes.
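The two timing features defined above can be illustrated with a minimal Python sketch. It is not the authors' logging software: the event format (timestamp, key, "down" or "up"), the function name `hold_and_flight_times`, and the reading of flight time as the interval between consecutive keydown events are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical event format: (timestamp in seconds, key label, "down" or "up").
Event = Tuple[float, str, str]


def hold_and_flight_times(events: List[Event]) -> Tuple[List[float], List[float]]:
    """Return (hold_times, flight_times) for a stream of key events.

    Hold time: keyup timestamp minus keydown timestamp for the same key.
    Flight time: interval between consecutive keydown events (assumed
    interpretation of "time between subsequent key presses").
    """
    hold_times: List[float] = []
    flight_times: List[float] = []
    pending_down: Dict[str, float] = {}  # key -> timestamp of its unmatched keydown
    last_down: Optional[float] = None    # timestamp of the previous keydown

    for timestamp, key, kind in sorted(events, key=lambda e: e[0]):
        if kind == "down":
            if last_down is not None:
                flight_times.append(timestamp - last_down)
            last_down = timestamp
            pending_down[key] = timestamp
        elif kind == "up" and key in pending_down:
            hold_times.append(timestamp - pending_down.pop(key))

    return hold_times, flight_times


if __name__ == "__main__":
    sample = [
        (0.0, "h", "down"), (0.25, "h", "up"),
        (0.5, "i", "down"), (0.75, "i", "up"),
    ]
    holds, flights = hold_and_flight_times(sample)
    print(holds, flights)
```

Per-person distributions of these two lists could then be compared with a standard two-sample statistical test; the abstract does not say which test the study used.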

A Computational Movement Analysis Framework for Exploring Anonymity in Human Mobility Trajectories

Author(s): Jennifer Miller
Published on Aug 14, 2016

The unicity study has particularly important implications for privacy and the increasing availability of ‘anonymized’ trajectory datasets. This is one of the first studies to explore unicity and anonymity with higher-resolution GPS data, and it should be troubling how unique a set of two location points can be. Decreasing the spatial and temporal resolution reduces the unicity, but five points with x,y coordinates at the coarsest resolution tested here were still uniquely associated with a single trajectory more than 60% of the time.
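As a rough illustration of the measure discussed above, the sketch below estimates unicity as the fraction of trajectories that are the only match for k points sampled from them. This is a common way to define unicity and is assumed here; the report's actual procedure, data format, and resolutions are not given in the abstract. The names `coarsen` and `unicity`, the grid-cell representation, and the parameters are hypothetical.

```python
import random
from typing import List, Set, Tuple

# A point after coarsening: (x cell index, y cell index, time bin index).
Point = Tuple[int, int, int]


def coarsen(x: float, y: float, t: float, cell_size: float, time_bin: float) -> Point:
    """Snap a raw (x, y, t) observation to a spatial grid cell and a time bin."""
    return (int(x // cell_size), int(y // cell_size), int(t // time_bin))


def unicity(trajectories: List[Set[Point]], k: int, seed: int = 0) -> float:
    """Estimate the fraction of trajectories uniquely identified by k of their points.

    For each trajectory with at least k points, draw k of its points at random
    and count how many trajectories in the dataset contain all of them. The
    trajectory counts as unique when it is the only match.
    """
    rng = random.Random(seed)
    eligible = [traj for traj in trajectories if len(traj) >= k]
    if not eligible:
        return 0.0

    unique = 0
    for traj in eligible:
        sample = set(rng.sample(sorted(traj), k))
        matches = sum(1 for other in trajectories if sample <= other)
        if matches == 1:
            unique += 1
    return unique / len(eligible)
```

Larger values of `cell_size` and `time_bin` merge more raw observations into the same coarsened point, which is why lowering the resolution tends to reduce unicity.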

Economic Machine Learning for Fraud Detection

Author(s): Maytal Saar-Tsechansky
Published on Aug 14, 2016

Our results demonstrate that evaluating the expected improvement in performance yields an ability to select generally good acquisitions in a cost-effective manner. Several policies yield comparable performance. These results suggest that our policies are able to identify acquisition costs that yield the labeling quality needed to produce the desired improvement in performance.

Chaotic Hybrid Encryption Communication Kit

Author(s): Benito Fernandez, Jose Capriles, Carlos Garcia
Published on Aug 14, 2016

In this research, we have successfully developed a software-only prototype of the technology that could be used (as-is) as a potential product to be beta-tested within UT. In addition to the prototype, the long-term vision of this project includes hardware components inside a secure hardware-software architecture. For instance, the architecture could be included inside communications devices currently available in the market (e.g., iPhone) in order to offer a new option for data protection.

The roadmap also includes the following features:

  • An “infinite key” generation algorithm.
  • Software-reconfigurable hardware.
  • On-demand selection of chaotic oscillators and mixed-signal processes.
  • Cloud-enabled service.

Since our solution ultimately will be implemented in hardware, it will execute faster than current techniques of similar complexity. The several layers of complexity offered by this invention may thwart an attack or convert it into an ineffective endeavor. This research has delivered a crypto-tool that uses keys generated by an integrated piece of software that simulates chaotic oscillators with mixed-signal integrators.

Identity Theft Assessment and Prediction

Author(s): Jennifer Miller, K. Suzanne Barber
Published on Aug 10, 2016

The ITAP, or Identity Theft Assessment and Prediction Tool, allows computational representation and quantitative measurement of a fraudster’s behaviors in order to better understand them and, ultimately, to make connections among those behaviors and visualize their patterns. How are fraudsters accessing information, i.e., through what vulnerabilities? What tools are they using in order to overcome security hurdles? What steps are they taking? Behavior patterns and trends identified in the ITAP will answer these and many more questions, including the entry points fraudsters use, the vulnerabilities they exploit, and the consequences of their actions.
