Center Project Reports

2017 ITAP Report

Author(s): The Center for Identity
Published on Apr 20, 2017

PrivacyCheck: Automatic Summarization of Privacy Policies Using Data Mining

Author(s): Razieh Nokhbeh Zaeem, Rachel German, K. Suzanne Barber
Published on Aug 14, 2016

Research shows that only a tiny percentage of users actually read the online privacy policies we all implicitly agree to when using a website. It also suggests that users ignore privacy policies because they are lengthy and, on average, require two years of college education to comprehend. We propose a novel technique that tackles this problem by automatically extracting graphical summaries of online privacy policies. We use data mining models to analyze the text of privacy policies and answer ten basic questions concerning the privacy and security of user data, what information is gathered from them, and how this information is used.

To train the data mining models, we thoroughly study the privacy policies of 400 companies (7% of all listings on the NYSE, Nasdaq, and AMEX stock markets) across industries. Our free Chrome browser extension, PrivacyCheck, uses the data mining models to summarize any HTML page that contains a privacy policy. PrivacyCheck stands out from currently available counterparts because it is readily applicable to any online privacy policy. Experimental results show that PrivacyCheck summaries are accurate 60% of the time. Over 350 independent Chrome users are currently using PrivacyCheck.

Risk Kit: Highlighting Vulnerable Identity Assets for Specific Age Groups

Author(s): Razieh Nokhbeh Zaeem, Monisha Manoharan, K. Suzanne Barber
Published on Aug 14, 2016

Identity theft is perhaps the defining crime of the information age. Identity theft threatens various demographics, but some age groups, e.g., senior citizens, are particularly vulnerable. In this paper, we study how identity theft puts different personally identifiable information (PII) assets at risk of exposure, and how this risk changes throughout one’s lifecycle. We categorize PII assets, introducing a novel fourth category, measure their exposure risk using the Identity Theft Assessment and Prediction (ITAP) repository of over 3,000 identity theft cases, and track how the risk changes throughout an individual’s lifecycle. We introduce the concept of PII Balance Sheets™, and finally, we present a free, publicly available Android app that demonstrates our research results. This app not only educates individuals and highlights their vulnerable identity assets, but is also useful when they decide whether or not to share their personally identifiable information.

Keystroke Analytics for Non-Invasive Diagnosis of Neurodegenerative Disease

Author(s): Andrew Ellington, Tim Riedel, Dan Winkler, Emily Knight
Published on Aug 14, 2016

We sought to use the temporal dynamics of keyboarding during natural computer typing as an indicator of identity and health status. We first developed novel keystroke logging software in Python. We then analyzed the hold times (time between keydown and keyup for each key) and flight times (time between subsequent key presses) for two healthy individuals and found that 1) hold times and flight times differ significantly (p < 0.001) between these individuals, and 2) hold times for an individual are consistent across different times of day and different days. We then acquired typing data from patients with Parkinson’s disease (PD) (n=16) and from elderly controls (n=15), each of whom typed for 15 minutes, copying the same passage of text. None of the features we tested correlated with PD motor symptom severity. This study represents a proof of principle that keyboarding has promise for both identity and diagnostic purposes.
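The two features defined in this abstract can be illustrated with a minimal sketch. This is not the authors' logging software: the `KeyEvent` record, the function names, and the sample timestamps are all hypothetical, and timestamps are assumed to be integer milliseconds. Flight time follows the definition given above (keydown to next keydown).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KeyEvent:
    """One key press (hypothetical record layout, times in milliseconds)."""
    key: str
    down: int  # keydown timestamp
    up: int    # keyup timestamp


def hold_times(events: List[KeyEvent]) -> List[int]:
    """Hold time: keyup minus keydown for each key."""
    return [e.up - e.down for e in events]


def flight_times(events: List[KeyEvent]) -> List[int]:
    """Flight time: gap between one keydown and the next keydown."""
    ordered = sorted(events, key=lambda e: e.down)
    return [b.down - a.down for a, b in zip(ordered, ordered[1:])]


# Made-up sample: three key presses.
events = [
    KeyEvent("h", 0, 80),
    KeyEvent("i", 250, 310),
    KeyEvent("!", 500, 600),
]

print(hold_times(events))
print(flight_times(events))
```

Per-person distributions of these two lists are the kind of input one would then compare statistically across individuals or sessions.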

A Computational Movement Analysis Framework for Exploring Anonymity in Human Mobility Trajectories

Author(s): Jennifer Miller
Published on Aug 14, 2016

The unicity study has particularly important implications for privacy and the increasing availability of ‘anonymized’ trajectory datasets. This is one of the first studies to explore unicity and anonymity with higher-resolution GPS data, and it should be troubling how unique a set of just two location points can be. Decreasing the spatial and temporal resolution reduces the unicity, but five points with x,y coordinates at the coarsest resolution tested here were still uniquely associated with a single trajectory more than 60% of the time.
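To make the measure concrete, the sketch below estimates unicity as the fraction of random p-point samples that match exactly one trajectory in a dataset. It is an illustration under stated assumptions, not the framework from the report: trajectories are assumed to be lists of integer `(x, y, t)` tuples, and the `unicity` and `coarsen` helpers, the grid-based coarsening, and the toy data are all hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[int, int, int]  # (x, y, t) on an integer grid (assumed layout)


def coarsen(trajectory: List[Point], factor: int) -> List[Point]:
    """Reduce spatial and temporal resolution by integer division."""
    return [(x // factor, y // factor, t // factor) for x, y, t in trajectory]


def unicity(trajectories: List[List[Point]], p: int,
            trials: int = 100, seed: int = 0) -> float:
    """Fraction of random p-point samples that match exactly one trajectory."""
    rng = random.Random(seed)
    point_sets = [set(t) for t in trajectories]
    unique = 0
    total = 0
    for points in point_sets:
        if len(points) < p:
            continue  # too few distinct points to draw a sample of size p
        candidates = sorted(points)
        for _ in range(trials):
            sample = rng.sample(candidates, p)
            matches = sum(1 for other in point_sets if other.issuperset(sample))
            unique += matches == 1
            total += 1
    return unique / total if total else 0.0


# Toy data: the first two trajectories share their first two points.
trajs = [
    [(0, 0, 0), (1, 0, 1), (2, 0, 2)],
    [(0, 0, 0), (1, 0, 1), (2, 1, 2)],
    [(5, 5, 0), (6, 5, 1), (7, 5, 2)],
]
print(unicity(trajs, p=3))
print(unicity([coarsen(t, 4) for t in trajs], p=1))
```

In this toy dataset, coarsening collapses the first two trajectories onto the same grid cell, so a single point no longer tells them apart, which mirrors the resolution effect described above.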

Economic Machine Learning for Fraud Detection

Author(s): Maytal Saar-Tsechansky
Published on Aug 14, 2016

Our results demonstrate that evaluating the expected improvement in performance yields an ability to select generally good acquisitions in a cost-effective manner. Several policies yield comparable performance. These results suggest that our policies are able to identify acquisition costs that yield the labeling quality needed to produce the desired improvement in performance.

Chaotic Hybrid Encryption Communication Kit

Author(s): Benito Fernandez, Jose Capriles, Carlos Garcia
Published on Aug 14, 2016

In this research, we have successfully developed a software-only prototype of the technology that could be used as-is as a potential product to be beta-tested within UT. In addition to the prototype, the long-term vision of this project includes hardware components inside a secure hardware-software architecture. For instance, the architecture could be included inside communications devices currently available on the market (e.g., iPhone) in order to offer a new option for data protection.

The roadmap also includes the following features:

  • An “infinite key” generation algorithm.
  • Software-reconfigurable hardware.
  • On-demand selection of Chaotic oscillators and mixed-signal processes.
  • Cloud-enabled service.

Since our solution ultimately will be implemented in hardware, it will execute faster than current techniques of similar complexity. The several layers of complexity offered by this invention may thwart an attack or convert it into an ineffective endeavor. This research has delivered a crypto-tool that uses keys generated by an integrated piece of software that simulates chaotic oscillators with mixed-signal integrators.
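The abstract does not say which oscillators or key schedule the prototype uses, so the following is only a toy illustration of the general idea of deriving key material from a simulated chaotic system. It swaps in the logistic map, a textbook chaotic recurrence, in place of the authors' oscillators; the function names, the parameter `r = 3.99`, and the burn-in length are assumptions, and the scheme is not cryptographically secure.

```python
def logistic_keystream(seed: float, n: int, r: float = 3.99,
                       burn_in: int = 100) -> bytes:
    """Generate n keystream bytes by iterating the logistic map x -> r*x*(1-x).

    Assumption: the seed in (0, 1) plays the role of the shared secret.
    """
    if not 0.0 < seed < 1.0:
        raise ValueError("seed must lie strictly between 0 and 1")
    x = seed
    for _ in range(burn_in):  # discard early iterates
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)  # quantize the state to one byte
    return bytes(out)


def xor_cipher(data: bytes, seed: float) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    keystream = logistic_keystream(seed, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


message = b"identity data"
ciphertext = xor_cipher(message, 0.123456)
recovered = xor_cipher(ciphertext, 0.123456)
print(recovered == message)
```

Because the keystream is a deterministic function of the seed, both parties regenerate it locally instead of exchanging a key as long as the message.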

Identity Theft Assessment and Prediction

Author(s): Jennifer Miller, K. Suzanne Barber
Published on Aug 10, 2016

The ITAP, or Identity Theft Assessment and Prediction Tool, allows computational representation and quantitative measurement of a fraudster’s behaviors in order to better understand them and, inevitably, to make connections and visualize patterns in those behaviors. How are fraudsters accessing information, i.e., through what vulnerabilities? What tools are they using to overcome security hurdles? What steps are they taking? Behavior patterns and trends identified in the ITAP will answer these and many more questions, including the entry points fraudsters use, the vulnerabilities they exploit, and the consequences of their actions.

Modelling and Analysis of Identity Threat Behaviors Through Text Mining of Identity Theft Stories

Author(s): Yongpeng Yang, Monisha Manoharan, K. Suzanne Barber
Published on Aug 8, 2016

Identity theft, fraud, and abuse are problems affecting all market sectors in society. Identity theft is often a gateway crime, as criminals use stolen or fraudulent identities to steal money, claim eligibility for services, hack into networks without authorization, and so on. The available data describing identity crimes and their aftermath is often in the form of recorded stories and reports by the news press, fraud examiners, and law enforcement. All of these sources are unstructured. Hence, in order to analyze identity theft data, this research proposes an approach which involves the collection of online news stories and reports on the topic of identity theft. Our approach preprocesses the raw text and extracts semi-structured information automatically, using text mining techniques. This paper presents statistical analysis of behavioral patterns and resources used by thieves and fraudsters to commit identity theft, including the identity attributes commonly linked to identity crimes, resources thieves employ to conduct identity crimes, and temporal patterns of criminal behavior. Analyses of these results increase empirical understanding of identity threat behaviors, offer early warning signs of identity theft, and thwart future identity theft crimes.

The Danger of Putting Your Digital Life in One Place

Published on Aug 8, 2016

Over ninety percent of all data in the world has been collected in the past few years alone.¹ As technology becomes increasingly integral to our lives, we create an incredible and unprecedented amount of personal data. There is more data than ever before and, consequently, more digital information about our lives.
