2017 ITAP Report
Author(s): The Center for Identity
Published on Apr 20, 2017
Research shows that only a tiny percentage of users actually read the online privacy policies we all implicitly agree to when using a website. It also suggests that users ignore privacy policies because they are lengthy and, on average, require two years of college education to comprehend. We propose a novel technique that tackles this problem by automatically extracting graphical summaries of online privacy policies. We use data mining models to analyze the text of privacy policies and answer ten basic questions concerning the privacy and security of user data, what information is gathered from them, and how this information is used.
Identity theft is perhaps the defining crime of the information age. Identity theft threatens various demographics, but some age groups, e.g., senior citizens, are particularly vulnerable. In this paper, we study how identity theft puts different personally identifiable information (PII) assets at risk of exposure, and how this risk changes throughout one’s lifecycle. We categorize PII assets, introducing a novel fourth category, measure their exposure risk using the Identity Theft Assessment and Prediction (ITAP) repository of over 3,000 identity theft cases, and track the risk change throughout an individual’s lifecycle. We introduce the concept of PII Balance Sheets™, and finally, we present a free, publicly available Android app that demonstrates our research results. This app not only educates individuals and highlights their vulnerable identity assets, but is also useful when they decide whether or not to share their personally identifiable information.
We sought to use the temporal dynamics of keyboarding during natural computer typing as an indicator of identity and health status. We first developed novel keystroke logging software in Python. We then analyzed the hold times (time between keydown and keyup for each key) and flight times (time between subsequent key presses) for two healthy individuals and found that 1) hold times and flight times differ significantly (p < 0.001) between these individuals, and 2) hold times for an individual are consistent across different times of day and different days. We then acquired typing data from patients with Parkinson’s disease (PD) (n=16) and from elderly controls (n=15) who typed for 15 minutes each, copying the same passage of text. None of the features we tested correlated with PD motor symptom severity. This study represents a proof-of-principle that keyboarding has promise for both identity and diagnostic purposes.
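The two timing features defined above can be illustrated with a minimal sketch. It assumes a hypothetical event format of `(timestamp_ms, key, kind)` tuples sorted by time, where `kind` is `"down"` or `"up"`; the function name and event layout are illustrative assumptions, not the logging software described in the abstract.

```python
def hold_and_flight_times(events):
    """Compute hold and flight times from keystroke events.

    events: list of (timestamp_ms, key, kind) tuples sorted by time,
            with kind either "down" or "up" (a hypothetical format).
    Returns (holds, flights), both lists of durations in milliseconds.
    """
    pending = {}   # key -> timestamp of its not-yet-released keydown
    downs = []     # timestamps of all keydown events, in order
    holds = []     # keydown -> keyup duration for each completed press

    for timestamp, key, kind in events:
        if kind == "down":
            pending[key] = timestamp
            downs.append(timestamp)
        elif kind == "up" and key in pending:
            holds.append(timestamp - pending.pop(key))

    # Flight time: interval between one key press and the next key press.
    flights = [later - earlier for earlier, later in zip(downs, downs[1:])]
    return holds, flights


# Typing "hi": 'h' is held for 80 ms, 'i' for 100 ms,
# and the second press comes 200 ms after the first.
sample_events = [
    (0, "h", "down"),
    (80, "h", "up"),
    (200, "i", "down"),
    (300, "i", "up"),
]
holds, flights = hold_and_flight_times(sample_events)
```

Per-person lists of hold and flight times produced this way could then be compared with a standard statistical test; the abstract does not say which test was used.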
The unicity study has particularly important implications for privacy and the increasing availability of ‘anonymized’ trajectory datasets. This is one of the first studies to explore unicity and anonymity with higher-resolution GPS data, and it should be troubling how unique a set of two location points can be. Decreasing the spatial and temporal resolution reduces the unicity, but five points with x, y coordinates at the coarsest resolution tested here were still uniquely associated with a single trajectory more than 60% of the time.
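As a rough illustration of how unicity can be estimated, the sketch below draws `p` points at random from each trajectory and checks how often they match exactly one trajectory in the dataset. The `(x, y, t)` point format, the `coarsen` helper, and the sampling scheme are assumptions made for this example; the study's own procedure may differ.

```python
import random


def coarsen(point, cell_size, time_window):
    """Snap an (x, y, t) point to a spatial grid cell and a time bin."""
    x, y, t = point
    return (int(x // cell_size), int(y // cell_size), int(t // time_window))


def unicity(trajectories, p, trials=100, seed=0):
    """Estimate the fraction of p-point samples that match one trajectory only.

    trajectories: list of lists of hashable points (e.g. coarsened tuples).
    p: number of points drawn from a trajectory in each trial.
    """
    rng = random.Random(seed)
    point_sets = [set(trajectory) for trajectory in trajectories]
    unique = 0
    total = 0
    for points in point_sets:
        if len(points) < p:
            continue  # too few distinct points to draw a sample of size p
        candidates = sorted(points)
        for _ in range(trials):
            sample = set(rng.sample(candidates, p))
            # Count every trajectory that contains all sampled points.
            matches = sum(1 for other in point_sets if sample <= other)
            if matches == 1:
                unique += 1
            total += 1
    return unique / total if total else 0.0


# Two trajectories that share a starting cell but then diverge.
raw = [
    [(12.5, 7.0, 30), (48.0, 7.5, 130)],
    [(14.0, 3.0, 10), (95.0, 2.0, 150)],
]
cells = [[coarsen(pt, cell_size=10, time_window=60) for pt in tr] for tr in raw]
both_points = unicity(cells, p=2)
```

Raising `cell_size` or `time_window` merges more points into the same cell, which is the mechanism by which coarser resolution lowers unicity.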
Our results demonstrate that evaluating the expected improvement in performance yields an ability to select generally good acquisitions in a cost-effective manner. Several policies yield comparable performance. These results suggest that our policies are able to identify acquisition costs that yield the labeling quality needed to produce the desired improvement in performance.
In this research, we have successfully developed a software-only prototype of the technology that could be used as-is as a potential product to be beta-tested within UT. In addition to the prototype, the long-term vision of this project includes hardware components inside a secure hardware-software architecture. For instance, the architecture could be included inside communications devices currently available on the market (e.g., iPhone) in order to offer a new option for data protection.
The roadmap also includes the following features:
Since our solution ultimately will be implemented in hardware, it will execute faster than current techniques of similar complexity. The several layers of complexity offered by this invention may thwart an attack or convert it into an ineffective endeavor. This research has delivered a crypto-tool that uses keys generated by an integrated piece of software that simulates chaotic oscillators with mixed-signal integrators.
The ITAP, or Identity Theft Assessment and Prediction Tool, allows computational representation and quantitative measurement of a fraudster’s behaviors in order to better understand them and, ultimately, to make connections and visualize patterns in those behaviors. How are fraudsters accessing information, i.e., through what vulnerabilities? What tools are they using in order to overcome security hurdles? What steps are they taking? Behavior patterns and trends identified in the ITAP will answer these and many more questions, including the entry points fraudsters use, the vulnerabilities they exploit, and the consequences of their actions.
Identity theft, fraud, and abuse are problems affecting all market sectors in society. Identity theft is often a gateway crime, as criminals use stolen or fraudulent identities to steal money, claim eligibility for services, hack into networks without authorization, and so on. The available data describing identity crimes and their aftermath is often in the form of recorded stories and reports by the news press, fraud examiners, and law enforcement. All of these sources are unstructured. Hence, in order to analyze identity theft data, this research proposes an approach which involves the collection of online news stories and reports on the topic of identity theft. Our approach preprocesses the raw text and extracts semi-structured information automatically, using text mining techniques. This paper presents statistical analysis of behavioral patterns and resources used by thieves and fraudsters to commit identity theft, including the identity attributes commonly linked to identity crimes, resources thieves employ to conduct identity crimes, and temporal patterns of criminal behavior. Analyses of these results increase empirical understanding of identity threat behaviors, offer early warning signs of identity theft, and thwart future identity theft crimes.
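The step of turning unstructured stories into countable attributes can be sketched in a few lines. This is a deliberately simple stand-in: the abstract does not describe its text mining techniques, so the phrase list `PII_TERMS`, the function name, and the substring matching used here are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical lexicon mapping phrases to identity-attribute labels.
PII_TERMS = {
    "social security number": "SSN",
    "credit card": "credit card",
    "password": "password",
    "driver's license": "driver's license",
}


def count_pii_mentions(stories):
    """Count how many stories mention each identity attribute.

    stories: iterable of raw news-story strings.
    Returns a Counter keyed by attribute label; each story counts at most
    once per attribute, however often the phrase appears in it.
    """
    counts = Counter()
    for story in stories:
        # Normalise case and whitespace before matching phrases.
        text = re.sub(r"\s+", " ", story.lower())
        for phrase, label in PII_TERMS.items():
            if phrase in text:
                counts[label] += 1
    return counts


sample_stories = [
    "Thieves stole a Social Security number and a password from the clinic.",
    "A stolen credit card and password were used to open new accounts.",
]
attribute_counts = count_pii_mentions(sample_stories)
```

Counts like these are the kind of semi-structured output on which statistics about commonly targeted attributes could be computed; a production pipeline would need a far richer lexicon or a trained extractor.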
Over ninety percent of all data in the world has been collected in the past few years alone.1 As technology becomes increasingly integral to our lives, we create an incredible and unprecedented amount of personal data. There is more data than ever before and, consequently, more digital information about our lives.