Center Project Reports

The Effect of the GDPR on Privacy Policies: Recent Progress and Future Promise

Author(s): Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on May 14, 2020

The General Data Protection Regulation (GDPR) is considered by some to be the most important change in data privacy regulation in 20 years. Effective May 2018, the European Union GDPR privacy law applies to any organization that collects and processes the personal information of EU citizens within or outside the EU. In this work, we seek to quantify the progress the GDPR has made in improving privacy policies around the globe. We leverage our data mining tool, PrivacyCheck, to automatically compare three corpora (totaling 550) of privacy policies, pre- and post-GDPR. In addition, to evaluate the current level of compliance with the GDPR around the globe, we manually studied the policies within two corpora (450 policies). We find that the GDPR has made progress in protecting user data, but more progress is necessary—particularly in the area of giving users the right to edit and delete their information—to entirely fulfill the GDPR’s promise. We also observe that the GDPR encourages sharing user data with law enforcement, and, as a result, many policies have facilitated such sharing after the GDPR. Finally, we see that, when there is non-compliance with the GDPR, it is often in the form of failing to explicitly indicate compliance, showing an organization’s lack of transparency and disclosure regarding their processing and protection of personal information. If Personally Identifiable Information (PII) is the “currency of the Internet”, these findings mark continued alarm regarding an individual’s agency to protect and secure their PII assets.

Comparing Privacy Policies of Government Agencies and Companies: A Study Using Privacy Policy Analysis Tools

Author(s): Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on May 14, 2020

Companies and government agencies are subject to distinct regulations that govern their collection and use of personally identifiable information. Yet, do privacy policies of companies and government agencies reflect this distinction? In this paper, we take advantage of two of the most recent automatic privacy policy analysis tools, Polisis and PrivacyCheck, and five corpora of over 800 privacy policies to answer this question. We discover that government agencies are considerably better at protecting (or, for that matter, not collecting) sensitive financial information, social security numbers, and user location. On the other hand, many of them fail to directly address children’s privacy or describe security measures taken to protect user data. Furthermore, we observe the positive effect of European regulation, such as the GDPR, on European government agencies. EU government agencies perform well with respect to notifying users of policy changes, giving users the right to edit or delete their data, and limiting data retention, all of which are GDPR tenets. Our work sheds light on the actual effect of regulating privacy policies, paves the way for lawmakers to improve such regulation, and assists the research community in enhancing the usability of privacy policies through studying their trends.

Identifying Real-World Credible Experts in the Financial Domain to Avoid Fake News

Author(s): Teng-Chieh Huang, Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on May 14, 2020

Establishing a solid mechanism for finding credible and trustworthy people in online social networks is an important first step in avoiding useless, misleading, or even malicious information. Social network users can hide their intentions or fabricate a virtual personality to gain the trust of others. There is a body of existing work studying the trustworthiness of social media users and finding credible sources in specific target domains. However, most of the related work lacks a connection between credibility in the real world and credibility on the Internet, which leaves the picture of social media credibility and trustworthiness incomplete. In this paper, working in the financial domain, we identify attributes that can distinguish credible users on the Internet who are indeed trustworthy experts in the real world. To ensure objectivity, we gather the list of credible financial experts from real-world financial authorities. By analyzing the distribution of attributes of social media users using a random forest classifier, we can find which attributes are related to real-world expertise and which attributes have a higher potential of being forged by malicious users.

Is Your Phone You? How Privacy Policies of Mobile Apps Allow the Use of Your Personally Identifiable Information

Author(s): Kai Chih Chang, Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on May 11, 2020

People continue to store their sensitive information in their smartphone applications, knowingly or, more often, unknowingly. Users seldom read an app’s privacy policy to see how their information is being collected, used, and shared. In this paper, using a reference list of over 600 Personally Identifiable Information (PII) attributes, we investigate the privacy policies of 100 popular health and fitness mobile applications in both the Android and iOS app markets to find the set of personal information these apps collect, use, and share. The reference list of PII was independently built from a longitudinal study at The University of Texas investigating thousands of identity theft and fraud cases, in which PII attributes and their associated value and risks were empirically quantified. This research leverages the reference PII list to identify and analyze the value of personal information collected by the mobile apps and the risk of disclosing this information. We found that the set of PII collected by these mobile apps covers 35% of the entire reference set of PII and that, due to dependencies between PII attributes, these mobile apps have a likelihood of indirectly impacting 70% of the reference PII if breached. For one specific app, we discovered that the monetary loss could reach $1M if the set of sensitive data it collects is breached. Finally, we utilize Bayesian inference to measure the risks of a set of PII gathered by apps: the probability that fraudsters can discover, impersonate, and cause harm to the user by misusing only the PII the mobile apps collected.
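
The distinction between directly collected PII and PII that is indirectly impacted through attribute dependencies can be illustrated with a small sketch. This is not the authors' method: the reference list, the dependency edges, and the function names below are all hypothetical, and plain graph reachability is only one simple reading of "indirectly impacting".

```python
from collections import deque

# Hypothetical reference list of PII attributes (the real list has over 600).
REFERENCE_PII = [
    "name", "email", "phone", "address", "dob",
    "ssn", "password", "bank_account", "health_record", "location",
]

# Hypothetical dependencies: exposing the key makes each listed attribute
# easier to obtain. These edges are illustrative only.
DEPENDS = {
    "email": ["password"],
    "password": ["bank_account"],
    "phone": ["name"],
    "location": ["address"],
}

def direct_coverage(collected):
    """Fraction of the reference list that an app collects directly."""
    return len(set(collected) & set(REFERENCE_PII)) / len(REFERENCE_PII)

def impacted(collected):
    """All attributes reachable from the collected set via dependencies."""
    seen = set(collected)
    queue = deque(collected)
    while queue:
        attribute = queue.popleft()
        for neighbor in DEPENDS.get(attribute, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

collected = ["email", "phone", "location"]
direct = direct_coverage(collected)
indirect = len(impacted(collected)) / len(REFERENCE_PII)
print(f"direct: {direct:.0%}, direct plus indirect: {indirect:.0%}")
```

In this toy setup, three collected attributes out of ten give 30% direct coverage, while following the dependency edges raises the impacted share to 70%.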

An Assessment of Blockchain Identity Solutions: Minimizing Risk and Liability of Authentication

Author(s): Rima Rana, Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on Aug 14, 2019

Personally Identifiable Information (PII) is often used to perform authentication and acts as a gateway to personal and organizational information. One weak link in the architecture of identity management services is sufficient to cause exposure and put identity at risk. Recently, we have witnessed a shift in identity management solutions with the growth of blockchain. Blockchain, the decentralized ledger system, provides a unique answer addressing security and privacy with its embedded immutability. In a blockchain-based identity solution, the user is given control of his/her identity by storing personal information on his/her device and choosing the identity verification document later used to create blockchain attestations. Yet blockchain technology alone is not enough to produce a better identity solution. The user cannot make informed decisions as to which identity verification document to choose if he/she is not presented with tangible guidelines. In the absence of scientifically created practical guidelines, these solutions and the choices they offer may become overwhelming and even defeat the purpose of providing a more secure identity solution.

"Understanding victim-enabled identity theft," D. Lacey, J. Zaiss and K. S. Barber, 2016 14th Annual Conference on Privacy, Security and Trust (PST), Auckland, 2016, pp. 196-202.

Author(s): David Liau, Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on Aug 14, 2019

Today, more than ever, everyday authentication processes involve combinations of Personally Identifiable Information (PII) to verify a person’s identity. Meanwhile, the number of identity thefts is increasing dramatically compared to past decades. In response, numerous privacy protection regulations, management frameworks, and companies have flourished in the industry. In this paper, we leverage previous work on the Identity Ecosystem, a Bayesian network mathematical representation of a person’s identity, to create a framework for evaluating identity protection systems. After reviewing the Identity Ecosystem, we populate a dynamic version of it and propose a protection game for a person’s PII, given that the owner and the attacker both gain some level of control over the status of other PII within the dynamic Identity Ecosystem. We first present the game concept as a single-round game with complete information. Then we formulate a stochastic shortest path game between the owner and the attacker on the dynamic Identity Ecosystem. The attacker tries to expose the target PII as soon as possible, while the owner tries to protect the target PII from being exposed. We present a policy iteration algorithm to solve for the optimal policy of the game and discuss its convergence. Finally, we evaluate and compare identity protection strategies by playing an optimal policy against different protection policies. This study aims to understand the evolutionary process of identity theft and to provide a framework for evaluating different identity protection strategies.

Statistical Analysis of Identity Risk of Exposure and Cost Using the Ecosystem of Identity Attributes

Author(s): Chia-Ju Chen, Razieh Nokhbeh Zaeem, K. Suzanne Barber
Published on Aug 14, 2019

Personally Identifiable Information (PII) is often called the “currency of the Internet” as identity assets are collected, shared, sold, and used for almost every transaction on the Internet. PII is used for all types of applications, from access control to credit score calculations to targeted advertising. Every market sector relies on PII to know and authenticate its customers and employees. With so many businesses and government agencies relying on PII to make important decisions and so many people being asked to share personal data, it is critical to better understand the fundamentals of identity in order to protect it and use it responsibly. The previously developed, comprehensive UT CID Identity Ecosystem uses graphs to model PII assets and their relationships and is populated with empirical data from almost 6,000 real-world identity theft and fraud news reports. We obtained the UT CID Identity Ecosystem from its authors and analyzed it using graph theory. We report numerous novel statistics on identity asset content, structure, value, accessibility, and impact. Our work sheds light on how identity is used and paves the way for improving identity

2019 ITAP Report

Published on Jul 30, 2019

The Identity Threat Assessment and Prediction (ITAP) model and analytics provide unique, research-based insights into the habits and methods associated with identity threats, and into the various factors that contribute to higher levels of risk for the compromise and abuse of personally identifiable information (PII).  ITAP uncovers the identity attributes most vulnerable to compromise, assesses their importance, and identifies the types of PII most frequently targeted by thieves and fraudsters.

The analytical repository of ITAP offers valuable understanding of the actors, organizations, and devices involved in identity threats across multiple domains, including financial services, consumer services, healthcare, education, law enforcement, communications, and government. ITAP characterizes the current identity threat landscape and aims to predict future identity threats. Using a wealth of data and analytics, ITAP delivers concrete guidance for consumers, businesses, and government agencies on how to avoid or lessen the impact of identity theft, fraud, and abuse. In sum, ITAP delivers actionable knowledge grounded in analyses of past threats and countermeasures, current threats and solutions, and evidence-driven forecasts.

During 2018 and into 2019, the ITAP team focused primarily on adding international (i.e. non-US) incidents to the model.  There are now about 900 international incidents captured in ITAP, making up 16% of the total number.  Of the international cases, 95% were localized to a given country, while the remaining 5% were multi-national (or even worldwide) in scope.  This recent focus has expanded the breadth of the project, and enabled us to implement new analytics based on international incidents, including some that compare the effects of PII-compromise across different countries.  Unlike in previous annual ITAP reports, all of the charts in this 2019 ITAP Report are based purely on the international cases.  

The Identity Ecosystem

Author(s): Razieh Nokhbeh Zaeem, David Liau, Suratna Budalakoti, K. Suzanne Barber
Published on Jul 3, 2019

As identity theft, fraud, and abuse continue to grow in terms of both scope and impact, individuals and organizations alike demand a deeper understanding of their vulnerabilities, risks, and resulting consequences. To address this demand, we present the Identity Ecosystem, a novel Bayesian model of Personal, Organizational, and Device Identifiable Information (PII/OII/DII) attributes and their relationships. We populate the Identity Ecosystem model with real-world data from approximately 6,000 reported identity theft and fraud cases. We leverage this populated model to provide unique, research-based insights into the variety of PII/OII/DII, their properties, and how they interact. Informed by the real-world data, we investigate the ecosystem of identifiable information in which criminals compromise PII/OII/DII and misuse them.

We built the Identity Ecosystem into an online tool that answers sophisticated queries. As an example query, it predicts the future risk and losses that follow from losing a given set of PII, as well as the liability associated with its fraudulent use. In the Bayesian model, each PII (e.g., Social Security Number), OII (e.g., Employer Identification Number), or DII (e.g., IP Address) attribute is modeled as a graph node. Probabilistic relationships between these attributes are modeled as graph edges. To answer the query, we leverage this Bayesian Belief Network to approximate the posterior probabilities of the model, assuming the given set of PII attributes is compromised.

Hence, the Identity Ecosystem uncovers the identity attributes most vulnerable to theft, assesses their importance, and determines not only the PII but also the OII and DII most frequently targeted by thieves and fraudsters. The insights the Identity Ecosystem provides are significant, valuable, and sometimes very nonintuitive.
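
The kind of query described above, the probability that one attribute is exposed given that another is already compromised, can be sketched on a toy network. This is a minimal illustration and not the Identity Ecosystem itself: the three attributes, the edges, and every probability are made up, and the sketch uses exact enumeration over all states where the real tool is described as approximating the posteriors.

```python
from itertools import product

# Hypothetical toy network: every attribute name and probability below is
# made up for illustration and does not come from the Identity Ecosystem data.
NODES = ["email", "password", "bank_account"]
PARENTS = {"email": [], "password": ["email"], "bank_account": ["password"]}

# P(node is exposed | exposure state of its parents), keyed by parent values.
CPT = {
    "email": {(): 0.30},
    "password": {(False,): 0.05, (True,): 0.40},
    "bank_account": {(False,): 0.01, (True,): 0.50},
}

def joint(assignment):
    """Probability of one full exposed/not-exposed assignment to all nodes."""
    p = 1.0
    for node in NODES:
        parent_values = tuple(assignment[parent] for parent in PARENTS[node])
        p_exposed = CPT[node][parent_values]
        p *= p_exposed if assignment[node] else 1.0 - p_exposed
    return p

def posterior(target, evidence):
    """P(target exposed | evidence), summing the joint over consistent worlds."""
    numerator = 0.0
    denominator = 0.0
    for values in product([False, True], repeat=len(NODES)):
        world = dict(zip(NODES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # skip worlds that contradict the evidence
        p = joint(world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator

prior_risk = posterior("bank_account", {})
risk_after_breach = posterior("bank_account", {"email": True})
print(f"prior: {prior_risk:.3f}, after email compromise: {risk_after_breach:.3f}")
```

Enumeration is exponential in the number of nodes, so it only suits tiny examples like this one; a network with hundreds of attributes would need approximate inference, which is consistent with the text's mention of approximating the posteriors.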

Enhancing and Evaluating Identity Privacy and Authentication Strength by Utilizing the Identity Ecosystem

Author(s): Razieh Nokhbeh Zaeem, K. Suzanne Barber, Kai Chih Chang
Published on Apr 15, 2019

This paper presents a novel research model of identity and uses this model to answer several research questions. Information travels in the cyber world, bringing us not only convenience and prosperity but also jeopardy. Protecting this information has been a commonly discussed issue in recent years. One type of this information is Personally Identifiable Information (PII), often used to perform personal authentication. People often give PII to organizations, e.g., when applying for a new job or filling out a new application on a website. While the use of such PII might be necessary for authentication, giving out PII increases the risk of its exposure to criminals. We introduce two innovative approaches based on our model of identity to help evaluate and find an optimal set of PII that satisfies authentication purposes but minimizes the risk of exposure. Our model paves the way for more informed selection of PII by organizations that collect it as well as by users who offer PII to these organizations.
