Personal Data Early Warning System: Machine Learning Models Extract Identity Theft and Fraud Trends from News

Abstract
Each year, cyber attacks pose a growing risk to consumer personal information stored by corporations and government agencies. Billions of consumer records are breached each year, and data breaches compromise the personal data of hundreds of millions of citizens. These breaches are extremely costly, both financially and in terms of privacy and reputation, to people (through identity theft and fraud) and to companies (through the abuse of the collected information for which they are accountable). What is more, the theft of data often acts as a gateway in the complex and interdependent ecosystem of personal data: Personally Identifiable Information (PII) is breached to gain access to and steal more PII in a chain of events and tactics. There is therefore a need for tools that help people and businesses navigate the dangerous waters of identity theft and fraud. The cyber world, however, is an evolving landscape, and trends change often. People and organizations need current and accurate situational awareness of trends such as common breach threats and tactics, the types of data most frequently attacked, and the personal information most often exposed with the most severe consequences. Enter the Personal Data Early Warning System (PDEWS), an online dashboard that tracks and displays the current cyber threat landscape and generates actionable insight into trends and patterns. PDEWS runs as an automated pipeline, collecting data each day about ongoing cyber threats. It has four major phases. First, PDEWS crawls daily identity theft and fraud news stories and scrapes the body text. Then it formats the text into the representation required for a machine learning application and places that text in an Amazon Web Services cloud infrastructure. Next, PDEWS applies machine learning models trained on a private identity theft article corpus to extract relevant threat labels. Finally, PDEWS displays those trends on an online dashboard alongside recommendations researched to have the greatest mitigation capabilities against the current threat landscape.
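
The four phases described in the abstract can be illustrated with a minimal Python sketch. This is not the PDEWS implementation: the label vocabulary, function names, and sample inputs are all hypothetical, an in-memory dictionary stands in for the Amazon Web Services storage, and simple keyword matching stands in for the trained machine learning models, whose details are not given here.

```python
from collections import Counter
from html.parser import HTMLParser
import json
import re

# Hypothetical label vocabulary (assumption); the report's real threat
# labels and trained models are not described in the abstract.
LABEL_KEYWORDS = {
    "phishing": ["phishing", "spoofed email"],
    "ransomware": ["ransomware"],
    "ssn_exposed": ["social security number"],
}


class BodyText(HTMLParser):
    """Phase 1 helper: collect the text found inside <p> tags."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.parts.append(data)


def scrape_body(html):
    """Phase 1: extract the article body text from raw HTML."""
    parser = BodyText()
    parser.feed(html)
    return " ".join(parser.parts)


def format_record(article_id, text):
    """Phase 2: normalize whitespace and case, then serialize as JSON."""
    clean = re.sub(r"\s+", " ", text).strip().lower()
    return json.dumps({"id": article_id, "text": clean})


def label_record(record_json):
    """Phase 3 stand-in: keyword matching instead of a trained model."""
    text = json.loads(record_json)["text"]
    return sorted(
        label
        for label, keywords in LABEL_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


def summarize(label_lists):
    """Phase 4: count labels across articles, ready for a dashboard."""
    return Counter(label for labels in label_lists for label in labels)


def run_pipeline(pages):
    """Run all four phases over a mapping of article id to raw HTML."""
    store = {}  # in-memory stand-in for cloud storage (assumption)
    for article_id, html in pages.items():
        store[article_id] = format_record(article_id, scrape_body(html))
    return summarize(label_record(record) for record in store.values())
```

Calling `run_pipeline` with a dictionary that maps article identifiers to raw HTML strings returns a `Counter` of label frequencies, which is the kind of aggregate a trend dashboard could plot over time.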

 

Access Publication: Download PDF of Report

Downloads
/sites/default/files/2021-08/Personal%20Data%20Early%20Warning%20System-%20Machine%20Learning%20Models%20Extract%20Identity%20Theft%20and%20Fraud%20Trends%20from%20News.pdf
Razieh Nokhbeh Zaeem, K. Suzanne Barber, Jessica Cruz-Nagoski, Luke Norrell, Michael Sullivan, Jonathan Walsh, Dylan Wolford, Yasira Younus, UT CID Report #21-04, August 2021