Data Breaches Visualization

Visualization Link:
World's Biggest Data Breaches & Hacks

Authors:
Saumick Pradhan and Nachiket Dighe

What is the purpose of this visualization?
The World’s Biggest Data Breaches and Hacks visualization shows a variety of platforms and the data breaches they have suffered. A hover-and-hide effect expands each bubble, giving further details on these major hacks. It covers selected events in which more than 30,000 records were stolen and is current as of January 2024. Breaches with interesting stories are also represented in the visualization.

What is the data?
Data set: Data Breaches and Hacks

Here are the attributes described:

How was the data collected?
Sources: IdTheftCentre: https://www.idtheftcenter.org, DataBreaches.net: https://www.databreaches.net, news reports

Who are the users that this visualization was made for?
Researchers, analysts, students, general public.

What questions do people want to ask about this data?
There are two types of visualizations:

How can they find the answers with this tool?
Users can find answers with this tool by hovering over the different bubbles and then clicking on the article that comes up. This is done by:


Then hover over the bubble that answers your question. For example, to find out what information about Indonesian SIM cards was leaked, hover over the relevant bubble, which immediately expands with information:


Click the link to open the article


The number of people affected is also shown



Interesting stories are highlighted in yellow



The severity of the data breach is indicated by the color of the bubble



The data breaches are sorted by year, and the size of each bubble depends on the size of the data breach
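
As a rough illustration of this encoding, here is a minimal sketch of how breaches could be ordered by year and sized by records lost. The sample rows, the function names, the newest-first ordering, and the square-root scaling (so that a bubble's area, rather than its radius, tracks the record count) are all assumptions made for illustration; the visualization's actual scaling is not documented in this write-up.

```python
import math

# Hypothetical sample rows; names, years, and counts are illustrative only.
breaches = [
    {"name": "Example A", "year": 2021, "records": 4_000_000},
    {"name": "Example B", "year": 2019, "records": 1_000_000},
    {"name": "Example C", "year": 2023, "records": 16_000_000},
]


def bubble_radius(records, max_records, max_radius=40.0):
    """Scale the radius with a square root so bubble area is proportional to records."""
    return max_radius * math.sqrt(records / max_records)


def layout(rows):
    """Return rows sorted by year (newest first, an assumption) with a radius attached."""
    largest = max(r["records"] for r in rows)
    ordered = sorted(rows, key=lambda r: r["year"], reverse=True)
    return [{**r, "radius": bubble_radius(r["records"], largest)} for r in ordered]


for row in layout(breaches):
    print(row["year"], row["name"], round(row["radius"], 1))
```

With area-proportional sizing, a breach with four times as many records gets a bubble with twice the radius, which keeps very large breaches from visually overwhelming the rest.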


Events can be searched by typing
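
A search box like this can be approximated with a case-insensitive substring match on the event name. The sketch below is only an assumption about how such a filter might behave; the event names and the `search_events` helper are hypothetical and not taken from the visualization.

```python
# Hypothetical event names, used only to demonstrate the filter.
events = [
    {"name": "Example Telecom SIM registry"},
    {"name": "Example Retailer"},
    {"name": "Example Social Network"},
]


def search_events(rows, query):
    """Return rows whose name contains the query, ignoring case and outer spaces."""
    needle = query.strip().lower()
    if not needle:
        # An empty search box shows every event.
        return list(rows)
    return [r for r in rows if needle in r["name"].lower()]


print([r["name"] for r in search_events(events, "sim")])
```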



Show some example insights someone can arrive at using this tool
Some insights that someone can arrive at after using this tool include:

What design choices are effective?
Effective design choices include:

What are the limitations of this design? What can't someone do with this visualization?

The design does not have many limitations, but a few more features could be implemented, such as:


Are there any design choices that are not effective, and how could they be improved?

The design choices for the bubble chart are very effective, and the data is easy to read and understand. However, a few improvements could be implemented: