Defending Your Network from RockYou2021
In June 2021, a large data dump was posted to a popular internet hacking forum. This dataset was termed “rockyou2021,” named after the popular password brute-force wordlist known as Rockyou.txt.
Media and Twitter alike were abuzz with what to do about RockYou2021. You would not be alone if you were wondering if or how you should protect your network from RockYou2021. We asked our research team to do a deep dive on the dataset, the results of which we’re sharing in this post. And while some on Twitter were advising this dataset was full of junk data that didn’t need any action, our team’s verdict wasn’t quite the same.
What’s in RockYou2021?
This dataset was intended to assist brute-force attacks on password hashes, with the goal of finding a password in the wordlist that logs into the service or system the hash protects. It was described as a combination of “COMB” (Compilation of Many Breaches), wordlists generated from Wikipedia, and other sources.
Since this dataset is a wordlist, rather than a dump of existing credentials from existing sources (COMB records aside), no usernames are paired with these records. The dataset is simply a list of candidate passwords for brute-force or cracking attempts.
Our team’s analysis of RockYou2021
We analyzed the rockyou2021 wordlist using standard text-manipulation tools: the records were randomly shuffled, split into subsets, and then processed to calculate the statistics below.
For the purposes of the analysis (generating password statistics around the records in question), a subset of ~200 million records was taken from the complete dataset of ~8.5 billion records, giving a sample of ~2.4%.
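As a rough illustration of the sampling step (not the tooling used for this analysis), a uniform random subset can be drawn from a wordlist too large to shuffle in memory with reservoir sampling. The function names below are hypothetical and the technique is swapped in for illustration; the analysis itself shuffled and split the file with text-manipulation tools.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(records: Iterable[str], k: int,
                     seed: Optional[int] = None) -> List[str]:
    """Return up to k records chosen uniformly at random from a stream.

    Uses reservoir sampling (Algorithm R), so the full wordlist never
    has to fit in memory.
    """
    rng = random.Random(seed)
    sample: List[str] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = record
    return sample


def sample_percentage(sample_size: int, total_size: int) -> float:
    """Size of the sample as a percentage of the full dataset."""
    return 100.0 * sample_size / total_size
```

With the approximate figures quoted above, `sample_percentage(200_000_000, 8_500_000_000)` evaluates to roughly 2.35, which matches the ~2.4% sample size.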
For demonstration purposes, the following are a sample of records taken from the complete dataset:
password331193
password762803
password7487
passwords7288
passworded8206
passwords1037
qwertyu098
qwertyUYTREW!3579-
pandazzqwerty
qwertypoi098
9qwertysylvia
qwerty7890-=-97531
qwerty5885946588594
Efetiiloveyou?
Abcdeiloveyou
omailoveyou11
8809iloveyou
9adamiloveyou
6395iloveyou!
lissailoveyou
8iloveyouu
The distribution of popular common passwords as ‘base-words’ (words used in combination with letters/numbers/punctuation, or modified via casing or ‘leet-speak’) appears as follows:

Note that the large count of passwords based on ‘123456’ is due to the many variations of number-based passwords. A lot of common passwords are based on number patterns, and those would be considered a superset of this pattern; for example, 123456789 would have ‘123456’ as a ‘base-word’ of the password. Because of that, these groupings can be partially misleading when considering passwords consisting of only numbers. One should strive, when generating a password policy, to prevent passwords and passphrases consisting of only numbers, or only letters, encouraging sufficient entropy via other characters and randomness. A similar pattern is seen with the reliance on ‘qwerty’ as a base-word in the dataset; weak passwords trend towards “keyboard-walking” patterns, since many users find them easy to remember, increasing their frequency in leaks and similar datasets.
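One way to group records by base-word can be sketched as follows. This is a minimal sketch under stated assumptions, not the method used for the figures above: the base-word list, the leet-speak substitution table, and the function name are all hypothetical, and a record is assigned to the first base-word it contains.

```python
from typing import Optional, Sequence

# Hypothetical substitution table for common leet-speak characters.
LEET_MAP = str.maketrans({
    "0": "o", "1": "l", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

# Hypothetical base-words, taken from the patterns discussed above.
BASE_WORDS = ("123456", "password", "qwerty", "iloveyou")


def find_base_word(record: str,
                   base_words: Sequence[str] = BASE_WORDS) -> Optional[str]:
    """Return the first base-word found in the record, or None.

    The record is checked twice: once lowercased (so numeric base-words
    such as '123456' still match) and once with leet-speak undone.
    """
    lowered = record.lower()
    normalized = lowered.translate(LEET_MAP)
    for word in base_words:
        if word in lowered or word in normalized:
            return word
    return None
```

Note how `find_base_word("123456789")` returns `"123456"`: this is the superset effect described above, where longer number patterns are counted under the shorter base-word.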
An analysis of record lengths displays a similar pattern. Permutations on common passwords (a part of generating a good wordlist) result in passwords that trend towards greater length: a password permuted from a keyboard walk such as ‘qwerty’ will be longer than the base-word itself.

The dataset trends towards longer passwords, so avoiding collisions with the wordlist means enforcing either longer, harder-to-remember passwords or, optimally, passphrases.
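A length distribution like the one discussed here can be computed with a few lines of Python. This is only a sketch with hypothetical helper names; it assumes the records are already loaded as an iterable of strings with line endings stripped.

```python
from collections import Counter
from typing import Iterable


def length_histogram(records: Iterable[str]) -> Counter:
    """Count how many records there are of each length."""
    return Counter(len(record) for record in records)


def share_shorter_than(records: Iterable[str], limit: int) -> float:
    """Fraction of records with fewer than `limit` characters."""
    lengths = [len(record) for record in records]
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n < limit) / len(lengths)
```

A policy minimum can then be chosen by checking what share of the wordlist falls below a candidate length.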
Our team also took a look at the complexity of the RockYou2021 records. Below you can find the breakdown of how many records fall into different complexity types as well as some examples.
Complexity Type: Lowercase letters and numbers (loweralphanum)
RockYou2021 record count: 34,296,199 (34.06%)
Examples: sta8342, residerais6
Complexity Type: Lower and uppercase letters with numbers (mixedalphanum)
RockYou2021 record count: 20,526,308 (20.38%)
Examples: BEllow2588, peDiortho95
Complexity Type: Lowercase letters only (loweralpha)
RockYou2021 record count: 15,398,980 (15.29%)
Examples: nadajuez, namchaithailand
Complexity Type: Lowercase letters and special characters (loweralphaspecial)
RockYou2021 record count: 5,394,563 (5.36%)
Examples: pimbava-os, @mb@|it
Complexity Type: Lower and uppercase letters and special characters (mixedalphaspecial)
RockYou2021 record count: 2,432,456 (2.42%)
Examples: All’Arrabbiatela, Baker_tentb
Complexity Type: Lower and Uppercase letters (mixedalpha)
RockYou2021 record count: 6,737,899 (6.69%)
Examples: DenisedeRidder, BlackMightyWax
Complexity Type: Uppercase letters and numbers (upperalphanum)
RockYou2021 record count: 5,044,179 (5.01%)
Examples: CIZAWOVY1, EDUARDO6592
Complexity Type: Uppercase letters and special characters (upperalphaspecial)
RockYou2021 record count: 284,279 (0.28%)
Examples: ALTNØGLEN, ATMOSF{{RIIN
Complexity Type: Lowercase letters, special characters, and numbers (loweralphaspecialnum)
RockYou2021 record count: 3,811,000 (3.78%)
Examples: rhs;ysq52, promu|gat3
Complexity Type: Numbers only (numeric)
RockYou2021 record count: 3,303,380 (3.28%)
Examples: 66748719, 87925501
Complexity Type: Lower and uppercase letters, special characters, and numbers (mixedalphaspecialnum)
RockYou2021 record count: 1,582,514 (1.57%)
Examples: D3PR3Da7!0NS, 75Henri-
Complexity Type: Uppercase letters only (upperalpha)
RockYou2021 record count: 1,154,030 (1.15%)
Examples: ATTRATIV, ENBOSTADSHUS
Complexity Type: Uppercase letters, special characters, and numbers (upperalphaspecialnum)
RockYou2021 record count: 462,041 (0.46%)
Examples: 9753(OL>@$^*, <MNBGJL”_098
Complexity Type: Special characters and numbers (specialnum)
RockYou2021 record count: 274,758 (0.27%)
Examples: @12345678910111213@, 8#####@*_*_*_(0-0)
The above breakdown indicates that adding most of RockYou2021 to a breached password protection list is not required, as sufficient complexity rules could protect against over 95% of all records in RockYou2021. By simply requiring uppercase letters, lowercase letters, numbers, and special characters, one would prevent a valid password from matching records in the simpler categories above, which comprise 96.5% of our sample.
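The complexity types above can be reproduced with a small classifier. The sketch below is an assumption about how such labels might be derived, not the tool used for the analysis: it treats only ASCII letters and digits as letters and numbers, and counts everything else (including characters such as Ø) as special, which is consistent with the examples listed. The function names are hypothetical.

```python
import string

ASCII_ALNUM = string.ascii_letters + string.digits


def complexity_type(record: str) -> str:
    """Label a record with a complexity type such as 'loweralphanum'."""
    has_lower = any(c in string.ascii_lowercase for c in record)
    has_upper = any(c in string.ascii_uppercase for c in record)
    has_digit = any(c in string.digits for c in record)
    # Anything outside ASCII letters and digits counts as special.
    has_special = any(c not in ASCII_ALNUM for c in record)

    if has_lower and has_upper:
        name = "mixedalpha"
    elif has_lower:
        name = "loweralpha"
    elif has_upper:
        name = "upperalpha"
    else:
        name = ""
    if has_special:
        name += "special"
    if has_digit:
        name += "num"
    if name == "num":
        name = "numeric"  # digits only
    return name or "empty"


def uses_all_classes(record: str) -> bool:
    """True if the record has upper, lower, digit, and special characters."""
    return complexity_type(record) == "mixedalphaspecialnum"
```

A complexity rule requiring all four character classes accepts only passwords for which `uses_all_classes` is true, which is the smallest of the mixed-case categories in the breakdown.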
Password Policy Recommendations for RockYou2021
There is no one-size-fits-all password policy recommendation for organizations looking to prevent attacks making use of the RockYou2021 list. Each organization will have different compliance needs and security concerns.
However, the strongest defense against an attempt to brute-force or crack hashes using this wordlist would be to use sufficiently long passphrases or sufficiently long complex strings. As recommended in NIST Special Publication 800-63B, section 5.1.1.2:
“Verifiers SHALL require subscriber-chosen memorized secrets to be at least 8 characters in length. Verifiers SHOULD permit subscriber-chosen memorized secrets at least 64 characters in length.”
If looking to use password length alone as a defense, organizations could simply require long passwords or passphrases. The majority of RockYou2021 records had fewer than 22 characters, and most records on the longer end of the range were not human readable. Organizations could encourage the use of passphrases by requiring a minimum of 22 characters, or set a lower minimum but incentivize longer passwords with length-based password aging in Specops Password Policy.
Another approach is to combine a length requirement with character requirements. After reviewing the password lengths and the complexity of the records in this dataset, our team found that a strong password policy requiring 16 characters or more, and encouraging higher entropy in the passphrase (such as some capital letters or other complex characters), would rule out more than 95% of records on the wordlist. This is a defense not only against cracking hashes via wordlists such as RockYou2021, but also against exhaustive brute-force attacks: the longer the secret and the higher the entropy, the more costly (and therefore less likely to succeed) the attack.
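To make the cost argument concrete, the size of the search space can be estimated from the length and the number of possible characters. The sketch below assumes every character is chosen uniformly at random, which overstates the strength of human-chosen passwords; the function names and alphabet sizes (26 lowercase letters, 95 printable ASCII characters) are illustrative assumptions.

```python
import math


def keyspace(length: int, alphabet_size: int) -> int:
    """Number of candidate secrets an exhaustive search must cover."""
    return alphabet_size ** length


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a uniformly random secret."""
    return length * math.log2(alphabet_size)


# Compare an 8-character lowercase secret with a 16-character secret
# drawn from all printable ASCII characters.
short_lower = entropy_bits(8, 26)
long_mixed = entropy_bits(16, 95)
```

Each extra character multiplies the keyspace by the alphabet size, which is why length raises the cost of a brute-force attack faster than adding character classes to a short password.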
A password rule requiring mixed-case letters, numbers, and special characters (preferably in a complex passphrase) could rule out nearly all of the dataset. By setting a minimum password length and adding a complexity rule that builds on the required character classes, one can prevent users from creating passwords vulnerable to this kind of wordlist-based attack. Such a policy can be configured in Specops Password Policy.

At the end of the day, RockYou2021 was not a large dump of breached passwords (though it did contain some). However, it is still a wordlist attackers may choose to use in their attacks against your network.
The use of either Specops Password Policy or an equivalent password filter to enforce sound password policies is the best defense against attacks with these types of datasets. Combined with a sound breached password protection service, such as Specops Breached Password Protection, organizations can raise the level of effort required for attackers to breach their networks via a password attack.
(Last updated on December 11, 2024)