Password leaks are 90% fake
2024-02-03 - infosec
I've been digging through the most popular password 'leak' compilations for a side project, and they're >90% fake.
Only a tiny fraction of those passwords may have been used by a human at some point; the rest is computer-generated junk. Even the real passwords are probably stolen from other leaks.
The reason is obvious: no honor among thieves. Salted hashes are worthless to a buyer until they are cracked, and cracking hardware is expensive, so the sellers simply hallucinate some passwords and sell them as a fully cracked credentials leak.
Someone buys a faked or artificially inflated leak, finds out that none of those credentials work, and burns it (makes it public). No one would waste a credentials database that actually works, so only the junk ends up in those free and popular compilations. The bigger ones contain more 'passwords' than there are people on the planet. The bigger the better, apparently.
And the security industry/media? Fearmongering and hyping the 'largest leak ever' as usual.
Fun fact: Sorting a password list before compression reduces its compressed size from 4.9GB to less than 1.3GB. Same compression parameters, just sorted vs. unsorted. But 4.9GB sounds more important than 1.3GB, and in a sorted list you can easily see the patterns of the computer-generated fake passwords, so why would a seller do that?
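If you want to try the sorted vs. unsorted comparison yourself, here is a rough sketch. It is not the setup used for the numbers above: the compressor (Python's zlib), the word list, and the helper names `make_fake_list` and `compressed_size` are all assumptions made up for illustration, and the generated list is tiny compared to a real compilation.

```python
import random
import zlib


def make_fake_list(n=20000, seed=0):
    """Generate a hypothetical list of pattern-based 'passwords' in random order.

    The word list and the word+4-digits pattern are invented for this sketch;
    real generated junk may follow other patterns.
    """
    rng = random.Random(seed)
    words = ["dragon", "monkey", "shadow", "master", "sunshine", "letmein"]
    return [f"{rng.choice(words)}{rng.randrange(10000):04d}" for _ in range(n)]


def compressed_size(lines, level=9):
    """Return the zlib-compressed size in bytes of the lines joined by newlines."""
    data = "\n".join(lines).encode("utf-8")
    return len(zlib.compress(data, level))


if __name__ == "__main__":
    entries = make_fake_list()
    unsorted_size = compressed_size(entries)
    sorted_size = compressed_size(sorted(entries))  # same parameters, only the order differs
    print(f"unsorted: {unsorted_size} bytes")
    print(f"sorted:   {sorted_size} bytes")
```

Sorting puts near-identical lines next to each other, so the compressor can encode most of each line as a back-reference to the previous one. The exact ratio depends on the compressor and on the data, so don't expect this toy list to reproduce the 4.9GB vs. 1.3GB figures.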