Related paper: Two Birds with One Stone: Two-Factor Authentication with Security Beyond Conventional Bound Main.pdf Appendix.pdf
Ethics considerations: Though the two password datasets (32.58 million Rockyou, 6.43 million CSDN) herein have been publicly disclosed and collectively used by a number of scientific works that study passwords (see the proceedings of IEEE S&P 2012, ACM CCS 2013, ISOC NDSS 2014, etc.), this nevertheless creates an ethical conundrum: Should our research use passwords leaked publicly? Since this data has already been made public and is easily available, using it in our research does not increase the harm to the victims. We use these passwords (without any identifiable information such as user name, email) only for scientific use (e.g., train and test guessing algorithms to evaluate attackers' capabilities.). Furthermore, as attackers are likely to use these password sets as training sets or cracking dictionaries, our use of them to evaluate password strength implies our results are more likely to be of practical relevance to security administrators and common users.
Fuzzy Verifier: To overcome the dilemma of maintaining usability while achieving two-factor security even the smart card can be tampered, here we propose a tradeoff between security and usability: Instead of storing {h(PWi)} in the card memory, we store {h(PWi) mod N}, where PWi stands for user Ui's password, h() is a Hash function (e.g., SHA-1 in the following experiments), and N the balance factor (a moderate integer, e.g. N=256 ). Then, user Ui can change her password locally, yet this user-friendliness comes at the prices of security: the chances of an adversary to guess Ui's password is roughly 256t/D, where D is the password space from which each user's password is uniformed drawn and t is the number of adversary's online attempts. Generally, D is moderately limited, e.g., |D|=1000, 000 [1,2]. That's to say, the modified protocol using this new method is able to meet the Level 1 security of NIST SP800-63-2 [3] even if the smart card has been compromised, which guarantees that the success probability of one online guessing attempt by the adversary is no more than 1/1024. As is well known, online guessing can be effectively thwarted by locking the account after several failed login attempts or by only allowing a probabilistic number of failed guesses [9]. In two other cases, the security is improved: (1) the chances of an adversary to change Ui's password is 1/256; (2) the chances of Ui to accidentally make the card unusable is 1/256 when inputs an unintended new password in the password change phase, reduced to 1/65536 if users are required to type the new password twice.
In the above analysis, the value of 256t/D is obtained on the basis of the assumption that, user passwords are uniformly distributed. However, in practice, this is generally not the case——user passwords have strong bias towards similarity! This can be evidenced by the following statistics. For example, the top 20 most popular passwords from Rockyou dataset [4] account for 2.54% of the entire set. To evaluate the practical effectiveness of our proposed ``fuzzy verifier'' when user password space is greatly biased, we use four real-life publicly available password datasets (i.e., top 1 million most popular passwords from 32 million Rockyou dataset, top 2 million most popular passwords from 32 million Rockyou dataset, top 1 million most popular passwords from 6.43 million CSDN dataset [5], top 1 million most popular passwords from 6.43 million CSDN) to carry out the following two explorations:
(1) How many passwords in D fall into the same password pool with PWi? As a complement, we say PWj falls into the same pool with PWi only if the value of h(PWj) mod 256 equals h(PWi) mod 256. We say the resulting password pool is POOLi.
(2) How many password finally remained if we remove these passwords from POOLi that are unlikely to be Ui's password PWi? Without any specific information about Ui's personal information (e.g., hobbies, name, birthday), it is really difficult (probably impossible) to accurately define what's the character(s) that are unlikely to be hold by Ui's password PWi. As a result, we can only use the statistical information of POOLi to detect whether there is any abnormality. An effective metric is the expected number of guesses required to find any password in POOLi if the attacker proceeds an optimal online attack (i.e., testing the most likely passwords first), known as guesswork or guessing entropy [2,6] .
Admittedly, we have obtained two large dataset with user ID and password, yet such information is definitively sensitive, and may cause subtle sufferings to victims if such dataset are illustrated publicly, for users tend to reuse their IDs and passwords [7,8]. As far as we know, using guessing entropy to characterize a password dataset is currently the best strategy that can be adopted while corresponding user-specific information is unavailable (or cannot be appropriately used). Assume password pool POOLi includes x entries. Each of the entry is of the <Password, Count, Popularity> form, see POOL0 of the top 2 million CSDN dataset 0.txt. Note that, the term "Count" stands for the popular count of the corresponding password. For example, if the password "shanshan" occurs 210 times in the entire 6 million CSDN dataset, we say the popular count of password "shanshan" is 210. The "Popularity" of password "shanshan" is 210/(total count in the current pool=210/13262=0.0158347. For simplicity, we denote the popularities of each password pool in decreasing order P1, P2, P3, ...... Px.
Accordingly, we can determine the guessing entropy [2] of POOL0 by computing E=1 * P1 + 2 * P2 + 3 * P3 + ....+ x * Px = 2492.49, which is larger than 1024. This means POOL0 is a valid pool, and the success probability of one online guessing attempt using password candidates in POOL0 by the adversary is 1/2492.49, no more than 1/1024.
Our empirical results demonstrate that the distribution bias of password space D does not significantly degrade our proposed method, and it ensures the Level 1 security of NIST SP800-63-2.
In the above analysis, the value of 256t/D is obtained on the basis of the assumption that, user passwords are uniformly distributed. However, in practice, this is generally not the case——user passwords have strong bias towards similarity! This can be evidenced by the following statistics. For example, the top 20 most popular passwords from Rockyou dataset [4] account for 2.54% of the entire set. To evaluate the practical effectiveness of our proposed ``fuzzy verifier'' when user password space is greatly biased, we use four real-life publicly available password datasets (i.e., top 1 million most popular passwords from 32 million Rockyou dataset, top 2 million most popular passwords from 32 million Rockyou dataset, top 1 million most popular passwords from 6.43 million CSDN dataset [5], top 1 million most popular passwords from 6.43 million CSDN) to carry out the following two explorations:
(1) How many passwords in D fall into the same password pool with PWi? As a complement, we say PWj falls into the same pool with PWi only if the value of h(PWj) mod 256 equals h(PWi) mod 256. We say the resulting password pool is POOLi.
(2) How many password finally remained if we remove these passwords from POOLi that are unlikely to be Ui's password PWi? Without any specific information about Ui's personal information (e.g., hobbies, name, birthday), it is really difficult (probably impossible) to accurately define what's the character(s) that are unlikely to be hold by Ui's password PWi. As a result, we can only use the statistical information of POOLi to detect whether there is any abnormality. An effective metric is the expected number of guesses required to find any password in POOLi if the attacker proceeds an optimal online attack (i.e., testing the most likely passwords first), known as guesswork or guessing entropy [2,6] .
Admittedly, we have obtained two large dataset with user ID and password, yet such information is definitively sensitive, and may cause subtle sufferings to victims if such dataset are illustrated publicly, for users tend to reuse their IDs and passwords [7,8]. As far as we know, using guessing entropy to characterize a password dataset is currently the best strategy that can be adopted while corresponding user-specific information is unavailable (or cannot be appropriately used). Assume password pool POOLi includes x entries. Each of the entry is of the <Password, Count, Popularity> form, see POOL0 of the top 2 million CSDN dataset 0.txt. Note that, the term "Count" stands for the popular count of the corresponding password. For example, if the password "shanshan" occurs 210 times in the entire 6 million CSDN dataset, we say the popular count of password "shanshan" is 210. The "Popularity" of password "shanshan" is 210/(total count in the current pool=210/13262=0.0158347. For simplicity, we denote the popularities of each password pool in decreasing order P1, P2, P3, ...... Px.
Accordingly, we can determine the guessing entropy [2] of POOL0 by computing E=1 * P1 + 2 * P2 + 3 * P3 + ....+ x * Px = 2492.49, which is larger than 1024. This means POOL0 is a valid pool, and the success probability of one online guessing attempt using password candidates in POOL0 by the adversary is 1/2492.49, no more than 1/1024.
Our empirical results demonstrate that the distribution bias of password space D does not significantly degrade our proposed method, and it ensures the Level 1 security of NIST SP800-63-2.
Experimental 1. Results from top 1 million most popular passwords of Rockyou dataset
Note: Since h(123456) mod 256 =(7C4A8D09CA3762AF61E59520943DC26494F8941B) mod 256 = 27, the password pool of 123456 is denoted by 27.txt
Rank
(top 20) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
Guessing entropy computed from top 1 million most popular passwords of Rockyou dataset
(1) The minimum password number of the 256 password pools is 3739, and the maximum is 4141;
(2) The minimum guessing expectation (entropy) of the 256 password pools is 111.300, and the maximum is 750.603, which fails to guarantee that the success probability of one online guessing attempt using password candidate in any pool by the adversary is no more than 1/1024.
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
(2) The minimum guessing expectation (entropy) of the 256 password pools is 111.300, and the maximum is 750.603, which fails to guarantee that the success probability of one online guessing attempt using password candidate in any pool by the adversary is no more than 1/1024.
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
|
|
Experimental 2. Results from the top 1 million most popular passwords of CSDN dataset
(1) The minimum password number of the 256 password pools is 3756, and the maximum is 4079;
(2) The minimum guessing expectation of the 256 password pools is 39.799, the maximum is 1122.18, and the majority is below 1024, which fails to guarantee that the success probability of one online guessing attempt using password candidate in any pool by the adversary is no more than 1/1024.
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
|
|
Experimental 3. Results from the top 2 million most popular passwords of Rockyou dataset
(1) The minimum password number of the 256 password pools is 7612, and the maximum is 8018;
(2) The minimum guessing expectation of the 256 password pools is 247.935, the maximum is 1418.31, and about 16% of the pools are below 1024, which fails to guarantee that the success probability of one online guessing attempt using password candidate in any pool by the adversary is no more than 1/1024.
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
(2) The minimum guessing expectation of the 256 password pools is 247.935, the maximum is 1418.31, and about 16% of the pools are below 1024, which fails to guarantee that the success probability of one online guessing attempt using password candidate in any pool by the adversary is no more than 1/1024.
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
|
|
Experimental 4. Results from the top 2 million most popular passwords of CSDN dataset (Satisfactory results)
(1) The minimum password number of the 256 password pools is 7584, and the maximum is 8146;
(2) The minimum guessing expectation of the 256 password pools is 127.743, the maximum is 2600.56, and only about 2.34% of the pools are below 1024. More specifically, six pools are "bad", i.e. POOL65 (127.743), POOL13(149.431), POOL184(343.376), POOL180(528.102), POOL139(652.968), POOL146(973.749). This means that, with an overwhelming probability, a pool selected from these 256 pools can guarantee a Level 1 of security. We further note that, if we eliminate only one most highly vulnerable password from each of these six bad pools (e.g., 123456 from POOL65, 123456789 from POOL65), then all the guess entropys of 256 password pools will have a guessing expectation larger than 1024. In practice, this can be achieved by using a blacklist and preventing the users from choosing these highly vulnerable passwords (e.g., 123456, 123456789).
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
(2) The minimum guessing expectation of the 256 password pools is 127.743, the maximum is 2600.56, and only about 2.34% of the pools are below 1024. More specifically, six pools are "bad", i.e. POOL65 (127.743), POOL13(149.431), POOL184(343.376), POOL180(528.102), POOL139(652.968), POOL146(973.749). This means that, with an overwhelming probability, a pool selected from these 256 pools can guarantee a Level 1 of security. We further note that, if we eliminate only one most highly vulnerable password from each of these six bad pools (e.g., 123456 from POOL65, 123456789 from POOL65), then all the guess entropys of 256 password pools will have a guessing expectation larger than 1024. In practice, this can be achieved by using a blacklist and preventing the users from choosing these highly vulnerable passwords (e.g., 123456, 123456789).
(3) Due to storage space constraints of this site, we only upload 10% of the password pool files.
|
|
5. Subtleties revealed from the above four experiments
In the above four experiments, only top 2 million CSDN dataset can guarantee that every guessing entropy of its pools is larger than 1024, while the top 1 million CSDN dataset is unable to provide such a security level. This implies that, to provide the expected level of security, the underlying password space of these user selected passwords shall be large enough. In addition, top 2 million CSDN dataset is desired, while top 2 million RockYou dataset is insufficient. This implies that, besides a requirement for search space size of the passwords, there is also a requirement for the strength of individual passwords. The difference in password strength between these two password dataset is mainly caused by the varied password creation policies. More specifically, Rockyou only enforces that a user selected password is of length no less than 5 characters, while CSDN requires that user passwords are at least 6 characters long. Another contributing factor is that Rockyou passwords are used to protect social networking accounts and its users are mainly non-professionals, while CSDN passwords are used to protect personal technical notes and blogs and its users are mainly programmers.
After all, we have shown that, there do exist a real-life password dataset (i.e., the CSDN dataset) that satisfies a specified level of secuity, which further demonstrates the feasibility of our "Fuzzy Verifier".
After all, we have shown that, there do exist a real-life password dataset (i.e., the CSDN dataset) that satisfies a specified level of secuity, which further demonstrates the feasibility of our "Fuzzy Verifier".
6. Acknowledgements
We would like to acknowledge the contributions that Dr. Chen Zhu and Qianchen Gu made in the development of the password statistics software tool used in the analysis of password dataset in this paper. We are also grateful to Dr. Haibo Chen for some constructive suggestions.
7. References
[1] M. Dell’Amico, P. Michiardi, and Y. Roudier, “Password strength: an empirical analysis,” in Proc. INFOCOM 2010. IEEE, 2010, pp. 1–9.
[2] J. Bonneau, “The science of guessing: Analyzing an anonymized corpus of 70 million passwords,” in Proc. IEEE S&P 2012. IEEE Computer Society, 2012, pp. 538–552.
[3] W. Burr, D. Dodson, R. Perlner, W. Polk, S. Gupta and E. Nabbus: NIST SP800-63-2 – electronic authentication guideline. Tech. rep., National Institute of Standards and Technology, Reston, VA (August 2013), doi:http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-63-2.pdf
[4] C. Allan, 32 million Rockyou passwords stolen, available at http://www.hardwareheaven.com/news.php?newsid=526 (Dec. 2009).
[5] Dazzlepod Inc., CSDN 6 million cleartext passwords, online news, Available at http://dazzlepod.com/csdn/ (Mar. 2013).
[6] J. L. Massey, “Guessing and Entropy,” in Proceedings of the 1994 IEEE International Symposium on Information Theory, 1994, p. 204-208.
[7] D., Anupam, J. Bonneau, M. Caesar, A. Nikita and X.F. Wang. Tangled Web of Password Reuse. Proc. NDSS 2014, San Diego, CA, USA, 23-26 February 2014.
[8] D. Florencio, C. Herley, A large-scale study of web password habits, in: Proceedings of WWW 2007, ACM, 2007, pp. 657–666.
[9] M. Alsaleh, M. Mannan, and P. Van Oorschot, “Revisiting defenses against large-scale online password guessing attacks,” IEEE Trans. Depend. Secur. Comput., vol. 9, no. 1, pp. 128–141, 2012.
[2] J. Bonneau, “The science of guessing: Analyzing an anonymized corpus of 70 million passwords,” in Proc. IEEE S&P 2012. IEEE Computer Society, 2012, pp. 538–552.
[3] W. Burr, D. Dodson, R. Perlner, W. Polk, S. Gupta and E. Nabbus: NIST SP800-63-2 – electronic authentication guideline. Tech. rep., National Institute of Standards and Technology, Reston, VA (August 2013), doi:http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-63-2.pdf
[4] C. Allan, 32 million Rockyou passwords stolen, available at http://www.hardwareheaven.com/news.php?newsid=526 (Dec. 2009).
[5] Dazzlepod Inc., CSDN 6 million cleartext passwords, online news, Available at http://dazzlepod.com/csdn/ (Mar. 2013).
[6] J. L. Massey, “Guessing and Entropy,” in Proceedings of the 1994 IEEE International Symposium on Information Theory, 1994, p. 204-208.
[7] D., Anupam, J. Bonneau, M. Caesar, A. Nikita and X.F. Wang. Tangled Web of Password Reuse. Proc. NDSS 2014, San Diego, CA, USA, 23-26 February 2014.
[8] D. Florencio, C. Herley, A large-scale study of web password habits, in: Proceedings of WWW 2007, ACM, 2007, pp. 657–666.
[9] M. Alsaleh, M. Mannan, and P. Van Oorschot, “Revisiting defenses against large-scale online password guessing attacks,” IEEE Trans. Depend. Secur. Comput., vol. 9, no. 1, pp. 128–141, 2012.