Git Product home page Git Product logo

Comments (26)

r2evans avatar r2evans commented on May 20, 2024 1

Both are good ideas. I also found rroemhild/docker-test-openldap which provides half of what I was looking for, and I think I can use one of the docker hub php images (partly because it comes prepopulated with data, that much simpler). Between your suggestions and this, I think I can get something working.

from self-service-password.

coudot avatar coudot commented on May 20, 2024 1

See also https://github.com/bjeavons/zxcvbn-php

from self-service-password.

SyBernot avatar SyBernot commented on May 20, 2024 1

You can get a list of about a half billion known compromised passwords from here:
https://haveibeenpwned.com/Passwords
or you could just hook into his API, personally I don't trust anyone so doing it locally seems more secure even if you are only sending a hash.

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

(Caveat: I'm not a dev with LTB.) Are you talking about a simple blacklist? It sounds like you mean substring matches, meaning !password123! should fail, since either password or (at least) password123 should not be a substring.

On a cursory search, there are API-based subscription services (such as https://www.passwordrbl.com). I think this is different than what you are suggesting (and considerably different to implement).

I assume you mean a "static" blacklist ("static" in that its updates are either manual (by the sysadmin) or manual/versioned (by the maintainers of LTB-SSP). There are several known lists (such as http://web.archive.org/web/20130605082350/http://www.isdpodcast.com:80/resources/62k-common-passwords/, among many), so finding a reliable and reasonable source should just take a little time to research. I don't have much experience looking for them, so that link is just an example; I can't speak to it.

Some discussion questions:

  • should this be case-insensitive?
  • should it look for (simple) 1337-speak transliteration?
  • what about looking for reversals? (321wrodssap)
  • if substring matching is done, is there a point at which leading/trailing characters make it alright?
    (difference between aaPASSWORD123aa and &13as!&8sd@@PASSWORD123@@f7y*sd)

Obviously the combinations of all this needs to be considered for performance: a great password-matching blacklist does nothing if it takes too long to process and the page times out.

I suggest a first cut on this would be literal string matching, with later extensions to include substrings (with padding limits) and reversals.

What do you think, @coudot, do you find utility in having a blacklist? (I am confident that a lot of my authenticating-users will curse at me for enabling this :-)

@aurbjo, if he likes the idea (and doesn't have a starting point), it might help if you found an acceptable starting point for the list. Think: freely distributed, possibly updated (not antiquated), small enough, and easily parsed (perhaps just a literal text file, one-per-line).

from self-service-password.

coudot avatar coudot commented on May 20, 2024

Yes, we could indeed use a simple text file as a blacklist, this file could then be updated by any means outside SSP. We can provide a simple blacklist at installation.

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

As a thought, would it be a bit robust to use something like cracklib for checking more than just a dictionary? It supports more than just blacklists, though I admit I don't know how to configure its various checks.

from self-service-password.

aurbjo avatar aurbjo commented on May 20, 2024

In my case i am just thinking about a simple list like @coudot is suggesting, this would deny users setting passwords like "qwerty12345", "password123" and so on. I also see users that have their company name and the current year in their password and it's not a good practice.

from self-service-password.

coudot avatar coudot commented on May 20, 2024

Cracklib usage can be another feature.

By the way, the support of cracklib in PHP seems experimental: http://php.net/manual/en/ref.crack.php

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

(Essentially the same link I posted earlier.) Yes, and furthermore it was removed from base-php (around php-5.0) and pushed to be an extension, so that says that it is not something everyone will need but it has been in the ecosystem for many php versions.

I'll counter with this: I recommend against reinventing the wheel. Even if it has more features than originally suggested, php-crack has fast lookups and attempts to address the performance issues that large lists will bring. As a bonus, it can comment on password-strength (admittedly a contentious and subjective metric). The biggest detriment in my mind is that it isn't a flat text file (it needs to be compiled into the files, as commented on the link you provided), so in that vein it has a little more required investment on the part of the user.

If you think php-crack is too much (for now), I suggest finding another already-existing php package that permits fast lookups in a text file. There's no need to reinvent "lookups" (and therefore less work for coudot to test and maintain).

from self-service-password.

coudot avatar coudot commented on May 20, 2024

Indeed, sorry, I did not see the previous link.

If some of you want to propose some code, I'll be happy to review it.

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

@coudot, do you have a sample LDAP image (e.g., docker) for testing? It's much easier when testing on something other than a production environment.

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

Hehe, @coudot, I guess I didn't read all of the page I originally provided ... last "update" to php-crack was in 2005. Doesn't really inspire confidence ...

So perhaps the wheel can be reinvented a little here, simplistically.

Some thoughts, let me know if you have reservations:

  • Using $out = system("grep -q ...", $retval). My primary motivation is efficiency, since I know of no methods that will be able to do yes/no pattern "a match exists" as quickly as grep and family. Using -q means that it can short-circuit: as soon as a match is found, it returns with no other searching, perhaps a slight efficiency. (This also means we need to look at $retval instead of $out.)
  • Considering allowing compressed lists, thereby requiring zgrep and bzgrep; this can be determined by filename and runtime.
  • In order to protect against code-injection and/or command-injection when using system(), I'm thinking it may be safer to write the new password into a temp file and use grep -q -F -f ptnfile dictfile (the -F is to ensure it is used as a literal string so that regex chars aren't mistakenly taken as patterns).

from self-service-password.

plewin avatar plewin commented on May 20, 2024

If the goal is to implement a blacklist of passwords (and not a blacklist of words) in efficient way I would suggest filtering out the lines not conforming to the password policy, save the list sorted in a cache file, then read the cache file and find if the password is present using a binary search.

For example banning all passwords of this list of "one million of the most common passwords" ~ 9 Mio uncompressed

  1. load the whole file with file() (< 70 Mio RAM usage in php 7)
  2. filter out all passwords not conforming to password policy
  3. sort the array (can be already sorted)
  4. serialize() (serialized string < 24 Mio RAM in php 7 if nothing is filtered out)
  5. dump to file blacklistcache.txt

a) read and unserialize blacklistcache.txt (I'm not sure if file_get_contents + unserialize is better than a file() EDIT: no it's not, see benchmark below);
b) perform the binary search with a function adapted/similar to this one

edit: renamed steps
edit2: added "no it's not, see benchmark below"

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

@plewin, you suggest doing all of that for each password change? I haven't benchmarked it, but that seems a bit extraneous. Perhaps the first five steps could be avoided since the program is already checking if the new password meets the password policy (otherwise no dictionary lookup is required).

I understand the suggestion to use a binary search; assuming that the file is sorted (which, given your first five, would be assured), this would most certainly be faster. However, are you sure even this will be faster than using an external grep?

Back to your first point: the OP did request blacklisting passwords; I wonder if it makes sense to be that literal, since defeating it is as simple as adding one character to the password (from password123 to password123! might work).

Thanks! I'll see if I can muster up some benchmarking between external grep and an in-memory binary search in PHP.

from self-service-password.

plewin avatar plewin commented on May 20, 2024

@r2evans Steps 1, 2, 3, 4, 5 are done only one time to compile the cache. It will require a script that can be launched manually or automatically if we detect that the cache is older than the dictionnary. Steps a) b) must be done each time there is a password change and only after the new password is valid according to the password policy.

With very basic tests on my side it looks like the binary search idea is not a such good idea after all. It's better only if the dictionnary is already in ram or the call of the file() to load the dictionnary in php is faster than spawning grep with exec.

Is it okay to add a dependency on a external command ? exec isn't portable.

IMHO it's okay if "password123!" passes the blacklist with "password123". An attacker would use the dictionnary, if "password123!" isn't in a dictionnary of the N most common passwords it should be ok to use it.

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

I understand your description of when things are done, and it's certainly feasible to "invalidate the cache" and regenerate it as needed. I hope it isn't a time-intensive thing, though this can be mitigated several ways. I'm not worried about this part of it.

Your question of external dependencies is appropriate, but I think having grep available is not too much to ask. Even on "alpine", {,e,f}grep is provided by busybox, and that supports all of the options I was considering (-q, -f, and -F). The only problem is on a windows server, in which case there needs to be a note that something like Git-for-Windows or GnuWin32 is needed to get this extension to work. (Note my emphasis: this is an extension to SSP, not a core function, though some may feel it is important enough to be considered "core".)

Your last comment, though correct, is based on the assumption that the attackers are using the same dictionary. This is great justification for something like cracklib, where the strength of the password is tested, including presence of dictionary words. This is a different tact than a password blacklist, and I think the two are very different mindsets (not to mention implementations). So for a "blacklist", you are right, it should be accepted or explicitly added to the blacklist. (Anybody volunteer to maintain php-crack, an extension that hasn't been touched in 11 years?) For this discussion, I'll lean towards "whole-line filtering", a literal blacklist instead of a quasi-pattern blacklist.

from self-service-password.

plewin avatar plewin commented on May 20, 2024

I made a little benchmark comparing several possibilities on my ubuntu linux, php 7, i5-2430M CPU @ 2.40GHz + ssd.

file + in_array is the fastest until around 100 000 passwords
At 1 000 000 passwords, grep is faster than file + in_array in a factor of ~4
At ~3 700 000 grep is faster in a factor of ~ 10 ?

Traversing the file of 3 700 000 passwords took grep 37 ms (with no options), file + in_array 381 ms, stream_get_line 773 ms.
Problem with in_array : its consumes at lot of memory, for the last file php peak memory usage was 276 Mio.
stream_get_line is the best alternative to grep and file+in_array but there is a drawback : line endings must be fixed.

Results
Code

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

Not sure you mean, "line endings must be fixed"?

Nice benchmarking, thanks for that. The response times (for other than grep) are better than I expected, I'm assuming that difference is due to a fork overhead (?). I agree that, though not likely to trigger an OOM error, 276 Mio is a bit onerous.

from self-service-password.

coudot avatar coudot commented on May 20, 2024

Thanks a lot for this work. Seems we can choose to use grep with escapeshell functions in order to have some security when invoking the command. And this feature will be disabled by default.

We will then add documentation to show how download a password list and use it in SSP.

from self-service-password.

plewin avatar plewin commented on May 20, 2024

@r2evans Sorry for the confusing words. stream_get_line requires that all line endings are the same and coherent in the file. You must be sure of lines endings (\n or \r\n) and give it as an argument of stream_get_line overwise trimming will be required and there will be a performance penalty like in the fgets vs fgets_rtrim benchmark.

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

@plewin, thanks, that's what I suspected.

@coudot: absolutely, disabled by default. Any feedback on a testing image/environment?

from self-service-password.

coudot avatar coudot commented on May 20, 2024

@r2evans there is no docker image but it should be easy to build one. Anyway you can use any virtualization tool and install SSP with packages. You need a simple LDAP directory with a user account to test the password change.

from self-service-password.

plewin avatar plewin commented on May 20, 2024

@r2evans if you need a ldap directory for development you can use the embedded ldap server in Apache Directory Studio like in this tutorial http://www.stefan-seelmann.de/blog/setting-up-an-ldap-server-for-your-development-environment

from self-service-password.

coudot avatar coudot commented on May 20, 2024

Someone told me about passwqc, and here is an PHP implementation: https://github.com/helver/PHP_passwdqc_check/blob/master/PHP_passwdqc_check/PasswordStrengthTest.inc

They provide a 4k words list and use PHP array methods to check it. Not sure it's the best way but it is interesting to read their code.

from self-service-password.

plewin avatar plewin commented on May 20, 2024

I like this repository of passwords, it is from OWASP

https://github.com/danielmiessler/SecLists/tree/master/Passwords

Note for my 2.0 proposal:
I have implemented the old password strength checker + a dictionary checker with grep + Zxcvbn.
Multiple can be used in the same time and one can add a class to implement a new one.
https://github.com/plewin/self-service-password/tree/575c884d8bcd0022ed0f304da2307c509ea03c7a/src/PasswordStrengthChecker

from self-service-password.

r2evans avatar r2evans commented on May 20, 2024

I second @SyBernot's mention of Have I Been Pwned, both as an API and as an optional local-copy. Though I don't think I would download it (at least not initially), I can definitely see utility in supporting both mechanisms. NextCloud just adopted its "range" API function, in which the caller sends only the first 5 chars of the SHA1 hash, and the API sends back all matching hashes and their respective counts. I'm not entirely certain how much security it adds, but the frequency of use and bandwidth overhead don't seem to be too high.

from self-service-password.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.