It need not be meaningful words - more like random password generation, but the catch is - they should be unique. I will be using this for some kind of package / product cod
This is my favorite way to do it:
$pretrimmedrandom = md5(uniqid(mt_rand(),true));
$trimmed = substr($pretrimmedrandom ,0,7);
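uniqid() builds its value from the current time in microseconds, and the md5() hash of it is then cut down to 7 characters. Results look like "3f456ac" (hexadecimal digits only). Because the hash is truncated, uniqueness is very likely but not guaranteed.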
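As a variation on the same idea, here is a minimal sketch that draws the 7 hex characters from random_bytes() instead of md5(uniqid()). This is a swapped-in technique, not what the answer above does; it assumes PHP 7 or later, and the function name random_hex_code() is made up for illustration. With 7 hex characters there are only 16^7 = 268,435,456 possible values, so a separate uniqueness check is still needed.

```php
<?php
// Hypothetical helper: returns $length random hexadecimal characters.
function random_hex_code(int $length = 7): string
{
    // Each random byte becomes two hex characters, so round up and trim.
    $bytes = random_bytes(intdiv($length + 1, 2));
    return substr(bin2hex($bytes), 0, $length);
}

$code = random_hex_code();
echo $code, PHP_EOL;
```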
Try this:
echo $unique_key = substr(md5(rand(0, 1000000)), 0, 5);
It will give a string of length 5 (hexadecimal digits only, with no guarantee of uniqueness).
It is generally not possible to generate sequences with both unique and random elements: obviously to be unique the algorithm has to take into account the previously generated elements in the sequence, so the next ones will not really be random.
Therefore your best bet would be to detect collisions and just retry (which could be very expensive in your particular case).
If you are constrained to just 7 chars, there's not much you can do beyond this:
$allowed_chars = 'abcdefghijklmnopqrstuvwxyz';
$allowed_count = strlen($allowed_chars);
$password = null;
$password_length = 7;
// Keep generating until the candidate is not already taken.
while ($password === null || already_exists($password)) {
    $password = '';
    for ($i = 0; $i < $password_length; ++$i) {
        // Square brackets: the {} string offset syntax was removed in PHP 8.
        $password .= $allowed_chars[mt_rand(0, $allowed_count - 1)];
    }
}
This should eventually give you a new password.
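The answer leaves already_exists() undefined. As a rough, self-contained sketch of the same retry loop, the version below keeps the issued codes in an in-memory array; the function name generate_code() and the $issued array are assumptions made for illustration, and random_int() (PHP 7+) stands in for mt_rand(). In a real system the lookup would more likely be a database query.

```php
<?php
// Hypothetical helper: draws random lowercase codes until one is unused,
// then records it in $issued (code => true) and returns it.
function generate_code(array &$issued, int $length = 7): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz';
    $last = strlen($alphabet) - 1;
    do {
        $code = '';
        for ($i = 0; $i < $length; ++$i) {
            $code .= $alphabet[random_int(0, $last)];
        }
    } while (isset($issued[$code])); // collision: try again
    $issued[$code] = true;
    return $code;
}

$issued = [];
for ($n = 0; $n < 1000; ++$n) {
    generate_code($issued);
}
echo count($issued), " unique codes issued", PHP_EOL;
```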
However, in similar cases I have encountered I usually pick a larger password size which also happens to be the size of the hex representation of a popular hash function (e.g. md5). Then you can make it easier on yourself and less error prone:
$password = time(); // even better if you have some other "random" input to use here
do {
$password = md5(time().$password);
}
while (already_exists($password));
This also has the added advantage that the sequence space is larger, hence there will be fewer collisions. You can pick the size of the hash function according to the expected number of passwords you will generate in the future to "guarantee" a low collision probability and thus fewer calls to the possibly expensive already_exists function.
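To put rough numbers on "low collision probability", the sketch below uses the standard birthday approximation p ≈ 1 - exp(-n(n-1) / 2N) for n codes drawn uniformly from a space of N values. The function name collision_probability() is made up for illustration, and the approximation assumes the codes are uniformly distributed, which is only roughly true for the generators discussed here.

```php
<?php
// Approximate probability that at least two of $codes uniformly random
// values collide in a space of $space possible values (birthday bound).
function collision_probability(int $codes, float $space): float
{
    return 1.0 - exp(-($codes * ($codes - 1)) / (2.0 * $space));
}

$n = 10000; // expected number of codes, an arbitrary example figure
$hex7   = collision_probability($n, 16 ** 7);    // 7 hex chars
$base36 = collision_probability($n, 36 ** 7);    // 7 base-36 chars
$md5    = collision_probability($n, 2.0 ** 128); // full md5 hash

printf("7 hex chars:     %.6f\n", $hex7);
printf("7 base-36 chars: %.6f\n", $base36);
printf("full md5:        %.6f\n", $md5);
```

With 10,000 codes, a 7-character hex code collides far more often than a 7-character base-36 code, and a full md5 hash makes collisions negligible.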
A random alphanumeric (base 36 = 0..9 + a..z) value that has 7 chars has to have a base 10 representation between 2176782336 and 78364164095, the following snippet proves it:
var_dump(base_convert('1000000', 36, 10)); // 2176782336
var_dump(base_convert('zzzzzzz', 36, 10)); // 78364164095
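The same bounds can be derived from powers of 36, which also gives the total number of 7 char codes that do not start with 0. This is a small sketch that assumes a 64-bit build of PHP 5.6 or later (for the ** operator and for integers this large); the variable names are made up for illustration.

```php
<?php
$min   = 36 ** 6;         // smallest 7-char value: '1000000' in base 36
$max   = 36 ** 7 - 1;     // largest 7-char value:  'zzzzzzz' in base 36
$count = $max - $min + 1; // how many distinct 7-char codes exist

var_dump($min);   // int(2176782336)
var_dump($max);   // int(78364164095)
var_dump($count); // int(76187381760)
```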
In order for it to be unique we have to rely on a non-repeating factor, the obvious choice is time():
var_dump(time()); // 1273508728
var_dump(microtime(true)); // 1273508728.2883
If we only wanted to ensure a minimum uniqueness factor of 1 unique code per second we could do:
var_dump(base_convert(time() * 2, 10, 36)); // 164ff8w
var_dump(base_convert(time() * 2 + 1, 10, 36)); // 164ff8x
var_dump(base_convert(time() * 2 + 2, 10, 36)); // 164ff8y
var_dump(base_convert(time() * 2 + 3, 10, 36)); // 164ff8z
You'll notice that these codes aren't random, you'll also notice that time() (1273508728) is less than 2176782336 (the minimum base 10 representation of a 7 char code), that's why I do time() * 2.
Now let's do some date math in order to add randomness and increase the uniqueness factor while complying with the integer limitations of older versions of PHP (< 5.0?):
var_dump(1 * 60 * 60); // 3600
var_dump(1 * 60 * 60 * 24); // 86400
var_dump(1 * 60 * 60 * 24 * 366); // 31622400
var_dump(1 * 60 * 60 * 24 * 366 * 10); // 316224000
var_dump(1 * 60 * 60 * 24 * 366 * 20); // 632448000
var_dump(1 * 60 * 60 * 24 * 366 * 30); // 948672000
var_dump(1 * 60 * 60 * 24 * 366 * 31); // 980294400
var_dump(PHP_INT_MAX); // 2147483647
Regarding PHP_INT_MAX I'm not sure what exactly changed in recent versions of PHP because the following clearly works in PHP 5.3.1, maybe someone could shed some light into this:
var_dump(base_convert(PHP_INT_MAX, 10, 36)); // zik0zj
var_dump(base_convert(PHP_INT_MAX + 1, 10, 36)); // zik0zk
var_dump(base_convert(PHP_INT_MAX + 2, 10, 36)); // zik0zl
var_dump(base_convert(PHP_INT_MAX * 2, 10, 36)); // 1z141z2
var_dump(base_convert(PHP_INT_MAX * 2 + 1, 10, 36)); // 1z141z3
var_dump(base_convert(PHP_INT_MAX * 2 + 2, 10, 36)); // 1z141z4
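One likely explanation, offered as a reading of PHP's documented behavior rather than something confirmed in this thread: an integer result that exceeds PHP_INT_MAX is promoted to a float instead of wrapping around, and base_convert() takes its number as a string, so on a 32-bit build the float is simply converted to its decimal digits first. The sketch below shows the promotion and passes the value of 2147483647 * 2 as a string so the result does not depend on the platform's integer size.

```php
<?php
$sum = PHP_INT_MAX + 1;

var_dump(is_int(PHP_INT_MAX)); // bool(true)
var_dump(is_float($sum));      // bool(true): the overflow produced a float

// 4294967294 is 2147483647 * 2, the 32-bit PHP_INT_MAX * 2 from above.
$code = base_convert('4294967294', 10, 36);
var_dump($code); // string(7) "1z141z2"
```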
I got kinda lost with my rationalization here and I'm bored so I'll just finish really quick. We can use pretty much the whole base 36 charset and safely generate sequential codes with a minimum guaranteed uniqueness factor of 1 unique code per second for 3.16887646 years using this:
base_convert(mt_rand(22, 782) . substr(time(), 2), 10, 36);
I just realized that the above can sometimes return duplicated values due to the first argument of mt_rand(), in order to produce unique results we need to limit our base 36 charset a little bit:
base_convert(mt_rand(122, 782) . substr(time(), 2), 10, 36);
Remember that the above values are still sequential, in order to make them look random we can use microtime() but we can only ensure a uniqueness factor of 10 codes per second for 3.8 months:
base_convert(mt_rand(122, 782) . substr(number_format(microtime(true), 1, '', ''), 3), 10, 36);
This proved to be more difficult than I originally anticipated since there are a lot of constraints:
If we can ignore any of the above it would be a lot easier and I'm sure this can be further optimized but like I said: this is boring me. Maybe someone would like to pick this up where I left off. =) I'm hungry! =S
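Here is something that looks random and should be unique for the times to come. Note that on a 64-bit build of PHP the number is far larger than 78364164095, so the code comes out at 9 chars rather than 7: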
echo base_convert(intval(microtime(true) * 10000), 10, 36);
Or for a little more randomness and less uniqueness (between 1000 and 10000 per second):
echo base_convert(mt_rand(1, 9) . intval(microtime(true) * 1000), 10, 36);
Or (uniqueness between 100 and 10000 per second) - this is probably the best option:
echo base_convert(mt_rand(10, 99) . intval(microtime(true) * 100), 10, 36);
Or (uniqueness between 10 and 10000 per second):
echo base_convert(mt_rand(100, 999) . intval(microtime(true) * 10), 10, 36);
You get the idea.
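On a 64-bit build the numbers above are all larger than 78364164095, so base_convert() returns 9 characters for them. If exactly 7 characters are required, one option is to fold a time-based counter into the 7 char range. This is only a sketch under stated assumptions: 64-bit PHP 7 or later, at most one code per tenth of a second, and a made-up function name seven_char_code(). The codes it produces are sequential rather than random-looking, and they stay distinct until the counter wraps after 76187381760 tenths of a second (roughly 240 years after the Unix epoch).

```php
<?php
// Hypothetical helper: maps a Unix timestamp (with fractions) to a
// 7-character base-36 code, distinct for each tenth of a second.
function seven_char_code(float $now): string
{
    $min  = 36 ** 6;        // smallest value that needs 7 base-36 digits
    $span = 36 ** 7 - $min; // number of distinct 7-digit values
    $ticks = (int) floor($now * 10); // tenths of a second since the epoch
    return base_convert((string) ($min + $ticks % $span), 10, 36);
}

echo seven_char_code(microtime(true)), PHP_EOL;
```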
+1 to @Michael Haren's comment: the passwords on your site should not be constrained to be unique.
If I try to use a given password and I get an error that I can't use it because it's already in use, then I know some user on the system has that password. If there are 1000 users I only need to try a max of 1000 other accounts before I find the one who has that password.
Not really answering your question, but more than a comment. So I'm marking this CW.