I have a PHP array, for example:
$arr = array("hello", "try", "hel", "hey hello");
Now I want to do rearrange of the array which w
While @yceruto's answer is correct and informative, I would like to offer additional insights and demonstrate more modern implementation syntax.
Syntax used below: the spaceship operator (<=>) requires PHP7+, and arrow functions (fn) require PHP7.4+.
First, about the scores generated by the respective functions...
levenshtein() and similar_text() ARE case-sensitive, so an uppercase H is just as much a mismatch as the number 6 when compared to h.
levenshtein() and similar_text() ARE NOT multi-byte aware, so an accented character like ê will not only be deemed a mismatch for e, it will potentially receive a heavier penalty based on each individual byte being a mismatch.
If you want to make case-insensitive comparisons, you can simply convert both strings to uppercase/lowercase before executing.
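As a minimal sketch of the case-insensitive approach, both strings can be lowercased before they are scored. The helper name levenshteinInsensitive() is hypothetical (it is not a PHP built-in), and strtolower() works on bytes, so only ASCII letters are normalized reliably.

```php
<?php
// Hypothetical helper (not part of PHP): case-insensitive Levenshtein distance.
function levenshteinInsensitive(string $a, string $b): int
{
    // Lowercase both inputs so that "H" and "h" no longer count as a mismatch.
    // strtolower() is byte-based, so accented letters are not normalized here.
    return levenshtein(strtolower($a), strtolower($b));
}

var_dump(levenshtein('hello', 'Hallo'));            // int(2)
var_dump(levenshteinInsensitive('hello', 'Hallo')); // int(1)
```

The same wrapping idea applies to similar_text().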
If your application requires multi-byte support, you should search for existing repositories that provide this functionality.
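If pulling in a library is not an option, a character-aware distance can be sketched by hand. The function below is an illustration only, not an existing PHP function: mbLevenshtein() is a hypothetical name, it assumes valid UTF-8 input, and it compares code points (so combining accents still count as separate characters).

```php
<?php
// Hypothetical helper: Levenshtein distance counted in UTF-8 code points, not bytes.
function mbLevenshtein(string $a, string $b): int
{
    // Split into code points; with invalid UTF-8 preg_split() returns false,
    // which this sketch simply treats as an empty string.
    $s = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $t = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $n = count($t);

    // $prev[$j] = distance between the processed prefix of $s and the first $j characters of $t.
    $prev = range(0, $n);
    foreach ($s as $i => $charS) {
        $curr = [$i + 1];
        for ($j = 1; $j <= $n; $j++) {
            $cost = ($charS === $t[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $prev[$j] + 1,         // deletion
                $curr[$j - 1] + 1,     // insertion
                $prev[$j - 1] + $cost  // substitution
            );
        }
        $prev = $curr;
    }

    return $prev[$n];
}

var_dump(levenshtein('hello', 'hêllo'));   // int(2): ê is two bytes
var_dump(mbLevenshtein('hello', 'hêllo')); // int(1): ê is one character
```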
Additional techniques for those willing to research more deeply include metaphone() and soundex(), but I will not delve into these topics in this answer.
Scores:
Test vs "hello" | levenshtein | similar_text | similar_text's percent
----------------+-------------+--------------+-----------------------
H3||0           | 5           | 0            | 0
Hallo           | 2           | 3            | 60
aloha           | 5           | 2            | 40
h               | 4           | 1            | 33.333333333333
hallo           | 1           | 4            | 80
hallå           | 3           | 3            | 54.545454545455
hel             | 2           | 3            | 75
helicopter      | 6           | 4            | 53.333333333333
hellacious      | 5           | 5            | 66.666666666667
hello           | 0           | 5            | 100
hello y'all     | 6           | 5            | 62.5
hello yall      | 5           | 5            | 66.666666666667
helów           | 3           | 3            | 54.545454545455
hey hello       | 4           | 5            | 71.428571428571
hola            | 3           | 2            | 44.444444444444
hêllo           | 2           | 4            | 72.727272727273
mellow yellow   | 9           | 4            | 44.444444444444
try             | 5           | 0            | 0
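For reference, scores like those in the table can be produced with a short loop. This is only a sketch: the variable names and the four-string sample are my own choices, and the column widths are arbitrary.

```php
<?php
$needle = 'hello';
$testStrings = ['hallo', 'hel', 'hey hello', 'try']; // small sample for illustration

$scores = [];
foreach ($testStrings as $test) {
    // similar_text() returns the count of matching characters and writes
    // the percent value into $percent, which is passed by reference.
    $similar = similar_text($needle, $test, $percent);
    $scores[$test] = [
        'levenshtein'  => levenshtein($needle, $test),
        'similar_text' => $similar,
        'percent'      => $percent,
    ];
}

// Print one row per test string.
foreach ($scores as $test => $row) {
    printf(
        "%-15s | %-11d | %-12d | %s\n",
        $test,
        $row['levenshtein'],
        $row['similar_text'],
        $row['percent']
    );
}
```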
Sort by levenshtein() PHP7+ (Demo)
usort($testStrings, function($a, $b) use ($needle) {
    return levenshtein($needle, $a) <=> levenshtein($needle, $b);
});
Sort by levenshtein() PHP7.4+ (Demo)
usort($testStrings, fn($a, $b) => levenshtein($needle, $a) <=> levenshtein($needle, $b));
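The snippets above assume that $needle and $testStrings already exist. A complete run against the question's sample array might look like this, with "hello" assumed as the search term:

```php
<?php
$needle = 'hello';                                   // assumed search term
$testStrings = ['hello', 'try', 'hel', 'hey hello']; // the array from the question

// Ascending order: the smallest distance (closest match) comes first.
usort($testStrings, fn($a, $b) => levenshtein($needle, $a) <=> levenshtein($needle, $b));

print_r($testStrings); // order: hello, hel, hey hello, try
```

The four distances here are 0, 2, 4 and 5, so there are no ties and the order is fully determined.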
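For the similar_text() sorts that follow, notice that $a and $b have changed sides of the <=> evaluation for DESC ordering, because a higher score means a closer match.
Notice also that hello is not assured to be positioned as the first element, because several strings share the same top score of 5.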
Sort by similar_text() PHP7+ (Demo)
usort($testStrings, function($a, $b) use ($needle) {
    return similar_text($needle, $b) <=> similar_text($needle, $a);
});
Sort by similar_text() PHP7.4+ (Demo)
usort($testStrings, fn($a, $b) => similar_text($needle, $b) <=> similar_text($needle, $a));
Notice the difference in scoring of hallå and helicopter: similar_text()'s return value ranks helicopter higher (4 vs 3), while similar_text()'s percent value ranks hallå higher (54.5 vs 53.3).
Sort by similar_text()'s percent PHP7+ (Demo)
usort($testStrings, function($a, $b) use ($needle) {
    similar_text($needle, $a, $percentA);
    similar_text($needle, $b, $percentB);
    return $percentB <=> $percentA;
});
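To make the hallå versus helicopter difference concrete, the two comparators can be run side by side on just those two strings. This is a small sketch and the variable names $byCount and $byPercent are mine:

```php
<?php
$needle = 'hello';
$testStrings = ['hallå', 'helicopter'];

// Sort DESC by similar_text()'s return value (count of matching characters).
$byCount = $testStrings;
usort($byCount, fn($a, $b) => similar_text($needle, $b) <=> similar_text($needle, $a));

// Sort DESC by similar_text()'s percent value (third, by-reference argument).
$byPercent = $testStrings;
usort($byPercent, function ($a, $b) use ($needle) {
    similar_text($needle, $a, $percentA);
    similar_text($needle, $b, $percentB);
    return $percentB <=> $percentA;
});

print_r($byCount);   // helicopter first (4 matching characters vs 3)
print_r($byPercent); // hallå first (about 54.5% vs 53.3%)
```

The percent value takes both string lengths into account, which is why the longer string drops behind despite the higher raw count.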
Sort by similar_text()'s percent PHP7.4+ (Demo)
usort($testStrings, fn($a, $b) =>
    [is_int(similar_text($needle, $b, $percentB)), $percentB]
    <=>
    [is_int(similar_text($needle, $a, $percentA)), $percentA]
);
Notice that I am neutralizing the unwanted return value of similar_text() by converting it to true with is_int(), then comparing the generated percent value. This allows the percent value to be generated without returning too soon, since arrow function syntax permits only a single expression.
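The trick can be inspected in isolation. In the sketch below, the array's first element is evaluated first, which runs similar_text() and fills $percent by reference; the second element then reads the freshly assigned value. The strings are an arbitrary example.

```php
<?php
// Element 1: is_int() turns the integer return value into true.
// Element 2: $percent was populated by the call inside element 1.
$pair = [is_int(similar_text('hello', 'hel', $percent)), $percent];

var_dump($pair); // [true, 75.0]
```

When two such arrays are compared with <=>, the first elements are always true and therefore tie, so the percent values decide the result.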
Sort by levenshtein() then break ties with similar_text() PHP7+ (Demo)
usort($testStrings, function($a, $b) use ($needle) {
    return [levenshtein($needle, $a), similar_text($needle, $b)]
           <=>
           [levenshtein($needle, $b), similar_text($needle, $a)];
});
Sort by levenshtein() then break ties with similar_text() PHP7.4+ (Demo)
usort($testStrings, fn($a, $b) =>
    [levenshtein($needle, $a), similar_text($needle, $b)]
    <=>
    [levenshtein($needle, $b), similar_text($needle, $a)]
);
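Here is a small run of the tie-breaking sort, using four strings from the scores table that I picked because three of them share a levenshtein() score of 5:

```php
<?php
$needle = 'hello';
$testStrings = ['try', 'aloha', 'hellacious', 'hallo'];

// Primary rule: levenshtein() ASC ($a on the left).
// Tie-breaker: similar_text() DESC ($b on the left).
usort($testStrings, fn($a, $b) =>
    [levenshtein($needle, $a), similar_text($needle, $b)]
    <=>
    [levenshtein($needle, $b), similar_text($needle, $a)]
);

print_r($testStrings); // order: hallo, hellacious, aloha, try
```

hallo wins outright with a distance of 1. The other three tie at 5 and are separated by their similar_text() scores of 5, 2 and 0.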
Personally, I never use anything but levenshtein() in my projects because it consistently delivers the results that I'm looking for.