How to improve PHP string match with similar_text()?

妖精的绣舞 submitted on 2019-12-04 18:07:48

Levenshtein distance: http://php.net/manual/en/function.levenshtein.php

It works the opposite way from similar_text(): levenshtein() returns the number of edits needed to turn one string into the other, so 0 means there is no difference.

// <!-- Overcast, Rain or Showers compared Overcast, Rain or Showers is 0 -->
// <!-- Overcast, Risk of Rain or Showers compared Overcast, Rain or Showers is 11 -->
// <!-- Overcast, Chance of Rain or Showers compared Overcast, Rain or Showers is 13 -->
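
To see how the two functions point in opposite directions, here is a minimal sketch. The two sample words are made up for illustration and are not from the original question:

```php
<?php
// Sample words are illustrative only.
$a = 'World';
$b = 'Word';

// similar_text() returns the number of matching characters and, through the
// optional third argument, a similarity percentage (higher = more alike).
$common = similar_text( $a, $b, $percent );

// levenshtein() returns the number of single-character edits (lower = more alike).
$distance = levenshtein( $a, $b );

printf( "similar_text: %d chars, %.1f%%; levenshtein: %d\n", $common, $percent, $distance );
// prints "similar_text: 4 chars, 88.9%; levenshtein: 1"
```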

The Levenshtein distance is a good way to compare strings. It's typically faster than similar_text(), and levenshtein() lets you control its output by passing separate costs for insertions, replacements, and deletions.
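
A rough sketch of that weighting, assuming you want extra words in the candidate string to count less than changed or missing ones. The sample strings and the cost values (1, 3, 3) are arbitrary choices for illustration, not tuned recommendations:

```php
<?php
// Hypothetical sample strings; the costs below are illustrative, not tuned values.
$reference = 'Rain';
$candidate = 'Light Rain';

// Default costs: every insertion, replacement, and deletion counts as 1.
$plain = levenshtein( $reference, $candidate );

// Custom costs, in the order insertion, replacement, deletion. Turning
// $reference into $candidate only needs insertions, so it stays cheap.
$addedWords = levenshtein( $reference, $candidate, 1, 3, 3 );

// Going the other way needs six deletions, each of which now costs 3.
$removedWords = levenshtein( $candidate, $reference, 1, 3, 3 );

echo "$plain $addedWords $removedWords\n"; // prints "6 6 18"
```

Note that with unequal costs the result depends on argument order, so decide which string is the reference and keep it in the same position.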

To turn the Levenshtein distance into a usable "match" fraction (multiply by 100 for a percentage), you can express it as a fraction of the average length of the source strings:

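// Assume $src1 and $src2 are your source strings and at least one is non-empty
// (otherwise $avgLength is 0 and the division fails)

$avgLength = ( strlen( $src1 ) + strlen( $src2 ) ) / 2;

// The distance can exceed the average length when the string lengths differ a lot,
// so clamp the result at 0
$matchFraction = max( 0, 1 - ( levenshtein( $src1, $src2 ) / $avgLength ) );

// $matchFraction is now between 0 and 1, with 1 being equal strings and 0 being totally different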
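
If you need to pick the closest string from a list, the same calculation can be wrapped in a function. This is only a sketch: the function names matchFraction() and bestMatch(), the sample strings, and the choice to clamp negative values to 0 are assumptions made for this example.

```php
<?php
// Hypothetical helper: match fraction based on the average string length.
function matchFraction( string $a, string $b ): float
{
    $avgLength = ( strlen( $a ) + strlen( $b ) ) / 2;
    if ( $avgLength == 0 ) {
        return 1.0; // two empty strings are treated as identical
    }
    // Clamp at 0: the distance can be larger than the average length.
    return max( 0.0, 1 - levenshtein( $a, $b ) / $avgLength );
}

// Hypothetical helper: returns the candidate with the highest match fraction,
// or null when the candidate list is empty.
function bestMatch( string $input, array $candidates ): ?string
{
    $best = null;
    $bestScore = -1.0;
    foreach ( $candidates as $candidate ) {
        $score = matchFraction( $input, $candidate );
        if ( $score > $bestScore ) {
            $bestScore = $score;
            $best = $candidate;
        }
    }
    return $best;
}

echo bestMatch( 'Overcst', [ 'Sunny', 'Overcast', 'Showers' ] ), "\n"; // prints "Overcast"
```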