I'm writing PHP code to parse a string. It needs to be as fast as possible, so are regular expressions the way to go? I have a hunch that PHP string functions are more expe
I was searching for information about regex performance, as I need to do a lot of lookups, and the truth is that it depends on what you want to achieve. For my purposes I tested one type of search to compare performance.
Specification:
I need to find a simple string in an array of strings.
To test, I have $testArray, an array of ~11k multi-word phrases built from an article about Tolkien (e.g. the strings "history of the lord of the rings", "christopher tolkien").
As I want to find only phrases containing the exact word, I can't use a plain strpos() call: when searching for "ring", for example, it would also find phrases containing the word "ringtone".
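A quick illustration of that false positive. The sample phrases here are made up for the example and are not taken from the actual test data:

```php
<?php
// Hypothetical sample phrases, for illustration only.
$phrases = array('the lord of the rings', 'free ringtone download', 'a golden ring');

$substringHits = array();
foreach ($phrases as $phrase) {
    // strpos() matches any substring, so "ring" is also found
    // inside "rings" and "ringtone".
    if (strpos($phrase, 'ring') !== false) {
        $substringHits[] = $phrase;
    }
}

// All three phrases end up in $substringHits, although only the
// last one contains "ring" as a whole word.
print_r($substringHits);
```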
Code using PHP functions:
    $results = array();
    $searchWord = 'rings';
    foreach ($testArray as $phrase) {
        // Split the phrase into words and look for an exact word match.
        $phraseArr = explode(' ', $phrase);
        if (in_array($searchWord, $phraseArr)) {
            $results[] = $phrase;
        }
    }
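Code using regex function:
    // Match "rings" only when it is bounded by a space or by the start/end of the string.
    $pattern = '/( |^)rings( |$)/';
    // preg_grep() returns the matching entries and keeps their original array keys.
    $results = preg_grep($pattern, $testArray);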
I found that in this case the regex function was around 10 times faster, measured over 100 searches using various words.
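A comparison like this can be timed with a small harness. The sketch below is only an assumption about how such a measurement might look, not the code used for the test above; timeSearches(), $byExplode and $byRegex are hypothetical names, and the data is a tiny stand-in for the real ~11k phrases.

```php
<?php
// Hypothetical helper: runs $search once per word and returns the elapsed seconds.
function timeSearches(callable $search, array $phrases, array $words)
{
    $start = microtime(true);
    foreach ($words as $word) {
        $search($phrases, $word);
    }
    return microtime(true) - $start;
}

// Whole-word search via explode() + in_array().
$byExplode = function (array $phrases, $word) {
    $results = array();
    foreach ($phrases as $phrase) {
        if (in_array($word, explode(' ', $phrase))) {
            $results[] = $phrase;
        }
    }
    return $results;
};

// Whole-word search via preg_grep(); preg_quote() escapes any regex
// metacharacters that the search word might contain.
$byRegex = function (array $phrases, $word) {
    return preg_grep('/( |^)' . preg_quote($word, '/') . '( |$)/', $phrases);
};

// Stand-in data; replace with the real phrase array and word list.
$phrases = array('history of the lord of the rings', 'christopher tolkien', 'free ringtone');
$words = array('rings', 'tolkien', 'ring');

printf("explode/in_array: %.6f s\n", timeSearches($byExplode, $phrases, $words));
printf("preg_grep:        %.6f s\n", timeSearches($byRegex, $phrases, $words));
```

With so little data the timings are dominated by noise, so the ratio only becomes meaningful with a large array and many repeated searches.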
A search like this might be trivial, but for more complex tasks I assume it would be extremely hard or impossible to implement without regex, using native PHP functions alone.
In conclusion: for simple tasks you should use regex because it will probably be faster, and for complex tasks you probably have to use regex because it may be the only way to solve the problem.
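I just realized that this topic is about "PHP string functions" while my test code uses the explode() and in_array() functions, so I tried another approach. Since my delimiter is a space, the search method below also works and uses the strpos() function: padding both the phrase and the search word with spaces means the word only matches when it is surrounded by delimiters.
Code using strpos() function: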
    $results = array();
    $searchWord = 'rings';
    foreach ($testArray as $phrase) {
        if (strpos(' ' . $phrase . ' ', ' ' . $searchWord . ' ') !== false) {
            $results[] = $phrase;
        }
    }
But the results were still a lot worse than in the regex case.
So the performance summary is:
strpos() function
Still, regex is the big winner.
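Before comparing speed, it is worth confirming that the three approaches return the same phrases. A minimal check, assuming a small made-up sample array rather than the original data:

```php
<?php
// Hypothetical sample data, not the original ~11k-phrase array.
$testArray = array(
    'history of the lord of the rings',
    'christopher tolkien',
    'rings of power',
    'free ringtone',
    'three rings',
);
$searchWord = 'rings';

$viaExplode = array();
$viaStrpos = array();
foreach ($testArray as $phrase) {
    // Approach 1: split into words, then look for an exact match.
    if (in_array($searchWord, explode(' ', $phrase))) {
        $viaExplode[] = $phrase;
    }
    // Approach 2: pad with spaces so only space-delimited words match.
    if (strpos(' ' . $phrase . ' ', ' ' . $searchWord . ' ') !== false) {
        $viaStrpos[] = $phrase;
    }
}

// Approach 3: preg_grep() keeps the original keys, so reindex before comparing.
$pattern = '/( |^)' . preg_quote($searchWord, '/') . '( |$)/';
$viaRegex = array_values(preg_grep($pattern, $testArray));

// prints bool(true): all three approaches select the same three phrases.
var_dump($viaExplode === $viaStrpos && $viaStrpos === $viaRegex);
```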