Simulating LIKE in PHP

Posted by 一笑奈何 on 2019-12-19 03:07:18

Question


Is there a way to simulate SQL's LIKE operator in PHP with the same syntax (% and _ wildcards and a generic $escape escape character), so that given:

$value LIKE $string ESCAPE $escape

you can have a function that returns PHP's evaluation of that without using the database? (Assume that the $value, $string and $escape values are already set.)


Answer 1:


OK, after much fun and games here's what I have come up with:

function preg_sql_like ($input, $pattern, $escape = '\\') {

    // Split the pattern into special sequences and the rest
    $expr = '/((?:'.preg_quote($escape, '/').')?(?:'.preg_quote($escape, '/').'|%|_))/';
    $parts = preg_split($expr, $pattern, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // Loop the split parts and convert/escape as necessary to build regex
    $expr = '/^';
    $lastWasPercent = FALSE;
    foreach ($parts as $part) {
        switch ($part) {
            case $escape.$escape:
                $expr .= preg_quote($escape, '/');
                break;
            case $escape.'%':
                $expr .= '%';
                break;
            case $escape.'_':
                $expr .= '_';
                break;
            case '%':
                if (!$lastWasPercent) {
                    $expr .= '.*?';
                }
                break;
            case '_':
                $expr .= '.';
                break;
            default:
                $expr .= preg_quote($part, '/');
                break;
        }
        $lastWasPercent = $part == '%';
    }
    $expr .= '$/i';

    // Look for a match and return bool
    return (bool) preg_match($expr, $input);

}

I can't break it; maybe you can find something that will. The main way in which mine differs from @nickb's is that mine "parses" (more or less) the LIKE pattern into tokens and generates a regex from them, rather than converting the pattern to a regex in situ.
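To make the tokenizing step concrete, here is a small sketch of what the preg_split call produces. The escape character '!' and the sample pattern are assumptions picked for illustration; they are not from the answer itself.

```php
<?php
// Hypothetical escape character and LIKE pattern, chosen for illustration only
$escape  = '!';
$pattern = '50!%_off%';

// Same splitting expression as in preg_sql_like()
$quoted = preg_quote($escape, '/');
$expr   = '/((?:' . $quoted . ')?(?:' . $quoted . '|%|_))/';
$parts  = preg_split($expr, $pattern, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

// Each token is a literal run, a bare wildcard, or an escaped wildcard
print_r($parts); // tokens: '50', '!%', '_', 'off', '%'
```

The switch in the function then only has to decide, token by token, whether to emit a regex wildcard or a quoted literal.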

The three arguments to the function should be fairly self-explanatory. An earlier version took a fourth argument for passing PCRE modifiers to the final regex, mainly so that you could pass i to make the match case-insensitive; it was removed per the comments.

The function simply returns a boolean indicating whether or not the $input text matched the $pattern.

Here's a codepad of it

EDIT Oops, was broken, now fixed. New codepad

EDIT Removed fourth argument and made all matches case-insensitive per comments below

EDIT A couple of small fixes/improvements:

  • Added start/end of string assertions to generated regex
  • Added tracking of last token to avoid multiple .*? sequences in generated regex
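The first of those fixes matters because LIKE has to match the whole value, whereas preg_match is satisfied by a match anywhere in the string. A minimal sketch for the LIKE pattern b% (the sample value and the hand-written regexes are assumptions for illustration):

```php
<?php
// Hypothetical value being tested against the LIKE pattern 'b%'
$input = 'abc';

// Without anchors the regex finds 'b' in the middle of the string
$unanchored = (bool) preg_match('/b.*?/i', $input);

// With ^ and $ the whole string has to start with 'b', as LIKE requires
$anchored = (bool) preg_match('/^b.*?$/i', $input);

var_dump($unanchored); // bool(true)
var_dump($anchored);   // bool(false)
```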



Answer 2:


This is basically how you would implement something like this:

$input = '%ST!_ING_!%';
$value = 'ANYCHARS HERE TEST_INGS%';

// Mapping of wildcards to their PCRE equivalents
$wildcards = array( '%' => '.*?', '_' => '.');

// Escape character for preventing wildcard functionality on a wildcard
$escape = '!';

// Shouldn't have to modify much below this

$delimiter = '/'; // regex delimiter

// Quote the escape characters and the wildcard characters
$quoted_escape = preg_quote( $escape);
$quoted_wildcards = array_map( function( $el) { return preg_quote( $el); }, array_keys( $wildcards));

// Form the dynamic regex for the wildcards by replacing the "fake" wildcards with PCRE ones
$temp_regex = '((?:' . $quoted_escape . ')?)(' . implode( '|', $quoted_wildcards) . ')';

// Escape the regex delimiter if it's present within the regex
$wildcard_replacement_regex = $delimiter . str_replace( $delimiter, '\\' . $delimiter, $temp_regex) . $delimiter;

// Do the actual replacement
$regex = preg_replace_callback(
    $wildcard_replacement_regex,
    function( $matches) use( $wildcards) {
        // Escaped wildcard: keep it literal; otherwise swap in the PCRE equivalent
        return !empty( $matches[1]) ? preg_quote( $matches[2]) : $wildcards[$matches[2]];
    },
    preg_quote( $input)
);

// Finally, test the regex against $value, escaping the delimiter if it's present.
// The ^ and $ anchors require the whole of $value to match, as LIKE does.
preg_match( $delimiter . '^' . str_replace( $delimiter, '\\' . $delimiter, $regex) . '$' . $delimiter .'i', $value, $matches);

// Output is in $matches[0] if there was a match (NULL otherwise)
var_dump( isset( $matches[0]) ? $matches[0] : null);

This forms a dynamic regex based on $wildcards and $escape in order to replace all "fake" wildcards with their PCRE equivalents, unless the "fake" wildcard character is prefixed with the escape character, in which case, no replacement is made. In order to do this replacement, the $wildcard_replacement_regex is created.

The $wildcard_replacement_regex looks something like this once everything's all said and done:

/((?:\!)?)(%|_)/

So it uses two capturing groups to (optionally) grab the escape character and one of the wildcards. This enables us to test to see if it grabbed the escape character in the callback. If it was able to get the escape character before the wildcard, $matches[1] will contain the escape character. If not, $matches[1] will be empty. This is how I determine whether to replace the wildcard with its PCRE equivalent, or leave it alone by just preg_quote()-ing it.
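To see what the callback receives, here is a small sketch that runs the same expression over a sample pattern with preg_match_all. The sample string 'a%b!_c' is an assumption for illustration.

```php
<?php
// The wildcard-matching expression shown above, with '!' as the escape character
$regex = '/((?:\!)?)(%|_)/';

// Hypothetical LIKE pattern: one plain wildcard and one escaped wildcard
preg_match_all($regex, 'a%b!_c', $found, PREG_SET_ORDER);

foreach ($found as $m) {
    // $m[1] is the escape character (or empty), $m[2] is the wildcard
    printf("escape=[%s] wildcard=[%s]\n", $m[1], $m[2]);
}
// Prints:
// escape=[] wildcard=[%]
// escape=[!] wildcard=[_]
```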

You can play around with it at codepad.




Answer 3:


You can use a regular expression, for example with preg_match.




Answer 4:


The other examples were a bit too complex for my taste (and painful to my clean code eyes), so I reimplemented the functionality in this simple method:

public function like($needle, $haystack, $delimiter = '~')
{
    // Escape meta-characters from the string so that they don't gain special significance in the regex
    $needle = preg_quote($needle, $delimiter);

    // Replace SQL wildcards with regex wildcards
    $needle = str_replace('%', '.*?', $needle);
    $needle = str_replace('_', '.', $needle);

    // Add delimiters, beginning + end of line and modifiers
    $needle = $delimiter . '^' . $needle . '$' . $delimiter . 'isu';

    // Matches are not useful in this case; we just need to know whether or not the needle was found.
    return (bool) preg_match($needle, $haystack);
}

Modifiers:

  • i: Ignore casing.
  • s: Make dot metacharacter match anything, including newlines.
  • u: UTF-8 compatibility.
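A quick sketch of what the s and u modifiers change in practice. The sample strings are assumptions for illustration, and the regexes are written by hand in the shape the method would generate for the patterns a%b and _.

```php
<?php
// LIKE 'a%b' against a value containing a newline
$multiline = "a\nb";
$withoutS = (bool) preg_match('~^a.*?b$~i', $multiline);  // . stops at the newline
$withS    = (bool) preg_match('~^a.*?b$~is', $multiline); // . also matches the newline

// LIKE '_' against a single two-byte UTF-8 character (U+00E9, e with acute accent)
$accented = "\xC3\xA9";
$withoutU = (bool) preg_match('~^.$~is', $accented);  // . consumes one byte only
$withU    = (bool) preg_match('~^.$~isu', $accented); // . consumes one code point

var_dump($withoutS, $withS); // bool(false) bool(true)
var_dump($withoutU, $withU); // bool(false) bool(true)
```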


Source: https://stackoverflow.com/questions/11434305/simulating-like-in-php
