Split string by delimiter, but not if it is escaped

旧巷少年郎 2020-11-27 10:58

How can I split a string by a delimiter, but not if it is escaped? For example, I have a string:

1|2\\|2|3\\\\|4\\\\\\|4

The delimiter is <

5 answers
  •  旧时难觅i
    2020-11-27 11:01

    For future readers, here is a universal solution. It is based on NikiC's idea with (*SKIP)(*FAIL):

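    function split_escaped($delimiter, $escaper, $text)
    {
        // Quote both strings so they are matched literally inside the pattern.
        $d = preg_quote($delimiter, "~");
        $e = preg_quote($escaper, "~");
        // Skip every escaped escaper and escaped delimiter, then split on
        // the delimiters that remain.
        $tokens = preg_split(
            '~' . $e . '(' . $e . '|' . $d . ')(*SKIP)(*FAIL)|' . $d . '~',
            $text
        );
        // Escape "\" and "$" so preg_replace() inserts the replacements literally.
        $escaperReplacement = str_replace(['\\', '$'], ['\\\\', '\\$'], $escaper);
        $delimiterReplacement = str_replace(['\\', '$'], ['\\\\', '\\$'], $delimiter);
        // Unescape each token: doubled escaper -> escaper, escaped delimiter -> delimiter.
        return preg_replace(
            ['~' . $e . $e . '~', '~' . $e . $d . '~'],
            [$escaperReplacement, $delimiterReplacement],
            $tokens
        );
    }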
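
    To see the (*SKIP)(*FAIL) step in isolation, here is a minimal sketch of the pattern that split_escaped() would build for the delimiter "," and the escaper "\". The sample input is made up for illustration, and the tokens are still escaped at this point, because unescaping only happens in the later preg_replace() call.

```php
<?php
// Pattern as split_escaped() would build it for delimiter "," and escaper "\"
// (preg_quote() turns the backslash into "\\" and leaves the comma as is).
$pattern = '~\\\\(\\\\|,)(*SKIP)(*FAIL)|,~';

// Illustrative input (not from the original answer): a\,b,c\\,d
$text = 'a\\,b,c\\\\,d';

// First alternative: an escaper followed by an escaper or a delimiter is
// consumed and then discarded via (*SKIP)(*FAIL), so it is never a split point.
// Second alternative: any delimiter that is left is a real split point.
$tokens = preg_split($pattern, $text);

print_r($tokens); // tokens: a\,b  |  c\\  |  d
```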
    

    Try it out:

    // the base situation:
    $text = "asdf\\,fds\\,ddf,\\\\,f\\,,dd";
    $delimiter = ",";
    $escaper = "\\";
    print_r(split_escaped($delimiter, $escaper, $text));
    
    // other signs:
    $text = "dk!%fj%slak!%df!!jlskj%%dfl%isr%!%%jlf";
    $delimiter = "%";
    $escaper = "!";
    print_r(split_escaped($delimiter, $escaper, $text));
    
    // delimiter with multiple characters:
    $text = "aksd()jflaksd())jflkas(('()j()fkl'()()as()d('')jf";
    $delimiter = "()";
    $escaper = "'";
    print_r(split_escaped($delimiter, $escaper, $text));
    
    // escaper is same as delimiter:
    $text = "asfl''asjf'lkas'''jfkl''d'jsl";
    $delimiter = "'";
    $escaper = "'";
    print_r(split_escaped($delimiter, $escaper, $text));
    

    Output:

    Array
    (
        [0] => asdf,fds,ddf
        [1] => \
        [2] => f,
        [3] => dd
    )
    Array
    (
        [0] => dk%fj
        [1] => slak%df!jlskj
        [2] => 
        [3] => dfl
        [4] => isr
        [5] => %
        [6] => jlf
    )
    Array
    (
        [0] => aksd
        [1] => jflaksd
        [2] => )jflkas((()j
        [3] => fkl()
        [4] => as
        [5] => d(')jf
    )
    Array
    (
        [0] => asfl'asjf
        [1] => lkas'
        [2] => jfkl'd
        [3] => jsl
    )
    

    Note: There is a theoretical problem: implode('::', ['a:', ':b']) and implode('::', ['a', '', 'b']) produce the same string, 'a::::b', so the original pieces cannot be recovered from it unambiguously. Imploding with escaping is an interesting problem in its own right.
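
    The ambiguity in the note can be checked directly, and escaping each piece before joining removes it in the simplest case. The sketch below is only an illustration under stated assumptions: join_escaped() is a hypothetical helper, not part of the original answer, and it is only meant for a single-character delimiter that differs from the escaper.

```php
<?php
// The two arrays from the note really do collapse into the same string.
$ambiguous = implode('::', ['a:', ':b']) === implode('::', ['a', '', 'b']);

// Hypothetical helper (an assumption, not from the original answer):
// escape every piece, then join with the plain delimiter.
function join_escaped($delimiter, $escaper, array $pieces)
{
    $escaped = array_map(function ($piece) use ($delimiter, $escaper) {
        // Double the escaper first, so the escapers added in front of
        // delimiters in the next step are not doubled again.
        $piece = str_replace($escaper, $escaper . $escaper, $piece);
        return str_replace($delimiter, $escaper . $delimiter, $piece);
    }, $pieces);

    return implode($delimiter, $escaped);
}

// With a single-character delimiter the two inputs stay distinguishable.
$first  = join_escaped(',', '\\', ['a,', ',b']);   // a\,,\,b
$second = join_escaped(',', '\\', ['a', '', 'b']); // a,,b
```

    With a multi-character delimiter such as '::', the two arrays from the note still join to the same string, because neither piece contains the full delimiter on its own. The sketch therefore does not solve the general case.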
