How to make the function work for pair words?

那年仲夏 submitted on 2019-12-25 09:50:07

Question


My function currently works only with single words. I have words in an associative array, and the function replaces each array key found in the text with its value. The keys are stored in lower case, but the replacement takes on the letter case of the word as it is written in the text. The function cannot yet handle pairs of words, that is, replace a pair of words with another pair of words.

Example:

// Function:

function replaceKeyToValue($request, $dict){
    $response = preg_replace_callback("/\pL+/u", function ($m) use ($dict) {
        $word = mb_strtolower($m[0]);
        if (isset($dict[$word])) {
            $repl = $dict[$word];
            // Check for some common ways of upper/lower case
            // 1. all lower case
            if ($word === $m[0]) return $repl;
            // 2. all upper case
            if (mb_strtoupper($word) === $m[0]) return mb_strtoupper($repl);
            // 3. Only first letters are upper case
            if (mb_convert_case($word,  MB_CASE_TITLE) === $m[0]) return mb_convert_case($repl,  MB_CASE_TITLE);
            // Otherwise: copy the case position by position, looping over the
            // replacement so that a longer replacement is not truncated
            $mixed = [];
            for ($i = 0, $len = mb_strlen($repl); $i < $len; ++$i) {
                $char = mb_substr($repl, $i, 1);
                // Past the end of the matched word both sides are "", so $char is kept as is
                $mixed[] = mb_substr($word, $i, 1) === mb_substr($m[0], $i, 1)
                    ? $char
                    : mb_strtoupper($char);
            }
            return implode("", $mixed);
        }
        return $m[0]; // Nothing changes
    }, $request);
    return $response;
}

// Example associative array

$dict = array
(
  "make"=>"take",
  "cool"=>"pool",
  "узбек"=>"ӯзбек",
);

$text = 'Make COOL узБЕК';

echo replaceKeyToValue($text, $dict);

Output:

Take POOL ӯзБЕК

How can the function be reworked so that it also replaces pairs of words with pairs of words?

Example array with word pairs:

$array = array
(
  "take pool" => "pool take", 
  "get book" => "set word", 
  "узбек точик" => "ӯзбек тоҷик"
);

$example_text = "Take POOL Get BooK УзБеК ТоЧИК";

Answer 1:


First thing: move the case transformation out of the replacement problem and write a dedicated function to handle it.
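
A minimal sketch of such a dedicated function, assuming the same four case rules as the question's code. The name transferCase and its signature are hypothetical, not part of the original answer, and the mbstring extension is assumed to be available:

```php
<?php
// Hypothetical helper: copy the letter case of $source onto $replacement.
function transferCase(string $source, string $replacement): string
{
    $lower = mb_strtolower($source);
    // 1. all lower case
    if ($source === $lower) {
        return $replacement;
    }
    // 2. all upper case
    if ($source === mb_strtoupper($lower)) {
        return mb_strtoupper($replacement);
    }
    // 3. only the first letter is upper case
    if ($source === mb_convert_case($lower, MB_CASE_TITLE)) {
        return mb_convert_case($replacement, MB_CASE_TITLE);
    }
    // 4. mixed case: decide position by position; positions past the end of
    //    $source compare "" with "" and keep the replacement's own case
    $out = '';
    for ($i = 0, $len = mb_strlen($replacement); $i < $len; ++$i) {
        $char = mb_substr($replacement, $i, 1);
        $isUpper = mb_substr($source, $i, 1) !== mb_substr($lower, $i, 1);
        $out .= $isUpper ? mb_strtoupper($char) : $char;
    }
    return $out;
}

echo transferCase('Make', 'take'), ' ',
     transferCase('COOL', 'pool'), ' ',
     transferCase('узБЕК', 'ӯзбек'), PHP_EOL; // Take POOL ӯзБЕК
```

With the case logic isolated like this, the replacement callback only has to decide what to replace, not how to capitalise it.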

About the word pairs: you can solve the problem using:

  • a lookahead with an optional subpattern that captures a second word
  • a static boolean variable (defined in the callback function) that records whether the previous match was the first word of a two-word substring found in the dict.

You only need this pattern:

~\b\pL+\b(?=( \pL+\b)?)~u

Because the lookahead doesn't consume any characters, the pattern stops at the start of every word in the string. It also matches the last word of the string, since (?=( \pL+\b)?) is an assertion that always succeeds: the captured second word is optional.
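
To see what the pattern yields at each word, here is a small illustration. The helper name wordsWithLookahead is hypothetical and exists only for this demonstration:

```php
<?php
// Hypothetical helper for illustration: list each matched word together with
// the optional " nextword" captured by the lookahead.
function wordsWithLookahead(string $text): array
{
    preg_match_all('~\b\pL+\b(?=( \pL+\b)?)~u', $text, $matches, PREG_SET_ORDER);
    $rows = [];
    foreach ($matches as $m) {
        // Group 1 is absent when no second word follows, so default it to ''.
        $rows[] = [$m[0], $m[1] ?? ''];
    }
    return $rows;
}

foreach (wordsWithLookahead('Take POOL Get') as [$word, $next]) {
    printf("[%s] followed by [%s]\n", $word, $next);
}
// [Take] followed by [ POOL]
// [POOL] followed by [ Get]
// [Get] followed by []
```

Note that every word is matched once, and that the captured group keeps its leading space, so $m[0].$m[1] is already the two-word string to look up.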

It's very simple:

  • the boolean variable is initially false.
  • when the boolean is false and the lowercased $m[0].$m[1] exists among the dict keys, set the boolean to true and return the dict value; otherwise return $m[0].
  • when the boolean is true, set it back to false and return an empty string.
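
The steps above could be sketched roughly as follows. This is a variation under stated assumptions, not the answer's exact code: the function name replacePairs and the optional $applyCase hook are hypothetical, the state is a by-reference variable instead of a static one, and dict values are assumed to be two words separated by a single space. Instead of returning the whole dict value and then an empty string, the sketch returns the first replacement word at the first match and holds the second one for the next match, because the space between the two words is not part of either match and would otherwise be left behind.

```php
<?php
// Hypothetical sketch: replace two-word dict keys found in $text.
// $applyCase is an optional hook, assumed to have the shape
// function (string $original, string $replacement): string.
function replacePairs(string $text, array $dict, ?callable $applyCase = null): string
{
    // Default hook: return the replacement unchanged.
    $applyCase = $applyCase ?? function ($original, $replacement) {
        return $replacement;
    };
    // Plays the role of the boolean: null means "no pair in progress",
    // otherwise it holds the second replacement word.
    $pending = null;

    return preg_replace_callback(
        '~\b\pL+\b(?=( \pL+\b)?)~u',
        function (array $m) use ($dict, $applyCase, &$pending) {
            if ($pending !== null) {
                // This match is the second word of a pair found just before.
                $second = $pending;
                $pending = null;
                return $applyCase($m[0], $second);
            }
            // $m[1] is absent when no second word follows.
            $next = $m[1] ?? '';
            $key = mb_strtolower($m[0] . $next);
            if ($next !== '' && isset($dict[$key])) {
                $parts = explode(' ', $dict[$key], 2);
                $pending = $parts[1] ?? '';
                return $applyCase($m[0], $parts[0]);
            }
            return $m[0]; // Nothing changes
        },
        $text
    );
}

$array = array(
    "take pool" => "pool take",
    "get book" => "set word",
    "узбек точик" => "ӯзбек тоҷик"
);

echo replacePairs("Take POOL Get BooK УзБеК ТоЧИК", $array), PHP_EOL;
// pool take set word ӯзбек тоҷик
```

Passing a case-transfer function as the third argument applies the casing word by word; without it the dict values are inserted as written.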

Advantage: you don't have to worry about the dict size. Using the same idea, you can even extend the algorithm to more words with few changes, or handle a dict whose keys have different numbers of words.

Advice: when you find yourself wanting to raise the backtracking limit or build a giant alternation, don't. It only means your approach isn't the right one.



Source: https://stackoverflow.com/questions/46978603/how-to-make-the-function-work-for-pair-words
