Converting Microsoft Word special characters with PHP

忘了有多久 2020-12-24 08:01

I am trying to convert text that users paste from Word, which contains MS Word ellipses and long dashes, before processing it further.

I found an old proposed solution here to t

4 answers
  •  一向
    一向 (original poster)
    2020-12-24 08:21

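    For anyone getting the diamond question mark (the replacement character shown in place of invalid UTF-8 bytes) in PHP output: this method, which replaces the characters by their UTF-8 byte sequences, worked better for me than using the chr function.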

    // Code point reference: www.fileformat.info/info/unicode/ (e.g. 2018 for U+2018)
    $search = [
        "\xC2\xAB",     // « (U+00AB) in UTF-8
        "\xC2\xBB",     // » (U+00BB) in UTF-8
        "\xE2\x80\x98", // ‘ (U+2018) in UTF-8
        "\xE2\x80\x99", // ’ (U+2019) in UTF-8
        "\xE2\x80\x9A", // ‚ (U+201A) in UTF-8
        "\xE2\x80\x9B", // ‛ (U+201B) in UTF-8
        "\xE2\x80\x9C", // “ (U+201C) in UTF-8
        "\xE2\x80\x9D", // ” (U+201D) in UTF-8
        "\xE2\x80\x9E", // „ (U+201E) in UTF-8
        "\xE2\x80\x9F", // ‟ (U+201F) in UTF-8
        "\xE2\x80\xB9", // ‹ (U+2039) in UTF-8
        "\xE2\x80\xBA", // › (U+203A) in UTF-8
        "\xE2\x80\x93", // – (U+2013) in UTF-8
        "\xE2\x80\x94", // — (U+2014) in UTF-8
        "\xE2\x80\xA6"  // … (U+2026) in UTF-8
    ];

    // ASCII stand-ins, in the same order as $search
    $replacements = [
        "<<",
        ">>",
        "'",
        "'",
        "'",
        "'",
        '"',
        '"',
        '"',
        '"',
        "<",
        ">",
        "-",
        "-",
        "..."
    ];

    // str_replace() returns the converted string; it does not modify $string in place
    $string = str_replace($search, $replacements, $string);
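
    As a rough sketch of the same idea wrapped in a reusable function: the helper name `normalize_word_chars` and the shortened map are illustrative assumptions, not part of the answer above, and `strtr()` is swapped in for `str_replace()` so that each character and its replacement sit side by side. The last lines suggest one likely reason `chr()` leads to the diamond question mark: it returns a single raw byte, which on its own is not valid UTF-8. That check assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper; the name and the reduced map are assumptions for illustration.
function normalize_word_chars(string $text): string
{
    // Keys are UTF-8 byte sequences, values are their ASCII stand-ins.
    static $map = [
        "\xC2\xAB"     => '<<',  // « U+00AB
        "\xC2\xBB"     => '>>',  // » U+00BB
        "\xE2\x80\x98" => "'",   // ‘ U+2018
        "\xE2\x80\x99" => "'",   // ’ U+2019
        "\xE2\x80\x9C" => '"',   // “ U+201C
        "\xE2\x80\x9D" => '"',   // ” U+201D
        "\xE2\x80\x93" => '-',   // – U+2013
        "\xE2\x80\x94" => '-',   // — U+2014
        "\xE2\x80\xA6" => '...', // … U+2026
    ];

    // With an array argument, strtr() does not re-scan text it has already replaced.
    return strtr($text, $map);
}

// chr(0x85) is the ellipsis in Windows-1252, but a lone 0x85 byte is invalid UTF-8.
var_dump(mb_check_encoding(chr(0x85), 'UTF-8')); // bool(false); requires ext-mbstring

// Curly quotes, an ellipsis and an em dash, written as UTF-8 escapes.
echo normalize_word_chars("\xE2\x80\x9CWait\xE2\x80\xA6\xE2\x80\x9D \xE2\x80\x94 ok"), "\n";
// prints: "Wait..." - ok
```

    If the pasted text arrives as Windows-1252 rather than UTF-8, none of these byte sequences will match; converting the input to UTF-8 first would be needed in that case.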
    
