How do I convert Word smart quotes and em dashes in a string?

星月不相逢 2020-11-29 03:11

I have a form with a textarea. Users enter a block of text which is stored in a database.

Occasionally a user will paste text from Word containing smart quotes or em

13 answers
  •  被撕碎了的回忆
    2020-11-29 03:38

    If you want to escape these characters for the web while preserving their appearance, so that your strings render like “It’s nice!” rather than "It's boring", you can use your own custom htmlEncode function in place of PHP's htmlentities():

    $trans_tbl = false;
    
    function htmlEncode($text) {
    
      global $trans_tbl;
    
      // create translation table once
      if(!$trans_tbl) {
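        // start with the default set of conversions and add more.
        // The chr() keys below are single Windows-1252 bytes, so build the
        // table for that encoding (recent PHP versions default to UTF-8).
    
        $trans_tbl = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, 'Windows-1252');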
    
        $trans_tbl[chr(130)] = '&sbquo;';    // Single Low-9 Quotation Mark
        $trans_tbl[chr(131)] = '&fnof;';     // Latin Small Letter F With Hook
        $trans_tbl[chr(132)] = '&bdquo;';    // Double Low-9 Quotation Mark
        $trans_tbl[chr(133)] = '&hellip;';   // Horizontal Ellipsis
        $trans_tbl[chr(134)] = '&dagger;';   // Dagger
        $trans_tbl[chr(135)] = '&Dagger;';   // Double Dagger
        $trans_tbl[chr(136)] = '&circ;';     // Modifier Letter Circumflex Accent
        $trans_tbl[chr(137)] = '&permil;';   // Per Mille Sign
        $trans_tbl[chr(138)] = '&Scaron;';   // Latin Capital Letter S With Caron
        $trans_tbl[chr(139)] = '&lsaquo;';   // Single Left-Pointing Angle Quotation Mark
        $trans_tbl[chr(140)] = '&OElig;';    // Latin Capital Ligature OE
    
        // smart single/double quotes (from MS)
        $trans_tbl[chr(145)] = '&lsquo;';    // Left Single Quotation Mark
        $trans_tbl[chr(146)] = '&rsquo;';    // Right Single Quotation Mark
        $trans_tbl[chr(147)] = '&ldquo;';    // Left Double Quotation Mark
        $trans_tbl[chr(148)] = '&rdquo;';    // Right Double Quotation Mark
    
        $trans_tbl[chr(149)] = '&bull;';     // Bullet
        $trans_tbl[chr(150)] = '&ndash;';    // En Dash
        $trans_tbl[chr(151)] = '&mdash;';    // Em Dash
        $trans_tbl[chr(152)] = '&tilde;';    // Small Tilde
        $trans_tbl[chr(153)] = '&trade;';    // Trade Mark Sign
        $trans_tbl[chr(154)] = '&scaron;';   // Latin Small Letter S With Caron
        $trans_tbl[chr(155)] = '&rsaquo;';   // Single Right-Pointing Angle Quotation Mark
        $trans_tbl[chr(156)] = '&oelig;';    // Latin Small Ligature OE
        $trans_tbl[chr(159)] = '&Yuml;';     // Latin Capital Letter Y With Diaeresis
    
        ksort($trans_tbl);
      }
    
      // escape HTML      
      return strtr($text, $trans_tbl); 
    }
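
The table above is keyed by single Windows-1252 bytes. If the form submits UTF-8 instead (an assumption, since the question does not say which encoding is in use), the same strtr() idea could be sketched with UTF-8 keys. The helper name encodeSmartPunctuation and the sample string are hypothetical, and the map covers only a few of the characters listed above:

```php
<?php
// Hypothetical helper: same strtr() approach, but keyed by whole UTF-8
// characters rather than single Windows-1252 bytes.
function encodeSmartPunctuation(string $text): string
{
    // Escape the HTML-special characters (& < > " ') first.
    // Curly quotes and dashes pass through this step unchanged.
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Then replace Word-style punctuation with named entities.
    static $map = [
        "\u{2018}" => '&lsquo;',   // Left Single Quotation Mark
        "\u{2019}" => '&rsquo;',   // Right Single Quotation Mark
        "\u{201C}" => '&ldquo;',   // Left Double Quotation Mark
        "\u{201D}" => '&rdquo;',   // Right Double Quotation Mark
        "\u{2013}" => '&ndash;',   // En Dash
        "\u{2014}" => '&mdash;',   // Em Dash
        "\u{2026}" => '&hellip;',  // Horizontal Ellipsis
    ];

    return strtr($escaped, $map);
}

// Sample input: curly quotes, an em dash, and a literal tag.
echo encodeSmartPunctuation("\u{201C}It\u{2019}s nice!\u{201D} \u{2014} <b>");
// prints: &ldquo;It&rsquo;s nice!&rdquo; &mdash; &lt;b&gt;
```

Because the keys are complete multi-byte sequences, strtr() cannot match a stray byte inside another character, which is the main risk of applying single-byte keys to UTF-8 text.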
    
