Inserting HTML tag in the middle of Arabic word breaks word connection (cursive)

爱一瞬间的悲伤 2020-12-01 04:52

From Wikipedia:

Cursive (from Latin curro, currere, cucurri, cursum, to run, hasten) is any style of handwriting that is designed for writing notes an

2 answers
  •  春和景丽
    2020-12-01 05:18

    Update 2020/5

    Google Chrome (checked with version 81.0.4044.138) and Firefox (76.0.1) have fixed this issue when rendering Arabic and Farsi words, so there is no longer any need to handle the situation manually. Simply wrapping the keyword in a tag, as in <span>Keyword</span>, works fine with both connecting and non-connecting characters.
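
For browsers at or above the versions mentioned in the update, the plain wrapping can be sketched as below. The function name `highlightKeyword` and the bare `<span>` are assumptions made for illustration; the update itself only says that wrapping the keyword is enough.

```javascript
// Hypothetical helper: wrap every occurrence of `keyword` in a plain <span>.
// No zero-width joiners are added; this relies on the browser keeping the
// letters joined across the tag boundary.
function highlightKeyword(text, keyword) {
  if (!keyword) return text; // nothing to highlight
  return text.split(keyword).join("<span>" + keyword + "</span>");
}

console.log(highlightKeyword("بستری", "ستر"));
```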

    Main post:

    Seven years after the accepted answer, I would like to add a new answer with more practical details, as my native language is Farsi. I assume that we want to wrap a keyword that occurs inside a longer word. This answer considers the following details:

    1- Sometimes it is not enough to add a zero-width joiner (&zwj;, U+200D) only after the previous character, because the next character also needs a tail to complete the connection.

    body{font-size:36pt;}
    span{color:red}
    Wrong: مک‍انیک
    
    Correct: مک‍‍انیک

    2- We may also need to add a zero-width joiner (&zwj;) after the keyword to connect it to the next character.

    body{font-size:36pt;}
    span{color:red}
    Wrong: مک‍‍انیکی
    
    Correct: مک‍‍انیک‍‍ی

    3- Some characters accept a tail before them but not after them, so we have to exclude them from getting a tail after them. The characters that do not connect to the next character are: ا آ د ذ ر ز ژ و

    4- Finally, out of respect for search engines and scrapers, I recommend using JavaScript (jQuery) to replace the keywords after the DOM is ready, which keeps the page source clean.
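
Points 1 to 3 can be sketched for a single word as follows. This is only an illustration under stated assumptions: the helper name `wrapWithJoiners`, its index-based signature, and the bare `<span>` are not from the answer.

```javascript
// Zero-width joiner (U+200D), called "tail" in the answer.
const ZWJ = "\u200D";
// Letters that never join to the following letter: ا آ د ذ ر ز ژ و
const NON_CONNECTING = "\u0627\u0622\u062F\u0630\u0631\u0632\u0698\u0648";

// Hypothetical helper: wrap word.slice(start, end) in <span>, adding a
// joiner on each side of a tag wherever the tag splits two joined letters.
function wrapWithJoiners(word, start, end) {
  const before = word.slice(0, start);
  const middle = word.slice(start, end);
  const after = word.slice(end);
  // A tag boundary needs joiners only if there is a letter on its left
  // that joins forward (point 3) and some text on its right.
  const joins = (left, right) =>
    left.length > 0 &&
    right.length > 0 &&
    !NON_CONNECTING.includes(left[left.length - 1]);
  const open = joins(before, middle) ? ZWJ : "";   // point 1
  const close = joins(middle, after) ? ZWJ : "";   // point 2
  return before + open + "<span>" + open + middle +
         close + "</span>" + close + after;
}

// Wraps the four letters after the first two, as in the example of point 2.
console.log(wrapWithJoiners("مکانیکی", 2, 6));
```

Putting a joiner on both sides of each tag mirrors the "Correct" lines above: one joiner keeps the letter before the tag in its joining form, the other does the same for the letter after it.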

    This is my final code with regards to all details above:

    $(document).ready(function(){

      var tail="\u200D"; // zero-width joiner (&zwj;)
      var keyword="ستر";
      // Assumption: the tag strings are missing from the code as posted here,
      // so a plain <span> is used as the markup.
      var open='<span>';
      var close='</span>';

      $(".searchableContent").each(function(){
        var htm=$(this).html();

        /*
        Preserve keywords that have a space both before and after
        with a temporary marker, say #fullHolder#
        */
        htm=htm.split(' '+keyword+' ').join(' #fullHolder# ');

        /*
        Preserve keywords that have a space only after
        with a temporary marker, say #preHolder#
        */
        htm=htm.split(keyword+' ').join('#preHolder#'+' ');

        /*
        Preserve keywords that have a space only before
        with a temporary marker, say #nextHolder#
        */
        htm=htm.split(' '+keyword).join(' '+'#nextHolder#');

        /*
        Replace the remaining keywords with a marked-up span.
        Add a tail on both sides of each tag to make sure the keyword
        stays connected to the letters before and after it
        */
        htm=htm.split(keyword).join(tail+open+tail+keyword+tail+close+tail);

        // Handle #preHolder# by adding a tail only before the keyword
        htm=htm.split('#preHolder#'+' ').join(tail+open+tail+keyword+close+' ');

        // Handle #nextHolder# by adding a tail only after the keyword
        htm=htm.split(' '+'#nextHolder#').join(' '+open+keyword+tail+close+tail);

        // Handle #fullHolder# by adding markup only, without any tail
        htm=htm.split(' '+'#fullHolder#'+' ').join(' '+open+keyword+close+' ');

        // Remove every tail that was added after a non-connecting character
        var nonConnectings=['ا','آ','د','ذ','ر','ز','ژ','و'];

        for (var x = 0; x < nonConnectings.length; x++) {
          // tail directly after the character
          htm=htm.split(nonConnectings[x]+tail).join(nonConnectings[x]);
          // tail after a closing tag that follows the character
          htm=htm.split(nonConnectings[x]+close+tail).join(nonConnectings[x]+close);
          // tail after an opening tag that follows the character
          htm=htm.split(nonConnectings[x]+open+tail).join(nonConnectings[x]+open);
        }

        $(this).html(htm);
      });
    });
    div{font-size:26pt}
    
    
    سترون - بستری - آستر - بستر - استراحت
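
For trying the same idea outside the browser (for example in Node.js), here is a DOM-free sketch. It is not the code above: it scans plain text once instead of using placeholders, and it treats any non-letter (not only a space) as a word boundary. The names `markKeyword` and `isLetter` and the bare `<span>` are assumptions.

```javascript
const JOINER = "\u200D"; // zero-width joiner, the "tail" of the answer
// Letters that never join to the following letter: ا آ د ذ ر ز ژ و
const NO_FORWARD_JOIN = "\u0627\u0622\u062F\u0630\u0631\u0632\u0698\u0648";

// True if ch is a letter of any script; spaces, punctuation and the
// edges of the string (undefined) are not letters.
function isLetter(ch) {
  return ch !== undefined && /\p{L}/u.test(ch);
}

// Hypothetical helper: wrap each occurrence of `keyword` in <span> and add
// joiners only where the keyword touches a letter it can connect to.
function markKeyword(text, keyword) {
  if (!keyword) return text;
  const lastOfKeyword = keyword[keyword.length - 1];
  let out = "";
  let pos = 0;
  let idx;
  while ((idx = text.indexOf(keyword, pos)) !== -1) {
    const prev = text[idx - 1];              // undefined at start of text
    const next = text[idx + keyword.length]; // undefined at end of text
    const before =
      isLetter(prev) && !NO_FORWARD_JOIN.includes(prev) ? JOINER : "";
    const after =
      isLetter(next) && !NO_FORWARD_JOIN.includes(lastOfKeyword) ? JOINER : "";
    out += text.slice(pos, idx) + before + "<span>" + before + keyword +
           after + "</span>" + after;
    pos = idx + keyword.length;
  }
  return out + text.slice(pos);
}

console.log(markKeyword("سترون - بستری - آستر - بستر - استراحت", "ستر"));
```

Like the jQuery version, this does not distinguish text from markup, so it should only be fed text content, not an HTML string whose tags or attributes might contain the keyword.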
