Exchanging values of a variable with values from an array, but only under a condition


难免孤独 2021-01-14 11:32

I have a code that compares the output with the values of the array, and only terminates the operation with words in the array:

First code (just an example)

2 answers

半阙折子戏 (OP)

2021-01-14 12:08

My previous method was incredibly inefficient. I didn't realize how much data you were processing, but if we are upwards of 4000 lines, then efficiency is vital (I think my brain was stuck on strtr()-related processing because of your previous question(s)). This is my new/improved solution, which I expect to leave my previous solution in the dust.

Code: (Demo)

$myVar="My sister alannis Is not That blonde, here is a good place. I know Ariane is not MY SISTER!";
echo "$myVar\n";

$myWords=array(
    array("is","é"),
    array("on","no"),
    array("that","aquela"),
    array("sister","irmã"),
    array("my","minha"),
    array("myth","mito"),
    array("he","ele"),
    array("good","bom"),
    array("ace","perito"),
    array("i","eu")  // notice I must be lowercase
);
$translations=array_combine(array_column($myWords,0),array_column($myWords,1));  // or skip this step and just declare $myWords as key-value pairs

// length sorting is not done here; note that inside the atomic group a key listed
// before a longer key that it prefixes ("my" before "myth") stops the longer key from matching
// preg_quote() and \Q\E are not used because dealing with words only (no danger of misinterpretation by regex)
$pattern='/\b(?>'.implode('|',array_keys($translations)).')\b/i';  // atomic group is slightly faster (no backtracking)
/* echo $pattern;
   makes: /\b(?>is|on|that|sister|my|myth|he|good|ace|i)\b/i
   demo: https://regex101.com/r/DXTtDf/1
*/

$translated=preg_replace_callback(
    $pattern,
    function($m)use($translations){  // bring $translations (lookup) array to function
        $encoding='UTF-8';  // default setting
        $key=mb_strtolower($m[0],$encoding);  // standardize keys' case for lookup accessibility
        if(ctype_lower($m[0])){  // treat as all lowercase
            return $translations[$m[0]];
        }elseif(mb_strlen($m[0],$encoding)>1 && ctype_upper($m[0])){  // treat as all uppercase
            return mb_strtoupper($translations[$key],$encoding);
        }else{  // treat as only first character uppercase
            return mb_strtoupper(mb_substr($translations[$key],0,1,$encoding),$encoding)  // uppercase first
                .mb_substr($translations[$key],1,mb_strlen($translations[$key],$encoding)-1,$encoding);  // append remaining lowercase
        }
    },
    $myVar);

echo $translated;

Output:

My sister alannis Is not That blonde, here is a good place. I know Ariane is not MY SISTER!
Minha irmã alannis É not Aquela blonde, here é a bom place. Eu know Ariane é not MINHA IRMÃ!
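
One edge case that the sample sentence does not exercise: inside an atomic group the alternatives are tried left to right and the first one that matches is locked in, so a key that is a prefix of a later key ("my" before "myth") keeps the longer word from matching at all. Below is a minimal sketch of that behaviour, assuming the lookup is declared directly as key-value pairs (as the code comment suggests). buildPattern() is a hypothetical helper, and sorting the keys longest-first is one possible workaround rather than part of the original answer.

```php
<?php
// Lookup declared directly as key => value pairs (no array_combine() needed).
$translations = ['my' => 'minha', 'myth' => 'mito'];

// Hypothetical helper: builds the same pattern shape used in the answer.
function buildPattern(array $keys): string
{
    return '/\b(?>' . implode('|', $keys) . ')\b/i';
}

// Lowercase-only lookup; case handling is left out to keep the sketch short.
$lookup = function ($m) use ($translations) {
    return $translations[strtolower($m[0])];
};

// Declaration order: "my" is tried first and locked in by the atomic group,
// the trailing \b then fails inside "myth", and "myth" is never attempted.
$unsorted = preg_replace_callback(buildPattern(array_keys($translations)), $lookup, 'my myth');

// Workaround (an assumption, not from the answer): list longer keys first.
$keys = array_keys($translations);
usort($keys, function ($a, $b) {
    return strlen($b) - strlen($a);
});
$sorted = preg_replace_callback(buildPattern($keys), $lookup, 'my myth');

echo $unsorted, "\n"; // minha myth
echo $sorted, "\n";   // minha mito
```

Swapping the atomic group `(?>...)` for a plain non-capturing group `(?:...)` would also let "myth" match without any sorting, at the cost of allowing backtracking.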

This method:

does only 1 pass through $myVar, not 1 pass for every subarray of $myWords.

does not bother with sorting the lookup array ($myWords/$translations).

does not bother with regex escaping (preg_quote()) or making pattern components literal (\Q..\E) because only words are being translated.

uses word boundaries so that only complete word matches are replaced.

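uses an atomic group as a micro-optimization which denies backtracking. Caveat: because alternatives cannot be retried, a key that is a prefix of a later key ("my" before "myth") blocks the longer key, so list longer keys first if such pairs occur.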

declares an $encoding value for stability / maintainability / re-usability.

matches case-insensitively but replaces case-sensitively. If the English match is:

All lowercase, so is the replacement

All uppercase (and longer than a single character), so is the replacement

Capitalized (only first character of multi-character string), so is the replacement
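
The three case rules above can be pulled out of the callback and tried on their own. This is only a sketch: matchCase() is a hypothetical helper name that does not appear in the answer, and it assumes the mbstring extension is loaded and the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the answer): applies the case pattern of
// $source to $replacement, following the three rules listed above.
function matchCase(string $source, string $replacement, string $encoding = 'UTF-8'): string
{
    if (ctype_lower($source)) {
        return $replacement; // all lowercase: lookup values are already lowercase
    }
    if (mb_strlen($source, $encoding) > 1 && ctype_upper($source)) {
        return mb_strtoupper($replacement, $encoding); // all uppercase
    }
    // Anything else (capitalized, or a single uppercase letter):
    // uppercase only the first character of the replacement.
    return mb_strtoupper(mb_substr($replacement, 0, 1, $encoding), $encoding)
        . mb_substr($replacement, 1, null, $encoding);
}

echo matchCase('sister', 'irmã'), "\n"; // irmã
echo matchCase('SISTER', 'irmã'), "\n"; // IRMÃ
echo matchCase('Is', 'é'), "\n";        // É
echo matchCase('I', 'eu'), "\n";        // Eu
```

The single-character check is what keeps "I" from being treated as all uppercase, so it becomes "Eu" rather than "EU".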
