How to encode Cyrillic characters for a URL and then decode them?

情深已故 2021-01-18 17:17

I have a form on one page:

One of the input fi

3 answers
  •  感动是毒
    2021-01-18 18:02

    Correct solution, including spaces:

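    use open ':std', ':encoding(UTF-8)';  # read and write the standard streams as UTF-8
    use Encode;  # loaded in the original answer; the code below does not call it

    my $escaped = '%41F%2F%424+%41F%41E%414%416%410%420%41A%410+%418%417+%421%412%418%41D';
    # Form encoding uses + for a space; restore the spaces before unescaping.
    (my $unescaped = $escaped) =~ s/\+/ /g;
    # Each % is followed by a hexadecimal Unicode code point, not a UTF-8 octet.
    $unescaped =~ s/%([[:xdigit:]]+)/chr hex $1/eg;
    print $unescaped;
    # П/Ф ПОДЖАРКА ИЗ СВИН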
    

    Credit goes to Renaud Bompuis for being the first to recognise that these are Unicode code points prefixed with %.

    I would add that the encoding scheme in the question is very unusual; I have not seen it before. Normally one would expect the character string П/Ф ПОДЖАРКА ИЗ СВИН to be encoded as %D0%9F%2F%D0%A4+%D0%9F%D0%9E%D0%94%D0%96%D0%90%D0%A0%D0%9A%D0%90+%D0%98%D0%97+%D0%A1%D0%92%D0%98%D0%9D. That is, the characters are first encoded into UTF-8, and then the octets are percent-escaped. That conventional scheme works with the answer from Dr.Kameleon.
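
For comparison, the conventional scheme described above (encode to UTF-8, then percent-escape the octets) could be sketched roughly as follows. This is only an illustration using the core Encode module. The helper names form_escape and form_unescape are made up for this sketch, and the set of characters left unescaped is an assumption, not something taken from the question.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains UTF-8 literals
use Encode qw(encode decode);    # core module

binmode STDOUT, ':encoding(UTF-8)';

# form_escape and form_unescape are hypothetical helper names for this sketch.
sub form_escape {
    my ($text) = @_;
    my $octets = encode('UTF-8', $text);            # characters -> UTF-8 octets
    # Percent-escape every octet except unreserved ASCII and the space.
    $octets =~ s/([^A-Za-z0-9\-._~ ])/sprintf '%%%02X', ord $1/eg;
    $octets =~ tr/ /+/;                             # form encoding: space -> +
    return $octets;
}

sub form_unescape {
    my ($escaped) = @_;
    (my $octets = $escaped) =~ tr/+/ /;             # + -> space, before unescaping
    $octets =~ s/%([[:xdigit:]]{2})/chr hex $1/eg;  # exactly two hex digits per octet
    return decode('UTF-8', $octets);                # UTF-8 octets -> characters
}

my $text    = 'П/Ф ПОДЖАРКА ИЗ СВИН';
my $escaped = form_escape($text);
print "$escaped\n";
print form_unescape($escaped), "\n";
```

Run as written, this should print the percent-escaped string quoted above, followed by the original text. Because every escape here is exactly two hex digits, decoding is unambiguous. The greedy [[:xdigit:]]+ match needed for the question's scheme would misread an escape that is followed directly by a literal hex digit.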
