Decode URL Unix/Bash Command Line (without sed) [duplicate]

Question

I am scraping a website with curl and parsing out what I need.

The URLs are returned with ASCII-encoded characters, like:

GET v2.12/...?fields=&#123;fieldname_of_type_Tab&#125; HTTP/1.1

How can I convert this to UTF-8 characters directly from the command line (ideally with something I can pipe to), so that the result is:

GET v2.12/...?fields={fieldname_of_type_Tab} HTTP/1.1

EDIT: There are a number of solutions using sed, but the regex that goes along with them is quite ugly. Since the provided answer leveraging Perl is very clean, I hope we can leave this question open.

Answer 1:

These are not really UTF-8 sequences but HTML entities (numeric character references).

Try this using Perl with the HTML::Entities module:

$ echo 'http://domain.tld/?fields=&#123;fieldname_of_type_Tab&#125;' |
    perl -MHTML::Entities -pe 'decode_entities($_)'

Output:

http://domain.tld/?fields={fieldname_of_type_Tab}
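If Perl's HTML::Entities module is not installed, a pure-Bash filter can cover the simple case from the question. The following is only a minimal sketch under stated assumptions: the function name decode_numeric_entities is hypothetical, it handles only decimal references for ASCII code points (such as &#123;), and it leaves named entities, hexadecimal references, and non-ASCII code points unchanged.

```shell
#!/usr/bin/env bash
# decode_numeric_entities is a hypothetical helper name, not part of any tool.
# Assumption: only decimal references for ASCII code points (1-127) are decoded;
# named entities (&amp;), hex references (&#x7B;) and non-ASCII stay untouched.
decode_numeric_entities() {
    local line tail char code
    local re='^(.*)&#([0-9]{1,7});(.*)$'
    while IFS= read -r line || [[ -n $line ]]; do
        tail=''
        # The greedy (.*) finds the LAST reference, so work right to left and
        # move finished text into $tail, where it is never scanned again
        # (this avoids decoding "&#38;#123;" twice).
        while [[ $line =~ $re ]]; do
            code=$((10#${BASH_REMATCH[2]}))   # force base 10 despite leading zeros
            if (( code >= 1 && code <= 127 )); then
                # Build an octal escape and let printf %b expand it.
                printf -v char '%b' "\\0$(printf '%03o' "$code")"
            else
                char="&#${BASH_REMATCH[2]};"  # leave the reference as written
            fi
            tail="${char}${BASH_REMATCH[3]}${tail}"
            line="${BASH_REMATCH[1]}"
        done
        printf '%s\n' "${line}${tail}"
    done
}

# Example: reads stdin, writes stdout, so it can sit at the end of a pipe.
printf '%s\n' 'http://domain.tld/?fields=&#123;fieldname_of_type_Tab&#125;' |
    decode_numeric_entities
# prints: http://domain.tld/?fields={fieldname_of_type_Tab}
```

Because the function is a plain stdin-to-stdout filter, it can be appended to a curl pipeline in the same position as the perl command above. For anything beyond decimal ASCII references, a full entity decoder such as HTML::Entities remains the safer choice.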

Source: https://stackoverflow.com/questions/48998515/decode-url-unix-bash-command-line-without-sed

Tags

bash

unix

encoding

ascii

decode
