Question
I am scraping a website with curl and parsing out what I need.
The URLs are returned with ASCII-encoded characters like
GET v2.12/...?fields=&#123;fieldname_of_type_Tab&#125; HTTP/1.1
How can I convert this to UTF-8 (char) directly from the command line (ideally something I can pipe to) so that the result is...
GET v2.12/...?fields={fieldname_of_type_Tab} HTTP/1.1
EDIT: There are a number of solutions with sed, but the regex that goes along with them is quite ugly. Since the provided answer leveraging perl is very clean, I hope we can leave this question open.
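For reference, a sed-based version might look like the sketch below. It assumes the only entities that ever appear are the braces &#123; and &#125;, and it needs one substitution per entity, which is why this approach gets unwieldy as soon as other entities show up:

$ echo 'http://domain.tld/?fields=&#123;fieldname_of_type_Tab&#125;' |
  sed 's/&#123;/{/g; s/&#125;/}/g'   # one hand-written substitution per entity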
Answer 1:
It's not really UTF-8 but HTML entities.
Try doing this using perl:
$ echo 'http://domain.tld/?fields=&#123;fieldname_of_type_Tab&#125;' |
perl -MHTML::Entities -pe 'decode_entities($_)'
Output:
http://domain.tld/?fields={fieldname_of_type_Tab}
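In the scraping pipeline from the question, the same filter can simply be appended to curl. This is a hypothetical usage sketch with a placeholder URL, not the asker's real endpoint, and it assumes the HTML::Entities module (shipped with the CPAN HTML-Parser distribution) is installed:

$ curl -s 'https://example.com/page' | perl -MHTML::Entities -pe 'decode_entities($_)'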
Source: https://stackoverflow.com/questions/48998515/decode-url-unix-bash-command-line-without-sed