How to remove all of the diacritics from a file?

Happy的楠姐 2020-12-05 00:10

I have a file containing many vowels with diacritics. I need to make these replacements:

  • Replace ā, á, ǎ, and à with a.
  • Replace ē, é, ě, and è with
9 Answers
  •  陌清茗 (original poster)
    2020-12-05 00:35

    You can use man iso_8859_1 (or the man page for your character set) or od -bc to identify the octal representation of the diacritic. Then use gawk to do the replacement.

    { gsub(/\344/, "a"); print $0 }
    

    This replaces ä (octal 344 in ISO-8859-1) with a.
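
A minimal sketch of how the two steps above might fit together, assuming the file is in a single-byte encoding such as ISO-8859-1 (in a UTF-8 file each accented letter is several bytes, so a single octal escape would not match it). The function name strip_a_umlaut is hypothetical, and plain awk is used in place of gawk on the assumption that it handles octal escapes in a regex the same way:

```shell
# Step 1: look up the octal value of the byte. printf emits the raw
# ISO-8859-1 byte for ä, and od -b prints it in octal.
printf '\344' | od -An -b

# Step 2: hypothetical helper that replaces that byte with a plain "a".
# LC_ALL=C makes awk treat the input as raw bytes, not multibyte text.
strip_a_umlaut() {
    LC_ALL=C awk '{ gsub(/\344/, "a"); print }'
}

# Usage: feed a Latin-1 encoded line through the filter.
printf 'M\344dchen\n' | strip_a_umlaut
```

Other letters would need one gsub call each, with the octal value looked up the same way.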
