BigQuery: Convert accented characters to their plain ASCII equivalents

花落未央 2020-12-17 04:21

I have the following string:

brasília

And I need to convert to:

brasilia

Without the ´ accent!

Ho

4 answers
  •  我在风中等你
    2020-12-17 04:34

    I like this answer's explanation. You can use:

    REGEXP_REPLACE(NORMALIZE(text, NFD), r'\pM', '')
    

    As a simple example:

    WITH data AS(
      SELECT 'brasília / paçoca' AS text
    )
    
    SELECT
      REGEXP_REPLACE(NORMALIZE(text, NFD), r'\pM', '') RemovedDiacritics
    FROM data
    

    brasilia / pacoca
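
To see why this works, here is a minimal sketch that compares the string before and after decomposition. It assumes the input is in composed form, which the extra NORMALIZE(text, NFC) call enforces here. NFD splits each accented letter into a base letter plus a combining mark, and \pM matches characters in the Unicode Mark category, which includes those combining marks:

```sql
WITH data AS (
  SELECT NORMALIZE('brasília', NFC) AS text  -- force the composed form for the comparison
)

SELECT
  CHAR_LENGTH(text) AS composed_length,                    -- 8 characters
  CHAR_LENGTH(NORMALIZE(text, NFD)) AS decomposed_length,  -- 9: 'í' becomes 'i' + U+0301
  TO_CODE_POINTS(NORMALIZE(text, NFD)) AS decomposed_code_points
FROM data
```

Once the mark is a separate code point, REGEXP_REPLACE can remove it and leave the base letter untouched.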

    UPDATE

    With the newer TRANSLATE string function, this is simpler, as long as every accented character you need is listed in the mapping:

    WITH data AS(
      SELECT 'brasília / paçoca' AS text
    )
    
    SELECT
      translate(text, "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ", "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy") as RemovedDiacritics
    FROM data
    

    brasilia / pacoca
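
One difference between the two approaches is worth checking against your own data. The sketch below uses an abbreviated three-character mapping (an assumption for brevity, not the full list above). TRANSLATE leaves any character missing from its source list unchanged, while the NFD approach handles any letter that has a canonical decomposition:

```sql
WITH data AS (
  SELECT 'Dvořák' AS text
)

SELECT
  -- 'ř' is not in the abbreviated mapping, so it is left as-is; expected: Dvořak
  TRANSLATE(text, 'áíç', 'aic') AS translated,
  -- 'ř' decomposes to 'r' + U+030C, which \pM strips; expected: Dvorak
  REGEXP_REPLACE(NORMALIZE(text, NFD), r'\pM', '') AS normalized
FROM data
```

The reverse also holds: letters with no canonical decomposition, such as Ð and ð from the mapping above, are not changed by NFD, so they still need an explicit TRANSLATE or REPLACE step.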
