Powershell Invoke-WebRequest and character encoding

半世苍凉 提交于 2021-02-19 06:10:18

问题


I am trying to get information from the Spotify database through their Web API. However, I'm facing issues with accented vowels (ä,ö,ü etc.)

Lets take Tiësto as an example. Spotify's API Browser can display the information correctly: https://developer.spotify.com/web-api/console/get-artist/?id=2o5jDhtHVPhrJdv3cEQ99Z

If I make a API call with Invoke-Webrequest I get

Ti??sto

as name:

function Get-Artist {
param($ArtistID = '2o5jDhtHVPhrJdv3cEQ99Z',
      $AccessToken = 'MyAccessToken')


$URI = "https://api.spotify.com/v1/artists/{0}" -f $ArtistID

$JSON = Invoke-WebRequest -Uri $URI -Headers @{"Authorization"= ('Bearer  ' + $AccessToken)} 
$JSON = $JSON | ConvertFrom-Json
return $JSON
}

How can I get the correct name?


回答1:


Jeroen Mostert, in a comment on the question, explains the problem well:

The problem is that Spotify is (unwisely) not returning the encoding it's using in its headers. PowerShell obeys the standard by assuming ISO-8859-1, but unfortunately the site is using UTF-8. (PowerShell ought to ignore standards here and assume UTF-8, but that's just like, my opinion, man.) More details here, along with the follow-up ticket.

A workaround that doesn't require the use of temporary files is to re-encode the incorrectly read string.

If we assume the presence of a function convertFrom-MisinterpretedUtf8,we can use the following:

$JSON = convertFrom-MisinterpretedUtf8 (Invoke-WebRequest -Uri $URI ...)

See below for the function's definition.


Utility function convertFrom-MisinterpretedUtf8:

function convertFrom-MisinterpretedUtf8([string] $String) {
  [System.Text.Encoding]::UTF8.GetString(
     [System.Text.Encoding]::GetEncoding(28591).GetBytes($String)
  )
}

The function converts the incorrectly read string back to bytes based on the mistakenly applied encoding (ISO-8859-1) and then recreates the string based on the actual encoding (UTF-8).




回答2:


Issue solved with the workaround provided by Jeron Mostert. You have to save it in a file and explicit tell Powershell which Encoding it should use. This workaround works for me because my program can take whatever time it needs (regarding read/write IO)

function Invoke-SpotifyAPICall {
param($URI,
      $Header = $null,
      $Body = $null
      )

if($Header -eq $null) {
    Invoke-WebRequest -Uri $URI -Body $Body -OutFile ".\SpotifyAPICallResult.txt"    
} elseif($Body -eq $null) {
    Invoke-WebRequest -Uri $URI -Headers $Header -OutFile ".\SpotifyAPICallResult.txt"
}

$JSON = Get-Content ".\SpotifyAPICallResult.txt" -Encoding UTF8 -Raw | ConvertFrom-JSON
Remove-Item ".\SpotifyAPICallResult.txt" -Force
return $JSON

}

function Get-Artist {
    param($ArtistID = '2o5jDhtHVPhrJdv3cEQ99Z',
          $AccessToken = 'MyAccessToken')


    $URI = "https://api.spotify.com/v1/artists/{0}" -f $ArtistID

    return (Invoke-SpotifyAPICall -URI $URI -Header @{"Authorization"= ('Bearer  ' + $AccessToken)})
}


Get-Artist


来源:https://stackoverflow.com/questions/47952689/powershell-invoke-webrequest-and-character-encoding

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!