ConvertTo-Json and ConvertFrom-Jason with special characters

℡╲_俬逩灬. 提交于 2019-12-21 14:38:28

问题


I have a file containing some properties which value of some of them contains escape characters, for example some Urls and Regex patterns.

When reading the content and converting back to the json, with or without unescaping, the content is not correct. If I convert back to json with unescaping, some regular expression break, if I convert with unescaping, urls and some regular expressions will break.

How can I solve the problem?

Minimal Complete Verifiable Example

Here are some simple code blocks to allow you simply reproduce the problem:

Content

$fileContent = 
@"
{
    "something":  "http://domain/?x=1&y=2",
    "pattern":  "^(?!(\\`|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}
"@

With Unescape

If I read the content and then convert the content back to json using following command:

$fileContent | ConvertFrom-Json | ConvertTo-Json | %{[regex]::Unescape($_)}

The output (which is wrong) would be:

{
    "something":  "http://domain/?x=1&y=2",
    "pattern":  "^(?!(\|\~|\!|\@|\#|\$|\||\\|\'|\")).*"
}

Without Unescape

If I read the content and then convert the content back to json using following command:

$fileContent | ConvertFrom-Json | ConvertTo-Json 

The output (which is wrong) would be:

{
    "something":  "http://domain/?x=1\u0026y=2",
    "pattern":  "^(?!(\\|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\\u0027|\\\")).*"
}

Expected Result

The expected result should be same as the input file content.


回答1:


I decided to not use Unscape, instead replace the unicode \uxxxx characters with their string values and now it works properly:

$fileContent = 
@"
{
    "something":  "http://domain/?x=1&y=2",
    "pattern":  "^(?!(\\`|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}
"@

$fileContent | ConvertFrom-Json | ConvertTo-Json | %{
    [Regex]::Replace($_, 
        "\\u(?<Value>[a-zA-Z0-9]{4})", {
            param($m) ([char]([int]::Parse($m.Groups['Value'].Value,
                [System.Globalization.NumberStyles]::HexNumber))).ToString() } )}

Which generates the expected output:

{
    "something":  "http://domain/?x=1&y=\\2",
    "pattern":  "^(?!(\\|\\~|\\!|\\@|\\#|\\$|\\||\\\\|\\'|\\\")).*"
}



回答2:


If you don't want to rely on Regex (from @Reza Aghaei's answer), you could import the Newtonsoft JSON library. The benefit is the default StringEscapeHandling property which escapes control characters only. Another benefit is avoiding the potentially dangerous string replacements you would be doing with Regex.

This StringEscapeHandling is also the default handling of PowerShell Core (version 6 and up) because they started to use Newtonsoft internally since then. So another alternative would be to use ConvertFrom-Json and ConvertTo-Json from PowerShell Core.

Your code would look something like this if you import the Newtonsoft JSON library:

[Reflection.Assembly]::LoadFile("Newtonsoft.Json.dll")

$json = Get-Content -Raw -Path file.json -Encoding UTF8 # read file
$unescaped = [Newtonsoft.Json.Linq.JObject]::Parse($json) # similar to ConvertFrom-Json

$escapedElementValue = [Newtonsoft.Json.JsonConvert]::ToString($unescaped.apiName.Value) # similar to ConvertTo-Json
$escapedCompleteJson = [Newtonsoft.Json.JsonConvert]::SerializeObject($unescaped) # similar to ConvertTo-Json

Write-Output "Variable passed = $escapedElementValue"
Write-Output "Same JSON as Input = $escapedCompleteJson"


来源:https://stackoverflow.com/questions/47779157/convertto-json-and-convertfrom-jason-with-special-characters

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!