Importing JSON into R with in-line quotation marks

囚心锁ツ 2021-01-03 06:19

I'm attempting to read the following JSON file ("my_file.json") into R, which contains the following:

[{"id":"484","comment":"They call me "Bruce""}]


        
1 answer
  • 2021-01-03 07:01

    In R, string literals can be defined using either single or double quotes.
    For example:

    s1 <- 'hello'
    s2 <- "world"
    

    If you want to include double quotes inside a string literal delimited by double quotes, you need to escape the inner quotes with a backslash; otherwise the R parser cannot tell where the string ends (the same holds for single quotes inside a single-quoted literal).
    For example:

    s1 <- "Hello, my name is \"John\""
    

    If you print this string to the console with cat¹, or write it to a file, you get the actual "face" of the string, not its R literal representation:

    > cat("Hello, my name is \"John\"")
    Hello, my name is "John"
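
    The difference between a literal and the string it denotes can be checked directly. The following is a minimal sketch using only base R; the variable names are illustrative.

    ```r
    # Two literals, one string: the backslashes exist only in the source code.
    s1 <- "Hello, my name is \"John\""
    s2 <- 'Hello, my name is "John"'

    identical(s1, s2)  # TRUE: both literals denote the same characters
    nchar(s1)          # 24: the escaping backslashes are not counted

    print(s1)          # shows the R literal form, with \" escapes
    cat(s1, "\n")      # shows the actual characters
    writeLines(s1)     # also writes the actual characters, plus a newline
    ```

    Writing the literal with single quotes avoids escaping the double quotes at all, which can be easier to read when the text is JSON.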
    

    The JSON parser reads the actual "face" of the string, so in your case it sees:

    [{"id":"484","comment":"They call me "Bruce""}]
    

    and not the R literal representation:

    "[{\"id\":\"484\",\"comment\":\"They call me \"Bruce\"\"}]" 
    

    That said, the JSON parser also requires double quotes to be escaped when they appear inside a string.

    Hence, the content of your file should be changed to:

    [{"id":"484","comment":"They call me \"Bruce\""}]
    

    If you simply edit your file to add the backslashes, the JSON will parse without problems.

    Note that the corresponding R literal representation of that string would be:

    "[{\"id\":\"484\",\"comment\":\"They call me \\\"Bruce\\\"\"}]"
    

    And indeed, this works:

    > library(jsonlite)  # assumption: the data-frame output below matches jsonlite's fromJSON
    > fromJSON("[{\"id\":\"484\",\"comment\":\"They call me \\\"Bruce\\\"\"}]")
       id              comment
    1 484 They call me "Bruce"
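
    The same round trip through a file can be sketched as follows. This assumes the jsonlite package (the answer does not name the parser, but the data-frame output above is consistent with it) and uses a temporary file as a stand-in for my_file.json.

    ```r
    library(jsonlite)  # assumption: jsonlite provides fromJSON here

    # R literal with doubled escaping: \\\" becomes \" in the actual string.
    json_text <- "[{\"id\":\"484\",\"comment\":\"They call me \\\"Bruce\\\"\"}]"

    # Write the actual characters to a temporary file (hypothetical stand-in
    # for the corrected my_file.json).
    path <- tempfile(fileext = ".json")
    writeLines(json_text, path)

    # The file itself contains single backslashes before the inner quotes.
    cat(readLines(path), sep = "\n")

    # fromJSON accepts a file path as well as a JSON string.
    df <- fromJSON(path)
    cat(df$comment, "\n")
    ```

    Only the R source needs the triple backslash; the file on disk has exactly one backslash in front of each inner quote.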
    

    ¹ The default R print function (also invoked when you evaluate a value at the console) displays the corresponding R string literal. To show the actual string, use print(stringToPrint, quote = FALSE) or the cat function.


    EDIT (in response to @EngrStudent's comment on the possibility of automating quote escaping):

    A JSON parser cannot escape quotes automatically.
    Put yourself in the computer's shoes and imagine you had to parse this (unescaped) string as JSON: { "foo1" : " : "foo2" : "foo3" }

    I see at least three possible escapings that give valid JSON:
    { "foo1" : " : \"foo2\" : \"foo3" }
    { "foo1\" : " : "foo2\" : \"foo3" }
    { "foo1\" : \" : \"foo2" : "foo3" }

    As you can see from this small example, escaping is really necessary to avoid ambiguities.
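
    To see the ambiguity concretely, the three candidates above can be parsed and compared. This is a sketch that assumes the jsonlite package; the answer does not name the parser it uses.

    ```r
    library(jsonlite)  # assumption: any conforming JSON parser would do

    # The three escapings, written as single-quoted R literals so that only
    # the backslashes need doubling.
    candidates <- c(
      '{ "foo1" : " : \\"foo2\\" : \\"foo3" }',
      '{ "foo1\\" : " : "foo2\\" : \\"foo3" }',
      '{ "foo1\\" : \\" : \\"foo2" : "foo3" }'
    )

    for (json in candidates) {
      parsed <- fromJSON(json)
      # Each candidate is valid JSON but yields a different key/value pair.
      cat("key:   <", names(parsed), ">\n", sep = "")
      cat("value: <", parsed[[1]], ">\n\n", sep = "")
    }
    ```

    All three parse without error, yet each produces a different object, so a parser given the unescaped text has no basis for picking one.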

    If the strings you want to escape have a very particular structure, in which the double quotes that need escaping can be recognized without uncertainty, you can write your own automatic escaping procedure, but you will have to start from scratch because nothing is built in.
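
    As an illustration of such a hand-made procedure, here is a heuristic sketch in base R for the file in the question. The function name escape_inner_quotes is hypothetical. It assumes compact JSON with no whitespace around structural characters, and it assumes that no inner quote is adjacent to a comma, colon, bracket, or brace.

    ```r
    # Heuristic sketch: treat a double quote as structural only when it directly
    # follows one of { [ , : or directly precedes one of } ] , :
    # Every other double quote is assumed to belong to the text and is escaped.
    escape_inner_quotes <- function(json_text) {
      gsub('(?<![{\\[,:])"(?![}\\],:])', '\\\\"', json_text, perl = TRUE)
    }

    broken <- '[{"id":"484","comment":"They call me "Bruce""}]'
    fixed <- escape_inner_quotes(broken)
    cat(fixed, "\n")
    ```

    The result can then be passed to the JSON parser. The heuristic fails as soon as an inner quote sits next to a comma or colon (for example a comment such as: "hi", she said), which is exactly the kind of ambiguity described above, so it is only usable when you know the data cannot contain such cases.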
