问题
I've been using the Newtonsoft.Json
and Newtonsoft.Json.Fsharp
libraries to create a new JSON
serializer and stream to a file. I like the ability to stream to a file because I'm handling large files and, prior to streaming, often ran into memory issues.
I stream with a simple fx:
open Newtonsoft.Json
open Newtonsoft.Json.FSharp
open System.IO
let writeToJson (path: string) (obj: 'a) : unit =
let serialized = JsonConvert.SerializeObject(obj)
let fileStream = new StreamWriter(path)
let serializer = new JsonSerializer()
serializer.Serialize(fileStream, obj)
fileStream.Close()
This works great. My problem is that the JSON
string is then absolutely cluttered with stuff I don't need. For example,
let m =
[
(1.0M, None)
(2.0M, Some 3.0M)
(4.0M, None)
]
let makeType (tup: decimal * decimal option) = {FieldA = fst tup; FieldB = snd tup}
let y = List.map makeType m
Default.serialize y
val it : string =
"[{"FieldA": 1.0},
{"FieldA": 2.0,
"FieldB": {
"Case": "Some",
"Fields": [3.0]
}},
{"FieldA": 4.0}]"
If this is written to a JSON
and read into R
, there are nested dataframes and any of the Fields
associated with a Case
end up being a list
:
library(jsonlite)
library(dplyr)
q <- fromJSON("default.json")
x <-
q %>%
flatten()
x
> x
FieldA FieldB.Case FieldB.Fields
1 1 <NA> NULL
2 2 Some 3
3 4 <NA> NULL
> sapply(x, class)
FieldA FieldB.Case FieldB.Fields
"numeric" "character" "list"
I don't want to have to handle these things in R
. I can do it but it's annoying and, if there are files with many, many columns, it's silly.
This morning, I started looking at the Microsoft.FSharpLu.Json documentation. This library has a Compact.serialize
function. Quick tests suggest that this library will eliminate the need for nested dataframes and the list
s associated with any Case
and Field
columns. For example:
Compact.serialize y
val it : string =
"[{
"FieldA": 1.0
},
{
"FieldA": 2.0,
"FieldB": 3.0
},
{
"FieldA": 4.0
}
]"
When this string is read into R
,
q <- fromJSON("compact.json")
x <- q
x
> x
FieldA FieldB
1 1 NA
2 2 3
3 4 NA
> sapply(x, class)
FieldA FieldB
"numeric" "numeric
This is much simpler to handle in R. and I'd like to start using this library.
However, I don't know if I can get the Compact
serializer to serialize to a stream. I see .serializeToFile
, .desrializeStream
, and .tryDeserializeStream
, but nothing that can serialize to a stream. Does anyone know if Compact
can handle writing to a stream? How can I make that work?
回答1:
The helper to serialize to stream is missing from the Compact
module in FSharpLu.Json, but you should be able to do it by following the C# example from
http://www.newtonsoft.com/json/help/html/SerializingJSON.htm. Something along the lines:
let writeToJson (path: string) (obj: 'a) : unit =
let serializer = new JsonSerializer()
serializer.Converters.Add(new Microsoft.FSharpLu.Json.CompactUnionJsonConverter())
use sw = new StreamWriter(path)
use writer = new JsonTextWriter(sw)
serializer.Serialize(writer, obj)
来源:https://stackoverflow.com/questions/41983955/using-microsoft-fsharplu-to-serialize-json-to-a-stream