Using Microsoft.FSharpLu to serialize JSON to a stream

Posted by 无人久伴 on 2020-01-05 05:48:14

Question


I've been using the Newtonsoft.Json and Newtonsoft.Json.Fsharp libraries to create a new JSON serializer and stream to a file. I like the ability to stream to a file because I'm handling large files and, prior to streaming, often ran into memory issues.

I stream with a simple function:

open Newtonsoft.Json
open Newtonsoft.Json.FSharp 
open System.IO

let writeToJson (path: string) (obj: 'a) : unit =
    // `use` disposes (flushes and closes) the writer even if Serialize throws
    use fileStream = new StreamWriter(path)
    let serializer = new JsonSerializer()

    // Write straight to the file; no intermediate JSON string is built in memory
    serializer.Serialize(fileStream, obj)

This works great. My problem is that the JSON string is then absolutely cluttered with stuff I don't need. For example,

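open Microsoft.FSharpLu.Json

// Record type assumed from the field names used below; the name Row is illustrative
type Row = { FieldA: decimal; FieldB: decimal option }

let m = 
    [
        (1.0M, None)
        (2.0M, Some 3.0M)
        (4.0M, None)
    ]

let makeType (tup: decimal * decimal option) = {FieldA = fst tup; FieldB = snd tup}

let y = List.map makeType m

Default.serialize y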

val it : string =
  "[{"FieldA": 1.0},
    {"FieldA": 2.0,
     "FieldB": {
        "Case": "Some",
        "Fields": [3.0]
    }},
    {"FieldA": 4.0}]"

If this is written to a JSON file and read into R, the result contains nested data frames, and any Fields associated with a Case end up as a list column:

library(jsonlite)
library(dplyr)

q <- fromJSON("default.json")

x <- 
    q %>%
    flatten()

x

> x
  FieldA FieldB.Case FieldB.Fields
1      1        <NA>          NULL
2      2        Some             3
3      4        <NA>          NULL
> sapply(x, class)
       FieldA   FieldB.Case FieldB.Fields 
    "numeric"   "character"        "list"

I don't want to have to handle these things in R. I can do it but it's annoying and, if there are files with many, many columns, it's silly.

This morning, I started looking at the Microsoft.FSharpLu.Json documentation. This library has a Compact.serialize function. Quick tests suggest that this library will eliminate the need for nested dataframes and the lists associated with any Case and Field columns. For example:

Compact.serialize y

val it : string =
  "[{
    "FieldA": 1.0
    },
  {
    "FieldA": 2.0,
    "FieldB": 3.0
  },
  {
    "FieldA": 4.0
  }
  ]"

When this string is read into R,

q <- fromJSON("compact.json")

x <- q
x
> x
  FieldA FieldB
1      1     NA
2      2      3
3      4     NA
> sapply(x, class)
   FieldA    FieldB 
"numeric" "numeric"

This is much simpler to handle in R, and I'd like to start using this library.

However, I don't know if I can get the Compact serializer to serialize to a stream. I see .serializeToFile, .deserializeStream, and .tryDeserializeStream, but nothing that can serialize to a stream. Does anyone know if Compact can handle writing to a stream? How can I make that work?


Answer 1:


The helper to serialize to a stream is missing from the Compact module in FSharpLu.Json, but you should be able to do it by following the C# example from http://www.newtonsoft.com/json/help/html/SerializingJSON.htm. Something along these lines:

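open System.IO
open Newtonsoft.Json

let writeToJson (path: string) (obj: 'a) : unit =
    let serializer = new JsonSerializer()
    // Register FSharpLu's compact converter so options and unions are written without Case/Fields
    serializer.Converters.Add(new Microsoft.FSharpLu.Json.CompactUnionJsonConverter())
    // `use` disposes both writers, flushing and closing the file, when the function returns
    use sw = new StreamWriter(path)
    use writer = new JsonTextWriter(sw)
    serializer.Serialize(writer, obj)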
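
The same idea can target any writable Stream instead of a file path. The following is a minimal sketch, not part of FSharpLu: the helper name serializeToStream and the record type Row are hypothetical, and the NullValueHandling.Ignore setting is an assumption about how Compact.serialize is configured internally. It is added here so that None fields are likely to be omitted, as in the Compact output shown in the question, rather than written as null.

```fsharp
#r "nuget: Newtonsoft.Json"
#r "nuget: Microsoft.FSharpLu.Json"

open System.IO
open Newtonsoft.Json
open Microsoft.FSharpLu.Json

// Hypothetical record type mirroring the question's example data.
type Row = { FieldA: decimal; FieldB: decimal option }

// Hypothetical helper (not an FSharpLu function): writes `obj` as compact
// JSON directly to `stream` without building the full JSON string first.
let serializeToStream (stream: Stream) (obj: 'a) : unit =
    let serializer = JsonSerializer()
    // Assumption: skipping nulls matches what the Compact module does,
    // so that None fields are left out of the output.
    serializer.NullValueHandling <- NullValueHandling.Ignore
    serializer.Converters.Add(CompactUnionJsonConverter())
    // Disposing the StreamWriter also closes the underlying stream.
    use sw = new StreamWriter(stream)
    use writer = new JsonTextWriter(sw)
    serializer.Serialize(writer, obj)
```

For a file, something like serializeToStream (File.Create "compact.json") y should behave like the writeToJson function above. Note that the helper closes the stream it is given; if the caller needs to keep the stream open, the StreamWriter constructor overload with leaveOpen would be the thing to look at.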


Source: https://stackoverflow.com/questions/41983955/using-microsoft-fsharplu-to-serialize-json-to-a-stream
