I am looking for a JSON parser that lets me iterate through the JSON objects in a large JSON file (a few hundred MB in size). I tried JsonTextReader from Json.NET
This is one of the use cases I contemplated for my own parser/deserializer.
I recently put together a simple example that feeds the parser JSON text read through a StreamReader and deserializes this JSON shape:
{
    "fathers" : [
        {
            "id" : 0,
            "married" : true,
            "name" : "John Lee",
            "sons" : [
                {
                    "age" : 15,
                    "name" : "Ronald"
                }
            ],
            "daughters" : [
                {
                    "age" : 7,
                    "name" : "Amy"
                },
                {
                    "age" : 29,
                    "name" : "Carol"
                },
                {
                    "age" : 14,
                    "name" : "Barbara"
                }
            ]
        },
        {
            "id" : 1,
            "married" : false,
            "name" : "Kenneth Gonzalez",
            "sons" : [
            ],
            "daughters" : [
            ]
        },
        {
            "id" : 2,
            "married" : false,
            "name" : "Larry Lee",
            "sons" : [
                {
                    "age" : 4,
                    "name" : "Anthony"
                },
                {
                    "age" : 2,
                    "name" : "Donald"
                }
            ],
            "daughters" : [
                {
                    "age" : 7,
                    "name" : "Elizabeth"
                },
                {
                    "age" : 15,
                    "name" : "Betty"
                }
            ]
        },
        //(... etc)
    ]
}
... into these POCOs:
https://github.com/ysharplanguage/FastJsonParser#POCOs
(specifically: "FathersData", "Father", "Son", and "Daughter")
That sample also presents:
(1) a sample filter on the relative item index in the Father[] array (e.g., to fetch only the first 10), and
(2) how to dynamically populate a property on each father's daughters as soon as the deserialization of that father returns (by means of a delegate that the caller passes to the parser's Parse method as a callback).
For the rest of the details, see:
ParserTests.cs : static void FilteredFatherStreamTestDaughterMaidenNamesFixup()
(lines #829 to #904)
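
To make the two points above concrete without reproducing the linked test, here is a minimal sketch of the same idea: stream the fathers one at a time, stop after the first N, and run a caller-supplied delegate on each father as soon as it has been deserialized. Note the assumptions: it uses Json.NET's JsonTextReader and JsonSerializer (the library mentioned in the question) rather than my parser's Parse method, the POCOs are inferred from the JSON shape above rather than copied from the repository, and the names FatherStream, Read, maxCount, onFather and maidenName are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical POCOs inferred from the JSON sample (not the repository's definitions).
public class Child
{
    public int age { get; set; }
    public string name { get; set; }
}

public class Son : Child { }

public class Daughter : Child
{
    // Assumed property: absent from the JSON, filled in by the callback.
    public string maidenName { get; set; }
}

public class Father
{
    public int id { get; set; }
    public bool married { get; set; }
    public string name { get; set; }
    public List<Son> sons { get; set; }
    public List<Daughter> daughters { get; set; }
}

public static class FatherStream
{
    // Lazily yields at most maxCount items of the "fathers" array, calling
    // onFather (if not null) on each item right after it is deserialized.
    public static IEnumerable<Father> Read(TextReader input, int maxCount, Action<Father> onFather)
    {
        using (var reader = new JsonTextReader(input))
        {
            var serializer = new JsonSerializer();

            // Skip tokens until the first property named "fathers".
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.PropertyName &&
                    (string)reader.Value == "fathers")
                {
                    break;
                }
            }

            // The next token must open the array; otherwise there is nothing to yield.
            if (!reader.Read() || reader.TokenType != JsonToken.StartArray)
            {
                yield break;
            }

            int count = 0;
            // Each Read() lands on the next item's StartObject, or on EndArray.
            while (count < maxCount && reader.Read() &&
                   reader.TokenType == JsonToken.StartObject)
            {
                // Materializes only the current item; the reader ends on its EndObject.
                Father father = serializer.Deserialize<Father>(reader);
                if (onFather != null)
                {
                    onFather(father);
                }
                count++;
                yield return father;
            }
        }
    }
}
```

A caller would wrap the file in a StreamReader and enumerate, e.g. FatherStream.Read(new StreamReader("fathers.json"), 10, f => { foreach (var d in f.daughters) d.maidenName = f.name.Substring(f.name.LastIndexOf(' ') + 1); }), where the file name is a placeholder. Because the method is an iterator, only one father (with its sons and daughters) is held in memory at a time, and nothing past the requested count is read.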
The performance I observe on my humble laptop (*) when parsing JSON files of ~ 12MB to ~ 180MB and deserializing an arbitrary subset of their content into POCOs
(or into loosely-typed dictionaries of (string, object) key/value pairs, which are also supported)
is in the ballpark of ~ 20MB/sec to 40MB/sec (**)
(e.g., ~ 300 milliseconds for the 12MB JSON file, into POCOs, which works out to about 40MB/sec).
More detailed info is available here:
https://github.com/ysharplanguage/FastJsonParser#Performance
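
The throughput figures quoted here are simply input size divided by wall-clock time. A minimal sketch of that arithmetic, assuming 1MB = 1024 * 1024 bytes; the Throughput class and its method names are hypothetical helpers for illustration, not the benchmark code from the repository:

```csharp
using System;
using System.Diagnostics;

public static class Throughput
{
    // Converts a byte count and an elapsed time into MB/sec
    // (assumption: 1MB = 1024 * 1024 bytes).
    public static double MegabytesPerSecond(long bytes, TimeSpan elapsed)
    {
        return bytes / (1024.0 * 1024.0) / elapsed.TotalSeconds;
    }

    // Times an arbitrary parsing action over an input of the given size in bytes.
    public static double Measure(long bytes, Action parse)
    {
        var watch = Stopwatch.StartNew();
        parse();
        watch.Stop();
        return MegabytesPerSecond(bytes, watch.Elapsed);
    }
}
```

With a 12MB input and 300 milliseconds elapsed, MegabytesPerSecond gives roughly 40, which is the upper end of the range quoted above.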
HTH,
(*) (running Win7 64-bit @ 2.5GHz)
(**) (the throughput depends heavily on the shape and complexity of the input JSON, e.g., the nesting depth of sub-objects, among other factors)