Question
I'm trying to figure out a clean way to parse streaming JSON with Jackson. "Streaming" as in TCP, off-the-wire, in a piecemeal fashion, without any guarantee of receiving complete JSON data in a single read (there is no message framing either). Also, the goal is to do this asynchronously, which rules out relying on Jackson's handling of java.io.InputStreams. I came up with a functioning solution (see demonstration below), but I'm not particularly happy with it. Imperative style aside, I don't like the ungraceful handling of incomplete JSON by JsonParser#readValueAsTree. When processing a stream of bytes, incomplete data is absolutely normal and is not an exceptional scenario, so it's strange (and unacceptable) to see java.io.IOExceptions in Jackson's APIs. I also looked into using Jackson's TokenBuffer, but ran into similar issues. Is Jackson not really meant for processing true streaming JSON?
package com.example.jackson;

import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

import static java.nio.charset.StandardCharsets.UTF_8;
import static java.util.Collections.emptyList;

public class AsyncJsonParsing {

    public static void main(String[] args) {
        final AsyncJsonParsing parsing = new AsyncJsonParsing();
        parsing.runFirstScenario();
        parsing.runSecondScenario();
        parsing.runThirdScenario();
        parsing.runFourthScenario();
    }

    static final class ParsingOutcome {
        final List<JsonNode> roots; // list of parsed JSON objects and JSON arrays
        final byte[] remainder;

        ParsingOutcome(final List<JsonNode> roots, final byte[] remainder) {
            this.roots = roots;
            this.remainder = remainder;
        }
    }

    final byte[] firstMessage = "{\"message\":\"first\"}".getBytes(UTF_8);
    final byte[] secondMessage = "{\"message\":\"second\"}".getBytes(UTF_8);
    final byte[] leadingHalfOfFirstMessage = Arrays.copyOfRange(firstMessage, 0, firstMessage.length / 2);
    final byte[] trailingHalfOfFirstMessage = Arrays.copyOfRange(firstMessage, firstMessage.length / 2, firstMessage.length);
    final byte[] leadingHalfOfSecondMessage = Arrays.copyOfRange(secondMessage, 0, secondMessage.length / 2);
    final byte[] trailingHalfOfSecondMessage = Arrays.copyOfRange(secondMessage, secondMessage.length / 2, secondMessage.length);
    final ObjectMapper mapper = new ObjectMapper();

    void runFirstScenario() {
        // expectation: remainder = empty array and roots has a single element - parsed firstMessage
        final ParsingOutcome result = parse(firstMessage, mapper);
        report(result);
    }

    void runSecondScenario() {
        // expectation: remainder = leadingHalfOfFirstMessage and roots is empty
        final ParsingOutcome firstResult = parse(leadingHalfOfFirstMessage, mapper);
        report(firstResult);
        // expectation: remainder = empty array and roots has a single element - parsed firstMessage
        final ParsingOutcome secondResult = parse(concat(firstResult.remainder, trailingHalfOfFirstMessage), mapper);
        report(secondResult);
    }

    void runThirdScenario() {
        // expectation: remainder = leadingHalfOfSecondMessage and roots has a single element - parsed firstMessage
        final ParsingOutcome firstResult = parse(concat(firstMessage, leadingHalfOfSecondMessage), mapper);
        report(firstResult);
        // expectation: remainder = empty array and roots has a single element - parsed secondMessage
        final ParsingOutcome secondResult = parse(concat(firstResult.remainder, trailingHalfOfSecondMessage), mapper);
        report(secondResult);
    }

    void runFourthScenario() {
        // expectation: remainder = empty array and roots has two elements - parsed firstMessage, followed by parsed secondMessage
        final ParsingOutcome result = parse(concat(firstMessage, secondMessage), mapper);
        report(result);
    }

    static void report(final ParsingOutcome result) {
        System.out.printf("Remainder of length %d: %s%n", result.remainder.length, Arrays.toString(result.remainder));
        System.out.printf("Total of %d parsed JSON roots: %s%n", result.roots.size(), result.roots);
    }

    static byte[] concat(final byte[] left, final byte[] right) {
        final byte[] union = Arrays.copyOf(left, left.length + right.length);
        System.arraycopy(right, 0, union, left.length, right.length);
        return union;
    }

    static ParsingOutcome parse(final byte[] chunk, final ObjectMapper mapper) {
        final List<JsonNode> roots = new LinkedList<>();
        JsonParser parser;
        JsonNode root;
        try {
            parser = mapper.getFactory().createParser(chunk);
            root = parser.readValueAsTree();
        } catch (IOException e) {
            return new ParsingOutcome(emptyList(), chunk);
        }
        byte[] remainder = new byte[0];
        try {
            while (root != null) {
                roots.add(root);
                remainder = extractRemainder(parser);
                root = parser.readValueAsTree();
            }
        } catch (IOException e) {
            // fallthrough
        }
        return new ParsingOutcome(roots, remainder);
    }

    static byte[] extractRemainder(final JsonParser parser) {
        try {
            final ByteArrayOutputStream baos = new ByteArrayOutputStream();
            parser.releaseBuffered(baos);
            return baos.toByteArray();
        } catch (IOException e) {
            return new byte[0];
        }
    }
}
To elaborate a bit further, conceptually (at least in my mind), parsing of any streaming data boils down to a simple function which accepts an array of bytes and returns a tuple of (1) a possibly empty list of parsed results and (2) an array of remaining, currently-unparsable bytes. In the snippet above, this tuple is represented by an instance of ParsingOutcome.
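For illustration, here is a minimal sketch of that bytes-in, (results, remainder)-out function in which incomplete input is an ordinary return value rather than an exception. It only does framing: it scans for the boundaries of complete top-level values and leaves the actual parsing to Jackson. The names JsonChunkSplitter, Split and split are hypothetical and not part of Jackson. The sketch assumes UTF-8 input and that every root is a JSON object or array (as in the demonstration above), and it does not validate the JSON.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not a Jackson class): separates complete roots from a trailing partial one. */
final class JsonChunkSplitter {

    /** Same shape as ParsingOutcome, but holding the raw bytes of each complete root. */
    static final class Split {
        final List<byte[]> completeRoots;
        final byte[] remainder;

        Split(final List<byte[]> completeRoots, final byte[] remainder) {
            this.completeRoots = completeRoots;
            this.remainder = remainder;
        }
    }

    static Split split(final byte[] chunk) {
        final List<byte[]> roots = new ArrayList<>();
        int depth = 0;            // current nesting level of {} and []
        boolean inString = false; // true while between an opening and closing quote
        boolean escaped = false;  // true right after a backslash inside a string
        int start = 0;            // index where the root currently being scanned began
        int consumed = 0;         // index just past the last complete root

        for (int i = 0; i < chunk.length; i++) {
            final byte b = chunk[i];
            if (inString) {
                // Inside a string, only escapes and the closing quote matter.
                if (escaped) {
                    escaped = false;
                } else if (b == '\\') {
                    escaped = true;
                } else if (b == '"') {
                    inString = false;
                }
            } else if (b == '"') {
                inString = true;
            } else if (b == '{' || b == '[') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if ((b == '}' || b == ']') && depth > 0) {
                depth--;
                if (depth == 0) {
                    // A top-level value just closed: emit its bytes.
                    roots.add(Arrays.copyOfRange(chunk, start, i + 1));
                    consumed = i + 1;
                }
            }
        }
        // Everything after the last complete root is carried over to the next read.
        return new Split(roots, Arrays.copyOfRange(chunk, consumed, chunk.length));
    }

    public static void main(String[] args) {
        // Mirrors the third scenario: one complete message followed by half of another.
        final byte[] chunk = "{\"message\":\"first\"}{\"message\":\"sec".getBytes(StandardCharsets.UTF_8);
        final Split result = split(chunk);
        System.out.println("complete roots: " + result.completeRoots.size());
        System.out.println("remainder: " + new String(result.remainder, StandardCharsets.UTF_8));
    }
}
```

Each element of completeRoots could then be handed to ObjectMapper#readTree(byte[]), where an IOException would signal genuinely malformed JSON rather than a short read. Like the demonstration above, this sketch rescans the remainder on every call; keeping depth, inString and escaped as fields between calls would avoid that.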
Source: https://stackoverflow.com/questions/38416158/how-to-properly-parse-streaming-json-with-jackson