How to fix Expected start-union. Got VALUE_NUMBER_INT when converting JSON to Avro on the command line?

♀尐吖头ヾ submitted on 2019-11-27 07:52:17

According to the explanation by Doug Cutting,

Avro's JSON encoding requires that non-null union values be tagged with their intended type. This is because unions like ["bytes","string"] and ["int","long"] are ambiguous in JSON, the first are both encoded as JSON strings, while the second are both encoded as JSON numbers.

http://avro.apache.org/docs/current/spec.html#json_encoding

Thus your record must be encoded as:

{"name": "Alyssa", "favorite_number": {"int": 7}, "favorite_color": null}
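To make the tagging rule concrete, here is a minimal sketch in plain Java with no Avro dependency. The class UnionTagging and the helper tagUnion are hypothetical names used only for illustration, and the sketch assumes favorite_number and favorite_color are nullable unions such as ["int","null"] and ["string","null"]. It does no string escaping, so treat it as a demonstration of the shape of the output rather than a real encoder.

```java
public class UnionTagging {
    // Hypothetical helper: wraps a non-null union value in a single-key JSON
    // object named after its branch type; a null value stays a bare JSON null.
    public static String tagUnion(String branchType, Object value) {
        if (value == null) {
            return "null";
        }
        // Quote strings, print everything else as-is (no escaping in this sketch).
        String rendered = (value instanceof String)
                ? "\"" + value + "\""
                : String.valueOf(value);
        return "{\"" + branchType + "\": " + rendered + "}";
    }

    public static void main(String[] args) {
        String record = "{\"name\": \"Alyssa\", "
                + "\"favorite_number\": " + tagUnion("int", 7) + ", "
                + "\"favorite_color\": " + tagUnion("string", null) + "}";
        System.out.println(record);
    }
}
```

Running main prints the same record as above. In practice you would let Avro's own JSON encoder produce this form instead of building the string by hand.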

There is a new JSON encoder in the works that should address this common issue:

https://issues.apache.org/jira/browse/AVRO-1582

https://github.com/zolyfarkas/avro

I have implemented a union and its validation: create a union schema and pass its values through Postman. The registry URL is the URL you specify in your Kafka properties. You can also pass dynamic values to your schema.

    // Assumed imports: Spring Web, org.json (for JSONObject), and Avro.
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.json.JSONObject;
    import org.springframework.http.*;
    import org.springframework.web.client.RestTemplate;

    // registryUrl, topic, version and jsonResult (the incoming JSON payload) are supplied by the caller.
    RestTemplate template = new RestTemplate();
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.APPLICATION_JSON);
    HttpEntity<String> entity = new HttpEntity<>(headers);

    // Fetch the registered schema for this subject and version from the schema registry.
    String url = registryUrl + "/subjects/" + topic + "/versions/" + version;
    ResponseEntity<String> response = template.exchange(url, HttpMethod.GET, entity, String.class);
    String schemaJson = new JSONObject(response.getBody()).get("schema").toString();
    Schema schema = new Schema.Parser().parse(schemaJson);

    // Copy each schema field from the payload into a generic record.
    JSONObject payload = new JSONObject(jsonResult);
    GenericRecord genericRecord = new GenericData.Record(schema);
    schema.getFields().forEach(field -> genericRecord.put(field.name(), payload.get(field.name())));

    // true if the record, including its union fields, conforms to the schema.
    boolean valid = GenericData.get().validate(schema, genericRecord);

As @Emre-Sevinc has pointed out, the issue is with the encoding of your Avro record.

To be more specific:

Don't do this:

    val jsonRecord = avroGenericRecord.toString

Instead, do this:

    val writer = new GenericDatumWriter[GenericRecord](avroSchema)
    val baos = new ByteArrayOutputStream
    val jsonEncoder = EncoderFactory.get.jsonEncoder(avroSchema, baos)
    writer.write(avroGenericRecord, jsonEncoder)
    jsonEncoder.flush()

    val jsonRecord = baos.toString("UTF-8")

You'll also need the following imports:

import java.io.ByteArrayOutputStream
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}

After you do this, you'll get jsonRecord with non-null union values tagged with their intended type.

Hope this helps!
