Is there a way to programmatically convert JSON to AVRO Schema?

Submitted by 早过忘川 on 2019-12-04 07:14:45

You can use the Kite SDK's JsonUtil class to infer an Avro schema from a JSON input.

https://github.com/kite-sdk/kite/blob/master/kite-data/kite-data-core/src/main/java/org/kitesdk/data/spi/JsonUtil.java#L539

Example:

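    import org.kitesdk.data.spi.JsonUtil;

    // Sample JSON document whose structure the schema is inferred from
    String json = "{\n" +
            "    \"id\": 1,\n" +
            "    \"name\": \"A green door\",\n" +
            "    \"price\": 12.50,\n" +
            "    \"tags\": [\"home\", \"green\"]\n" +
            "}\n";
    // Parse the JSON, infer a record schema named "myschema", and print it
    String avroSchema = JsonUtil.inferSchema(JsonUtil.parse(json), "myschema").toString();
    System.out.println(avroSchema);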

Result:

    {
       "type":"record",
       "name":"myschema",
       "fields":[
          {
             "name":"id",
             "type":"int",
             "doc":"Type inferred from '1'"
          },
          {
             "name":"name",
             "type":"string",
             "doc":"Type inferred from '\"A green door\"'"
          },
          {
             "name":"price",
             "type":"double",
             "doc":"Type inferred from '12.5'"
          },
          {
             "name":"tags",
             "type":{
                "type":"array",
                "items":"string"
             },
             "doc":"Type inferred from '[\"home\",\"green\"]'"
          }
       ]
    }

JsonUtil lives in the kite-data-core module of the Kite SDK, so that is the Maven dependency to add.
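
To make the inference step less of a black box, here is a minimal sketch of the general idea: walk the parsed JSON and map each value to an Avro type. It is written in Python using only the standard library, and it is not Kite's implementation. The function names infer_avro_type and infer_record are hypothetical, and the rules are simplified assumptions: arrays are taken to be non-empty and homogeneous, nulls are not turned into unions, and a nested object reuses its field name as the record name.

```python
import json


def infer_avro_type(value, name):
    """Map one parsed JSON value to an Avro type (simplified, assumed rules)."""
    if value is None:
        return "null"
    # bool must be checked before int because bool is a subclass of int
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        # Avro "int" is a 32-bit signed integer; wider values need "long"
        return "int" if -2**31 <= value < 2**31 else "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        # Assumption: the first element is representative of the whole array
        if not value:
            raise ValueError(f"cannot infer item type of empty array '{name}'")
        return {"type": "array", "items": infer_avro_type(value[0], name)}
    if isinstance(value, dict):
        # Assumption: the field name is a valid Avro record name
        return infer_record(value, name)
    raise TypeError(f"unsupported value for field '{name}': {value!r}")


def infer_record(obj, name):
    """Build an Avro record schema (as a dict) from a parsed JSON object."""
    fields = [{"name": key, "type": infer_avro_type(val, key)}
              for key, val in obj.items()]
    return {"type": "record", "name": name, "fields": fields}


sample = '{"id": 1, "name": "A green door", "price": 12.50, "tags": ["home", "green"]}'
schema = infer_record(json.loads(sample), "myschema")
print(json.dumps(schema, indent=2))
```

For the sample document this yields the same field types as the result shown earlier (int, string, double, and an array of strings), without the "doc" annotations. Inferring from a single sample is inherently lossy: a field that is sometimes null or sometimes missing will not show up as optional, so the generated schema is best treated as a starting point to review by hand.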

If you want to avoid creating a dedicated Avro schema for every JSON format, you can use the rec-avro package.

It lets you take any Python data structure, including parsed XML or JSON, and store it in Avro without needing a dedicated schema.

I tested it with Python 3.

You can install it with pip3 install rec-avro, or see the code and docs at https://github.com/bmizhen/rec-avro

I posted a JSON-to-Avro code example here: https://stackoverflow.com/a/55444481/6654219

This online tool works well: paste in your JSON and copy out the generated Avro schema.

https://toolslick.com/generation/metadata/avro-schema-from-json
