How to extract the schema from an Avro file in Python

落爺英雄遲暮 submitted on 2019-11-30 08:53:06

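I use Python 3.4 and the Avro package 1.7.7.

To read the schema from the file, use:

import avro.datafile
import avro.io
reader = avro.datafile.DataFileReader(open('file_name.avro', "rb"), avro.io.DatumReader())
# reader.meta is the file metadata; the schema is stored under its 'avro.schema' key
schema = reader.meta
print(schema)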

A direct examination of /usr/local/lib/python2.7/site-packages/avro/datafile.py reveals the answer:

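import avro.datafile
import avro.io
# input is an already opened binary file object, e.g. open('file_name.avro', 'rb')
reader = avro.datafile.DataFileReader(input, avro.io.DatumReader())
schema = reader.datum_reader.writers_schema
print(schema)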

Curiously, in Java there is a special method for that: reader.getSchema().
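
If the avro package is not available, the schema can also be pulled directly from the file header. The sketch below follows the Avro object container file layout (the 4-byte magic `Obj\x01`, then a metadata map of string keys to bytes values, then a 16-byte sync marker). The function names `read_long`, `read_bytes` and `read_header_meta` are made up for this example, and the in-memory sample header is hypothetical; it stands in for a real file opened with `open('file_name.avro', 'rb')`.

```python
import io
import json

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one zigzag-encoded, variable-length Avro long."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of header")
        value = byte[0]
        result |= (value & 0x7F) << shift
        if not value & 0x80:  # high bit clear: last byte of this number
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of header")
    return data


def read_header_meta(stream):
    """Return the header metadata map as {str: bytes}."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            read_long(stream)  # size in bytes, not needed here
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


# Hypothetical in-memory header used only for illustration.
schema_json = b'{"type": "string"}'
sample = (
    MAGIC
    + b"\x04"                    # one block with 2 entries (zigzag of 2 is 4)
    + b"\x14" + b"avro.codec"    # key, length 10
    + b"\x08" + b"null"          # value, length 4
    + b"\x16" + b"avro.schema"   # key, length 11
    + bytes([2 * len(schema_json)]) + schema_json
    + b"\x00"                    # end of the metadata map
    + b"\x00" * 16               # placeholder sync marker
)
meta = read_header_meta(io.BytesIO(sample))
schema = json.loads(meta["avro.schema"].decode("utf-8"))
print(schema)
```

With a real file, passing the opened binary file object to `read_header_meta` in place of the `io.BytesIO` sample should return the same two keys that `reader.meta` exposes.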

In my case, to get the schema as a "consumable" Python dictionary containing useful information such as the schema name, I did the following:

reader: DataFileReader = DataFileReader(open(avro_file, 'rb'), DatumReader())
schema: dict = json.loads(reader.meta.get('avro.schema').decode('utf-8'))

On its own, reader.meta is not very useful: it is a dictionary with two keys, avro.codec and avro.schema, whose values are both bytes objects, so I had to decode and parse the schema in order to access its properties.
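
The decoding step can be tried without an Avro file at hand. Below is a minimal sketch, assuming a metadata dictionary shaped like the one described above (string keys, bytes values). The helper name `schema_from_meta` and the sample record schema are hypothetical and only serve as an illustration.

```python
import json


def schema_from_meta(meta):
    """Parse the 'avro.schema' entry of a metadata dict into a Python dict."""
    raw = meta["avro.schema"]
    if isinstance(raw, bytes):  # values are bytes objects in the case above
        raw = raw.decode("utf-8")
    return json.loads(raw)


# Hypothetical metadata, standing in for reader.meta.
sample_meta = {
    "avro.codec": b"null",
    "avro.schema": (
        b'{"type": "record", "name": "User", "namespace": "example", '
        b'"fields": [{"name": "id", "type": "long"}]}'
    ),
}

schema = schema_from_meta(sample_meta)
print(schema["name"])                          # schema name
print([f["name"] for f in schema["fields"]])   # field names
```

Once parsed, the schema is an ordinary dictionary, so the name, namespace and field list can be read with normal key access.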
