avro

Kafka Avro Consumer with Decoder issues

送分小仙女 submitted on 2019-12-17 16:38:06
Question: When I attempted to run the Kafka Consumer with Avro over data using my schema, it returned "AvroRuntimeException: Malformed data. Length is negative: -40". I see others have had similar issues converting a byte array to JSON, with Avro write and read, and with the Kafka Avro Binary *coder. I have also referenced this Consumer Group Example. All of these have been helpful, but none has resolved the error so far. It works up until this part of the code (line 73): Decoder decoder =
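The "Length is negative" failure often indicates that the decoder is reading bytes that were not written by a matching Avro binary encoder for the same schema (for example, the producer wrote a container file or a schema-registry wire format instead of a raw datum). A minimal sketch of decoding a raw Avro-encoded Kafka value, assuming the producer used a plain BinaryEncoder with the same writer schema (class and variable names here are illustrative):

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;

public class RawAvroDecoder {
    // schemaJson is the writer's schema as a JSON string; messageBytes is the raw Kafka value.
    public static GenericRecord decode(String schemaJson, byte[] messageBytes) throws java.io.IOException {
        Schema schema = new Schema.Parser().parse(schemaJson);
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(messageBytes, null);
        return reader.read(null, decoder);
    }
}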

How to fix Expected start-union. Got VALUE_NUMBER_INT when converting JSON to Avro on the command line?

霸气de小男生 submitted on 2019-12-17 09:36:12
Question: I'm trying to validate a JSON file against an Avro schema and write the corresponding Avro file. First, I defined the following Avro schema, named user.avsc: {"namespace": "example.avro", "type": "record", "name": "user", "fields": [ {"name": "name", "type": "string"}, {"name": "favorite_number", "type": ["int", "null"]}, {"name": "favorite_color", "type": ["string", "null"]} ] } Then I created a user.json file: {"name": "Alyssa", "favorite_number": 256, "favorite_color": null} And then tried
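This error usually comes from Avro's JSON encoding of unions: a non-null value for a union field has to be wrapped in an object keyed by its branch type. A sketch of how user.json would look under that rule, with the avro-tools fromjson invocation as it is commonly shown (adjust jar version and paths as needed):

{"name": "Alyssa", "favorite_number": {"int": 256}, "favorite_color": null}

java -jar avro-tools-1.7.7.jar fromjson --schema-file user.avsc user.json > user.avro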

MapReduce Job to Collect All Unique Fields in HDFS Directory of JSON

帅比萌擦擦* submitted on 2019-12-14 02:38:38
Question: My question is, in essence, an application of this referenced question: Convert JSON to Parquet. I find myself in the rather unique position of having to semi-manually curate an Avro schema for the superset of fields contained in JSON files (composed of arbitrary combinations of known resources) in an HDFS directory. This is part of an ETL pipeline I am trying to develop to convert these files to Parquet for much more efficient and easier processing in Spark. I have never written a MapReduce program
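For the field-collection step itself, one common shape is a map-only pass that emits every field path it sees, with a reducer that keeps only the distinct keys. A rough sketch, assuming one JSON document per input line and Jackson on the classpath (class and field names here are illustrative, not from the question):

import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class FieldPathMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private static final ObjectMapper JSON = new ObjectMapper();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        emit("", JSON.readTree(value.toString()), context);
    }

    // Recursively emit dotted field paths; the reducer only has to deduplicate the keys.
    private void emit(String prefix, JsonNode node, Context context)
            throws IOException, InterruptedException {
        Iterator<Map.Entry<String, JsonNode>> fields = node.fields();
        while (fields.hasNext()) {
            Map.Entry<String, JsonNode> field = fields.next();
            String path = prefix.isEmpty() ? field.getKey() : prefix + "." + field.getKey();
            context.write(new Text(path), NullWritable.get());
            if (field.getValue().isObject()) {
                emit(path, field.getValue(), context);
            }
        }
    }
}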

Writing to Avro Data file

一世执手 submitted on 2019-12-14 02:29:50
Question: The following code simply writes data in Avro format, then reads back and displays the same data from the Avro file it wrote. I was just trying out the example in the Hadoop Definitive Guide book. I was able to execute it the first time, but after that I got the following error, so I am not sure what mistake I am making. This is the exception: Exception in thread "main" java.io.EOFException: No content to map to Object due to end of input at org.codehaus.jackson.map.ObjectMapper.
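That particular EOFException is thrown by Avro's schema parser (which uses Jackson), so it typically means the .avsc file being parsed is empty or truncated rather than that the written data file is broken. A minimal write-then-read sketch in the style of the Definitive Guide example, assuming a valid, non-empty schema file (file and field names are placeholders):

import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class AvroFileRoundTrip {
    public static void main(String[] args) throws Exception {
        // Parsing an empty or truncated .avsc is what produces the EOFException above.
        Schema schema = new Schema.Parser().parse(new File("StringPair.avsc"));

        GenericRecord datum = new GenericData.Record(schema);
        datum.put("left", "L");
        datum.put("right", "R");

        File file = new File("pairs.avro");
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, file);
            writer.append(datum);
        }

        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
            for (GenericRecord record : reader) {
                System.out.println(record);
            }
        }
    }
}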

Scala pickling: Simple custom pickler for my own class?

荒凉一梦 submitted on 2019-12-14 02:18:38
Question: I am trying to pickle some relatively simple-structured but large and slow-to-create classes in a Scala NLP (natural language processing) app of mine. Because there's lots of data, it needs to pickle and, especially, unpickle quickly and without bloat. Java serialization evidently sucks in this regard. I know about Kryo, but I've never used it. I've also run into Apache Avro, which seems similar, although I'm not quite sure why it's not normally mentioned as a suitable solution. Neither is Scala

avro php - reading from buffer

回眸只為那壹抹淺笑 submitted on 2019-12-13 19:14:41
Question: I am writing a PHP script that uses Avro to deserialize data. I receive the data as a buffer containing an Avro binary stream. In the Avro PHP example, I only see an example of reading the data from a file, not from a binary buffer. How can I deserialize the data? What I am looking for is a binary decoder for Avro. Answer 1: $binaryBuffer = <get_avro_serialized_record> $writersSchema = '{ "type" : "record", "name" : "Example", "namespace" : "com.example.record", "fields" : [ { "name" : "userId", "type" : "int" .......

JsonMappingException when serializing avro generated object to json

情到浓时终转凉″ submitted on 2019-12-13 13:07:34
Question: I used avro-tools to generate Java classes from .avsc files, using: java.exe -jar avro-tools-1.7.7.jar compile -string schema myfile.avsc Then I tried to serialize such objects to JSON with ObjectMapper, but I always got a JsonMappingException saying "not an enum" or "not a union". In my test I create the generated object using its builder or constructor. I got such exceptions for objects of different classes... Sample Code: ObjectMapper serializer = new ObjectMapper(); // com.fasterxml.jackson
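Jackson tends to trip over the extra members (the embedded schema, union branches) that Avro bakes into generated classes. One commonly suggested alternative is to let Avro itself produce the JSON through its JsonEncoder. A sketch under the assumption that the generated class is called User (the class name is illustrative, not from the question):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;
import org.apache.avro.specific.SpecificDatumWriter;

public class AvroToJson {
    // User stands in for whatever class avro-tools generated; the name is an assumption.
    public static String toJson(User user) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DatumWriter<User> writer = new SpecificDatumWriter<>(User.class);
        // Encode with the record's own schema so unions and enums are handled by Avro, not Jackson.
        JsonEncoder encoder = EncoderFactory.get().jsonEncoder(User.getClassSchema(), out);
        writer.write(user, encoder);
        encoder.flush();
        return out.toString("UTF-8");
    }
}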

json4s serialization of string to avro specific record class

本小妞迷上赌 submitted on 2019-12-13 05:12:22
Question: I have a JSON string which I am trying to serialize into an Avro specific record (a Scala case class that extends org.apache.avro.specific.SpecificRecordBase). Json4s rightfully throws an exception for malformed JSON in the case of a normal Scala case class, but it does not throw one for the case class that extends the specific record. Instead it tries to populate the object with nulls (I'm not sure whether it's json4s doing this or the specific record). Say I have this JSON: { "name":"Tom", "id":1, "address":{ "houseNum":

Issue Hive AvroSerDe tblProperties max length

亡梦爱人 submitted on 2019-12-13 04:43:47
Question: I am trying to create a table with AvroSerDe. I have already tried the following command to create the table:

CREATE EXTERNAL TABLE gaSession
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.url'='hdfs://<<url>>:<<port>>/<<path>>/<<file>>.avsc');

The creation seems to work, but the following table is

Azure, Java: Read and Unzip a file which is saved in Azure Storage (Blobs) and encoded by Avro

你。 submitted on 2019-12-13 04:02:25
Question: I have a file in Azure Storage which is zipped and then encoded with Avro as a blob. I read and decode it as you can see in the following code:

public static int decodeAvroFile(String avroFile) throws Exception {
    GenericDatumReader<Object> reader = new GenericDatumReader<Object>();
    org.apache.avro.file.FileReader<Object> fileReader = DataFileReader.openReader(new File(avroFile), reader);
    ByteArrayOutputStream os = new ByteArrayOutputStream();
    try {
        Schema schema = fileReader.getSchema();
        DatumWriter
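The excerpt cuts off, but the usual continuation is to iterate the container file, pull the compressed bytes field out of each record, and gunzip it into the output stream. A rough sketch reusing the fileReader and os variables from the snippet above; the record field name "Body" is an assumption, not something stated in the question:

while (fileReader.hasNext()) {
    GenericRecord record = (GenericRecord) fileReader.next();
    ByteBuffer body = (ByteBuffer) record.get("Body");   // hypothetical field name
    byte[] compressed = new byte[body.remaining()];
    body.get(compressed);
    // Inflate the gzip payload and append the plain bytes to the output stream.
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
        byte[] buf = new byte[4096];
        for (int n; (n = gzip.read(buf)) != -1; ) {
            os.write(buf, 0, n);
        }
    }
}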