avro

Problems creating an Avro .avsc schema

☆樱花仙子☆ submitted on 2019-12-11 00:19:43
Question: I'm having trouble creating an Avro schema; below is my schema.

twitter.avsc:

{
  "type" : "record",
  "name" : "twitter_schema",
  "namespace" : "com.miguno.avro",
  "fields" : [
    { "name" : "_id", "type" : "record", "doc" : "Values of the indexes/id tweets" },
    { "name" : "nome", "type" : "string", "doc" : "Name of the user account on Twitter.com" },
    { "name" : "tweet", "type" : "string", "doc" : "The content of the user's Twitter message" },
    { "name" : "datahora", "type" : "string", "doc" : "Unix
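The parse failure is almost certainly the _id field: in Avro, "record" cannot be used as a bare type string inside a field. A complex type must appear as a full inline definition ({"type": "record", "name": ..., "fields": [...]}) or as a reference to a previously defined named type. A minimal sketch that parses, with _id demoted to a string (an assumption, since the intended structure of _id isn't shown, and the truncated doc text is abbreviated here):

{
  "type" : "record",
  "name" : "twitter_schema",
  "namespace" : "com.miguno.avro",
  "fields" : [
    { "name" : "_id", "type" : "string", "doc" : "Values of the indexes/id tweets" },
    { "name" : "nome", "type" : "string", "doc" : "Name of the user account on Twitter.com" },
    { "name" : "tweet", "type" : "string", "doc" : "The content of the user's Twitter message" },
    { "name" : "datahora", "type" : "string", "doc" : "Unix timestamp of the tweet" }
  ]
}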

Sqoop import as Avro data file gives all values as NULL when creating external Avro table in Hive

故事扮演 submitted on 2019-12-10 23:29:17
Question: I am trying to import Oracle DB data into HDFS using a Sqoop free-form import query that joins two tables, with '--as-avrodatafile', scheduled through Oozie. The following is the content of my workflow.xml:

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-freeform-wf">
  <start to="sqoop-freeform-node"/>
  <action name="sqoop-freeform-node">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
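The usual culprit for all-NULL columns in an Avro-backed Hive table is a mismatch between the table's declared column names and the field names in the Avro files Sqoop wrote; Hive silently resolves unmatched fields to NULL. A hedged sketch of an external table that sidesteps the mismatch by taking its schema straight from the .avsc Sqoop generated (table name and paths are hypothetical):

-- External table whose columns come from the generated Avro schema itself,
-- so names are guaranteed to match the data files.
CREATE EXTERNAL TABLE my_table
STORED AS AVRO
LOCATION '/user/hive/warehouse/my_table'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/schemas/my_table.avsc');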

Kafka Avro message deserializer for multiple topics

做~自己de王妃 submitted on 2019-12-10 23:24:46
Question: I am trying to deserialize Kafka messages in Avro format. I am using the following code: https://github.com/ivangfr/springboot-kafka-debezium-ksql/blob/master/kafka-research-consumer/src/main/java/com/mycompany/kafkaresearchconsumer/kafka/ReviewsConsumerConfig.java The above code works fine for me for a single topic, but I have to listen for messages from multiple topics, and I created multiple Avro-generated files. However, I am stuck on the configuration, as the configuration needs multiple Avro type objects. Please consider the below
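One way out (a sketch under assumptions, not the linked project's actual fix): with Confluent's Avro deserializer and specific.avro.reader=true, each topic's records deserialize into the SpecificRecord class registered for that topic's subject, so you can declare one listener container factory per record type. OrderAvro and ReviewAvro, the group id, and the addresses are hypothetical placeholders:

import java.util.HashMap;
import java.util.Map;
import io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig;
import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import io.confluent.kafka.serializers.KafkaAvroDeserializerConfig;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class MultiTopicAvroConsumerConfig {

    // One factory per generated Avro type; each @KafkaListener then points
    // at the factory whose value type matches its topic.
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, OrderAvro> orderListenerFactory() {
        return buildFactory();
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, ReviewAvro> reviewListenerFactory() {
        return buildFactory();
    }

    private <T> ConcurrentKafkaListenerContainerFactory<String, T> buildFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "multi-avro-consumer");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, KafkaAvroDeserializer.class);
        props.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081");
        // Deserialize into the generated SpecificRecord class instead of GenericRecord
        props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, true);
        ConcurrentKafkaListenerContainerFactory<String, T> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(props));
        return factory;
    }
}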

[B cannot be cast to java.nio.ByteBuffer when trying to serialize Avro record

被刻印的时光 ゝ submitted on 2019-12-10 19:23:07
Question: I have written a small Java program that is supposed to monitor a directory for new files and send them in binary Avro format to a Kafka topic. I am new to Avro, and I wrote this using the Avro documentation and online examples. The monitoring part works well, but the program fails at runtime when it gets to the Avro serialization. I get this error stack:

Exception in thread "main" java.lang.ClassCastException: [B cannot be cast to java.nio.ByteBuffer
    at org.apache.avro.generic.GenericDatumWriter
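A short sketch of the usual cause: for a field declared with Avro type "bytes", GenericDatumWriter expects a java.nio.ByteBuffer, so handing it a raw byte[] produces exactly this ClassCastException. The field name "payload" and the surrounding schema are hypothetical stand-ins for the asker's own:

import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

byte[] fileBytes = Files.readAllBytes(Paths.get("/watched/dir/newfile"));
GenericRecord record = new GenericData.Record(schema);
// Wrap the byte[] in a ByteBuffer instead of passing it directly
record.put("payload", ByteBuffer.wrap(fileBytes));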

Read Existing Avro File and Send to Kafka

自闭症网瘾萝莉.ら submitted on 2019-12-10 12:25:11
Question: I have an existing Avro file with a schema. I need to send the file to a producer. The following is the code I have written:

public class ProducerDataSample {
    public static void main(String[] args) {
        String topic = "my-topic";
        Schema.Parser parser = new Schema.Parser();
        Schema schema = parser.parse(AvroSchemaDefinitionLoader.fromFile("encounter.avsc").get());
        File file = new File("/home/hello.avro");
        try {
            ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
            DatumWriter<GenericRecord>
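For completeness, one hedged alternative (not necessarily what the asker needs): instead of serializing the whole file into a single message, open the Avro container file, which already carries its own schema, and publish each record individually. This sketch assumes Confluent's KafkaAvroSerializer and a schema registry at localhost:8081, neither of which appears in the question:

import java.io.File;
import java.util.Properties;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AvroFileToKafka {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        // The container file embeds its schema, so no .avsc is needed here
        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props);
             DataFileReader<GenericRecord> reader =
                     new DataFileReader<>(new File("/home/hello.avro"), new GenericDatumReader<>())) {
            while (reader.hasNext()) {
                producer.send(new ProducerRecord<>("my-topic", reader.next())); // one message per record
            }
        }
    }
}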

Infinite recursion in createDataFrame for avro types

社会主义新天地 submitted on 2019-12-10 09:27:51
Question: I'm getting a StackOverflowError from inside the createDataFrame call in this example. It originates in Scala code involving Java type inference, which calls itself in an infinite loop.

final EventParser parser = new EventParser();
JavaRDD<Event> eventRDD = sc.textFile(path)
    .map(new Function<String, Event>() {
        public Event call(String line) throws Exception {
            Event event = parser.parse(line);
            log.info("event: " + event.toString());
            return event;
        }
    });
log.info("eventRDD:" + eventRDD
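A hedged workaround, where the stated root cause is an educated guess: bean-style createDataFrame walks every getter, and Avro-generated classes carry getters such as getSchema() whose types refer back to themselves, which can send the type inferencer into a cycle. Copying the Avro objects into a flat POJO first gives Spark only simple properties to inspect. EventBean and the Event getters below are hypothetical:

// Hypothetical flat copy of the Avro-generated Event class
public class EventBean implements java.io.Serializable {
    private String id;
    private long timestamp;
    // Plain getters/setters only; nothing for Spark's bean inference to recurse into
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public long getTimestamp() { return timestamp; }
    public void setTimestamp(long timestamp) { this.timestamp = timestamp; }
}

JavaRDD<EventBean> beans = eventRDD.map(e -> {
    EventBean b = new EventBean();
    b.setId(e.getId());                 // hypothetical getters on the Avro class
    b.setTimestamp(e.getTimestamp());
    return b;
});
sqlContext.createDataFrame(beans, EventBean.class).show();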

Getting Started with Avro

落花浮王杯 submitted on 2019-12-08 22:58:53
Question: I want to get started with using Avro with MapReduce. Can someone suggest a good tutorial/example to get started with? I couldn't find much through internet searches.

Answer 1: I recently did a project that was heavily based on Avro data, and not having used this data format before, I had to start from scratch. You are right in that it is rather hard to get much help from online sources when getting started with Avro. The material that I would recommend to you is: By far, the most helpful
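Since the answer is cut off before its recommendations, a minimal round trip through Avro's core API may help orient a first-time user; the schema, file name, and values here are hypothetical, not from the thread:

import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

Schema schema = new Schema.Parser().parse(
    "{\"type\":\"record\",\"name\":\"User\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}");

GenericRecord user = new GenericData.Record(schema);
user.put("name", "alice");

File file = new File("users.avro");
// Write a container file (schema is embedded alongside the data)
try (DataFileWriter<GenericRecord> writer =
         new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
    writer.create(schema, file);
    writer.append(user);
}
// Read it back using the embedded schema
try (DataFileReader<GenericRecord> reader =
         new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
    while (reader.hasNext()) {
        System.out.println(reader.next()); // {"name": "alice"}
    }
}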

Spark Streaming From Kafka and Write to HDFS in Avro Format

冷暖自知 submitted on 2019-12-08 11:44:46
Question: I basically want to consume data from Kafka and write it to HDFS. But what happens is that it does not write any files to HDFS; it creates empty files. Please also guide me on how to modify the code if I want to write to HDFS in Avro format. For the sake of simplicity, I am writing to the local C drive.

import org.apache.spark.SparkConf
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkContext
import org.apache.spark.streaming.Seconds
import org.apache
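On the Avro part, a sketch of one route (assuming the external spark-avro package, which the question's imports don't include): convert each micro-batch to a DataFrame and let spark-avro write it. The empty files typically come from empty micro-batches, which is why the rdd.isEmpty guard matters; the output path and column name are hypothetical:

import com.databricks.spark.avro._
import org.apache.spark.sql.SparkSession

stream.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {  // empty micro-batches are what leave zero-byte files behind
    val spark = SparkSession.builder.config(rdd.sparkContext.getConf).getOrCreate()
    import spark.implicits._
    rdd.map(_.value())                      // payload of each ConsumerRecord
       .toDF("value")
       .write
       .mode("append")
       .avro("hdfs:///user/test/avro-out")  // writer syntax from com.databricks.spark.avro._
  }
}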

Avro write and read works on one machine but not on another

▼魔方 西西 submitted on 2019-12-08 09:38:37
Question: Here is some Avro code that runs on one machine but fails on the other with an exception. We are not able to figure out what's wrong here. Here is the code that is causing the problem:

Class<?> clazz = obj.getClass();
ReflectData rdata = ReflectData.AllowNull.get();
Schema schema = rdata.getSchema(clazz);
ByteArrayOutputStream os = new ByteArrayOutputStream();
Encoder encoder = EncoderFactory.get().binaryEncoder(os, null);
DatumWriter<T> writer = new ReflectDatumWriter<T>(schema, rdata);
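Hard to diagnose from this fragment, but one known portability trap with ReflectData is worth naming (as an assumption, not a confirmed diagnosis): reflection does not guarantee field order, so the schema generated on one JVM can differ from another's, and raw binaryEncoder output carries no schema to reconcile the two sides. Writing through the Avro container format embeds the writer's schema with the bytes, so the reader always decodes with the exact schema that produced them:

import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;

ReflectData rdata = ReflectData.AllowNull.get();
Schema schema = rdata.getSchema(obj.getClass());
ByteArrayOutputStream os = new ByteArrayOutputStream();
try (DataFileWriter<Object> fileWriter =
         new DataFileWriter<>(new ReflectDatumWriter<>(schema, rdata))) {
    fileWriter.create(schema, os); // the schema travels with the data
    fileWriter.append(obj);
}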

What is the proper way to declare a simple Timestamp in Avro

一曲冷凌霜 submitted on 2019-12-08 08:51:28
Question: How can we declare a simple timestamp in Avro, please? type:timestamp doesn't work, so I currently use a plain string, but I want it as a timestamp. (This is my variable: 27/01/1999 08:45:34.) Thank you.

Answer 1: Use Avro's logical type:

{"name": "timestamp", "type": {"type": "string", "logicalType": "timestamp-millis"}}

A few useful links: Avro timestamp-millis, Avro logical types, and a Hortonworks community question about Avro timestamp.

Source: https://stackoverflow.com/questions/55607244/what-is-the-proper-way-to
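One caveat worth adding to that answer: the Avro specification defines timestamp-millis as a logical type that annotates long (milliseconds since the Unix epoch), not string, so implementations will ignore the annotation above. A spec-conformant declaration looks like this, with the field name kept from the answer; a value such as 27/01/1999 08:45:34 would have to be parsed to epoch milliseconds before writing:

{ "name": "timestamp", "type": { "type": "long", "logicalType": "timestamp-millis" } }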