elephantbird

ElephantBird ERROR 1070: --> class not getting read

江枫思渺然 posted on 2019-12-25 08:28:17
Question: My problem is similar to this unanswered question: [https://stackoverflow.com/questions/42140344/elephantbird-dependency-jars][1]. I have registered all the JARs that elephantbird requires in order to function. REGISTER '/MyJARS/elephant-bird-hadoop-compat-4.1 REGISTER '/MyJARS/json-simple-1.1.jar'; REGISTER '/MyJARS/elephant-bird-pig-4.1.jar'; REGISTER '/MyJARS/elephant-bird-core-4.10.jar'; REGISTER '/MyJARS/google-collections-1.0.jar'; The following links give me this information: 1: Loading data from HDFS does

elephantbird registered still showing error 2998

a 夏天 posted on 2019-12-24 01:57:05
Question: grunt> register '/home/piyush/Desktop/pro/json-simple-1.1.1.jar' grunt> register '/home/piyush/Desktop/pro/elephant-bird-pig-4.1.jar' grunt> register '/home/piyush/Desktop/pro/elephant-bird-hadoop-compat-4.1.jar' grunt> register '/home/piyush/Desktop/pro/elephant-bird-core-4.1.jar' grunt> load_tweets = LOAD '/home/piyush/Desktop/pro/quattr.txt' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS myMap; 2017-01-26 07:16:29,631 [main] ERROR org.apache.pig.tools.grunt.Grunt -

ElephantBird package build failure:

不羁岁月 posted on 2019-12-23 02:05:21
Question: I downloaded the ElephantBird source and tried to build it by running "mvn package", but I get the following error: [ERROR] Failed to execute goal com.github.igor-petruk.protobuf:protobuf-maven-plugin:0.4:run (default) on project elephant-bird-core: Unable to find 'protoc' -> [Help 1] I am using mvn version 3.0.3, and I tried on both Mac and Ubuntu but got the same error. EDIT1: Thanks to Lorand's comments, I resolved the above problem by upgrading Protocol Buffers. I also installed Thrift
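
The "Unable to find 'protoc'" message above suggests that the build could not locate the Protocol Buffers compiler. As a minimal sketch (not part of the original question), the following Python snippet checks whether an executable is visible on PATH before a build is attempted. The helper names `find_tool` and `tool_version` are hypothetical, and which protoc version ElephantBird expects is not stated in the excerpt, so none is assumed here.

```python
import shutil
import subprocess


def find_tool(name):
    """Return the absolute path of an executable found on PATH, or None."""
    return shutil.which(name)


def tool_version(name):
    """Return the first line printed by `<name> --version`, or None if the
    tool is not on PATH. Hypothetical helper for illustration only."""
    path = find_tool(name)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some tools print their version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""
```

Calling `find_tool("protoc")` returns `None` when the compiler is not on PATH, which is the situation the Maven plugin error points to.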

Cannot query example AddressBook protobuf data in hive with elephant-bird

一笑奈何 posted on 2019-12-13 19:15:58
Question: I'm trying to use elephant bird to query some example protobuf data. I'm using the AddressBook example; I serialized a handful of fake AddressBooks into files and put them in HDFS under /user/foo/data/elephant-bird/addressbooks/. The query returns no results. I set up the table and query like so: add jar /home/foo/downloads/elephant-bird/hadoop-compat/target/elephant-bird-hadoop-compat-4.6-SNAPSHOT.jar; add jar /home/foo/downloads/elephant-bird/core/target/elephant-bird-core-4.6-SNAPSHOT.jar

Pig: Create json file with actual key_name and values

折月煮酒 posted on 2019-12-13 07:29:30
Question: I have a Pig script that uses the elephant bird JSON loader. data_input = LOAD '$DATA_INPUT' USING com.twitter.elephantbird.pig.load.JsonLoader() AS (json:map []); x = FOREACH data_input GENERATE json#'user__id_str', json#'user__created_at', json#'user__notifications', json#'user__follow_request_sent', json#'user__friends_count', json#'user__name', json#'user__time_zone', json#'user__profile_background_color', json#'user__is_translation_enabled', json#'user__profile_link_color', json#'user__utc

How to load a file with a JSON array per line in Pig Latin

霸气de小男生 posted on 2019-12-11 21:45:12
Question: An existing script creates text files with an array of JSON objects per line, e.g., [{"foo":1,"bar":2},{"foo":3,"bar":4}] [{"foo":5,"bar":6},{"foo":7,"bar":8},{"foo":9,"bar":0}] … I would like to load this data in Pig, exploding the arrays and processing each individual object. I have looked at using the JsonLoader in Twitter’s Elephant Bird, to no avail. It doesn’t complain about the JSON, but I get “Successfully read 0 records” when running the following: register '/tmp/elephant-bird/core
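
To make the desired result concrete, here is a minimal Python sketch of the "explode" step the question describes: each input line holds one JSON array, and every object in that array becomes its own record. This is an illustration of the intended transformation only, not a Pig or Elephant Bird solution; the function name `explode_json_array_lines` is hypothetical, and the sample lines are the two shown in the question.

```python
import json


def explode_json_array_lines(lines):
    """Yield every object from lines that each contain one JSON array."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parsed = json.loads(line)
        if not isinstance(parsed, list):
            raise ValueError("expected a JSON array on each line")
        for item in parsed:
            yield item


# The two sample lines from the question.
sample = [
    '[{"foo":1,"bar":2},{"foo":3,"bar":4}]',
    '[{"foo":5,"bar":6},{"foo":7,"bar":8},{"foo":9,"bar":0}]',
]
records = list(explode_json_array_lines(sample))
```

With the sample input, `records` holds five dictionaries, one per object, in their original order.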

How do I split in Pig a tuple of many maps into different rows

喜夏-厌秋 posted on 2019-12-10 23:41:40
Question: I have a relation in Pig that looks like this: ([account_id#100, timestamp#1434, id#900], [account_id#100, timestamp#1434, id#901], [account_id#100, timestamp#1434, id#902]) As you can see, I have three map objects within a tuple. All of the data above is within the $0 field of the relation, so it sits in a relation with a single bytearray column. The data is loaded as follows: data = load 's3://data/data' using com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad'); DESCRIBE
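
The target shape (one row per map) can be sketched in Python as follows. This only illustrates the expected output and is not Pig code; the function name `maps_to_rows` and the column order are assumptions, and the sample values are the ones shown in the question.

```python
def maps_to_rows(record, columns=("account_id", "timestamp", "id")):
    """Turn one tuple of dicts into a list of rows, one row per dict."""
    return [tuple(m[c] for c in columns) for m in record]


# The single tuple of three maps from the question, written as Python dicts.
record = (
    {"account_id": 100, "timestamp": 1434, "id": 900},
    {"account_id": 100, "timestamp": 1434, "id": 901},
    {"account_id": 100, "timestamp": 1434, "id": 902},
)
rows = maps_to_rows(record)
```

Here `rows` contains three tuples, each carrying the `account_id`, `timestamp`, and `id` of one map.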

Json parse with elephantbird in Pig

余生颓废 posted on 2019-12-10 20:30:10
Question: I can't get the following data to parse in Pig. It's what the Twitter API returns after getting all tweets from a certain user. Source data (I removed some numbers so as not to invade anyone's privacy by accident): [{"created_at":"Sat Nov 01 23:15:45 +0000 2014","id":5286804225,"id_str":"5286864225","text":"@Beace_ your nan makes me laugh with some of the things she comes out with","source":"\u003ca href=\"http:\/\/twitter.com\/download\/iphone\" rel=\"nofollow\"\u003eTwitter for iPhone\u003c\/a

Elephant-bird mvn package error

只愿长相守 posted on 2019-12-07 04:26:42
Question: I have installed Hadoop 2.2 on my system and want to use the Elephant-Bird jar. I am getting the following error while running "mvn package". Error: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project elephant-bird-core: Compilation failure: Compilation failure: [ERROR] /usr/lib/hadoop/elephant_bird/core/target/generated-sources/thrift/com/twitter/elephantbird/thrift/test/TestListInList.java: [9,39] error: package org.apache.commons