Question
I created the script below in Pig. I am pretty new to Pig and Pig Latin and am still learning how to use Pig scripts efficiently.
Upon executing the script I got this error:
Error ERROR [main] org.apache.pig.tools.grunt.Grunt - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException
Can somebody please explain the reason and how I can correct it? In the CSV file, all columns are character data except the rate column, which holds integer values.
divs = LOAD 'output\file.csv' using PigStorage(',') AS (uniID:chararray, deal:chararray, rol: chararray,name:chararray,add:chararray,city:chararray,stat:chararray,stn:chararray,zip:chararray,country:chararray,db:chararray,sm:chararray,rate:int);
DUMP divs;
trimmed = foreach divs generate sm,uniID,rol,rate,country;
DUMP trimmed;
grpd = group trimmed by sm;
orderd = order trimmed by country;
describe trimmed;
describe grpd;
DUMP grpd;
describe orderd;
avgdiv = foreach grpd generate sm, AVG(divs.rate), SUM(divs.rate), MAX(divs.rate);
DUMP avgdiv;
store avgdiv into 'output/pigdescribe1out';
explain;
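Answer 1:
The problem is in the foreach that follows the group statement. After grouping, grpd has only two fields: group (the sm key) and a bag named trimmed that holds the matching rows. The original script refers to sm and divs.rate there, so the aggregates are not applied to the grouped bag, hence the error. Use group and trimmed.rate instead: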
divs = LOAD '$input' using PigStorage('^A') AS (uniID:chararray, deal:chararray, rol: chararray,name:chararray,add:chararray,city:chararray,stat:chararray,stn:chararray,zip:chararray,country:chararray,db:chararray,sm:chararray,rate:int);
DUMP divs;
trimmed = foreach divs generate sm,uniID,rol,rate,country;
DUMP trimmed;
grpd = group trimmed by sm;
orderd = order trimmed by country;
describe trimmed;
describe grpd;
DUMP grpd;
describe orderd;
avgdiv = foreach grpd generate FLATTEN(group), AVG(trimmed.rate), SUM(trimmed.rate), MAX(trimmed.rate);
DUMP avgdiv;
store avgdiv into 'data/sampledata/';
explain;
This works perfectly fine.
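To make the field naming concrete, here is a minimal sketch of only the load, group, and aggregate steps. It assumes the comma delimiter from the question and the $input parameter from the script above; the output aliases avg_rate, sum_rate, and max_rate are illustrative names that do not appear in the original script. The schema shown in the comment is roughly what describe would be expected to print for these column types.

```pig
-- Assumption: $input points at the comma-separated file from the question.
divs = LOAD '$input' USING PigStorage(',') AS (
    uniID:chararray, deal:chararray, rol:chararray, name:chararray,
    add:chararray, city:chararray, stat:chararray, stn:chararray,
    zip:chararray, country:chararray, db:chararray, sm:chararray, rate:int);

trimmed = FOREACH divs GENERATE sm, uniID, rol, rate, country;

grpd = GROUP trimmed BY sm;

-- grpd has exactly two fields: the key, always named 'group', and a bag
-- named after the grouped relation. Expected schema, approximately:
--   grpd: {group: chararray, trimmed: {(sm: chararray, uniID: chararray,
--          rol: chararray, rate: int, country: chararray)}}
DESCRIBE grpd;

-- Aggregate over the bag 'trimmed' and refer to the key as 'group'.
-- The names after AS are hypothetical labels for the output columns.
avgdiv = FOREACH grpd GENERATE
    group             AS sm,
    AVG(trimmed.rate) AS avg_rate,
    SUM(trimmed.rate) AS sum_rate,
    MAX(trimmed.rate) AS max_rate;

DUMP avgdiv;
```

If this is saved as a script (say, a hypothetical avgdiv.pig), the parameter can be supplied on the command line, for example: pig -param input=output/file.csv avgdiv.pig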
Source: https://stackoverflow.com/questions/12874975/error-main-2997-unable-to-recreate-exception-from-backend-error-org-apache-p