Question
I downloaded and installed the Cloudera 4.4 VM to play with Hadoop. I already have a cluster on a platform at work, so I know a little about how Hadoop works. I therefore think my problem comes from my misunderstanding of Linux users and groups.
With Hive:
I created a Hive table from the shell, and that works. The table is at /user/hive/warehouse/test, which belongs to user cloudera of group cloudera.
I have some data files (.txt) in HDFS under /user/cloudera (user: cloudera, group: hive) that I load into my Hive table with:
LOAD DATA INPATH '/user/cloudera/*.txt' INTO TABLE test;
This is what I got:
hive> LOAD DATA INPATH '/user/cloudera/jeuDeTest/*.txt' INTO TABLE test;
Loading data to table default.test
chgrp: changing ownership of '/user/hive/warehouse/test/_log24310.txt': User does not belong to hive
chgrp: changing ownership of '/user/hive/warehouse/test/_log24311.txt': User does not belong to hive
Table default.test stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 10161843, raw_data_size: 0]
OK
Time taken: 2.472 seconds
I had never seen this kind of error message before, but the files are moved. If I then try a SELECT *, there is no result.
With HBase:
I also have some difficulties with HBase. I can create a table, but when I use ImportTsv:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
-Dimporttsv.columns=HBASE_ROW_KEY,cf:nl,ch:nt,cf:ti,cf:ip,cf:cr,cf:am,cf:op,cf:mr,cf:ct
'-Dimporttsv.separator=|' testhbase -Dimporttsv.skip.bad.lines=false
/user/cloudera/jeuDeTest/*.txt
I get this error:
ERROR security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:SIMPLE)
cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist:
hdfs://localhost.localdomain:8020/user/cloudera/jeuDeTest/_logGeneral_C_24310_SO.txt
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist:
hdfs://localhost.localdomain:8020/user/cloudera/jeuDeTest/_logGeneral_C_24310_SO.txt
I think these problems are due to permissions, but I don't know how to get the rights to execute these requests, or what the best way to do that is. (On the platform I have at work I am root and I don't have any of these difficulties, but I don't understand how it works.)
Thank you for reading.
Angelik
I tried adding my cloudera user to the group hive. I no longer get the error during the load, but a select still returns no result.
hive> LOAD DATA INPATH '/user/cloudera/jeuDeTest/*.txt' INTO TABLE test;
Loading data to table default.test
Table default.test stats: [num_partitions: 0, num_files: 10, num_rows: 0, total_size: 10161843, raw_data_size: 0]
OK
Time taken: 0.486 seconds
hive> select * from test limit 20;
OK
Time taken: 0.303 seconds
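A quick way to confirm that the group change described in this update is visible on the Linux side could look like the sketch below. The helper name in_group is hypothetical. Note that id with a user name reads the user database, while a session that was opened before the change keeps its old groups until the next login.

```shell
#!/bin/sh
# in_group is a hypothetical helper, not part of any Hadoop or Cloudera tool.
# It succeeds when user $1 is listed as a member of group $2.
in_group() {
    id -Gn "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# User and group names are the ones from the question.
if in_group cloudera hive; then
    echo "cloudera is a member of hive"
else
    echo "cloudera is not a member of hive"
fi
```

On the HDFS side, hdfs dfs -ls /user/hive/warehouse/test lists the owner and group of each loaded file.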
Answer 1:
I had the same issue with permissions -> chgrp: changing ownership of '/user/hive/warehouse/test/_log24310.txt': User does not belong to hive.
- Added the existing user named cloudera to the existing group named hive with the command: usermod -a -G hive cloudera
- Restarted the system
- Ran the LOAD command and then a select * from table_name -> no data was displayed.
- Executed select count(*) from table_name and a MapReduce job was started.
- Executed select * from table_name again and now the results were returned correctly.
- Opened an Impala shell using the impala-shell command.
- Executed a select * from table_name and no results were returned.
- Executed the command invalidate metadata in the impala-shell
- Executed the command refresh table_name
- Executed the command show tables
- Executed the command select * from table_name and now the results are displayed in both the impala-shell and the hive shell.
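The steps above can be replayed from a regular shell roughly as follows. This is only a sketch: TABLE, DRY_RUN, run and replay_steps are illustrative names, sudo is an assumption (the answer does not say how root was obtained), and limit 20 is borrowed from the question. By default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the steps listed above.
# TABLE, DRY_RUN, run and replay_steps are illustrative names (assumptions).
TABLE="${TABLE:-test}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command, 0 = execute it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

replay_steps() {
    # Add the existing user cloudera to the existing group hive.
    # The answer restarted the system after this step.
    run sudo usermod -a -G hive cloudera

    # Hive: count(*) started a MapReduce job; afterwards select * returned rows.
    run hive -e "select count(*) from $TABLE"
    run hive -e "select * from $TABLE limit 20"

    # Impala: reload the metadata, refresh the table, then query it.
    run impala-shell -q "invalidate metadata"
    run impala-shell -q "refresh $TABLE"
    run impala-shell -q "select * from $TABLE limit 20"
}

replay_steps
```

Setting DRY_RUN=0 executes the commands instead of printing them; the restart between the usermod step and the queries still has to be done by hand.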
Source: https://stackoverflow.com/questions/21605960/vm-cloudera-user-cloudera-and-permissions