Cloudera

HBase Shell hangs / freezes

Submitted by 喜夏-厌秋 on 2019-12-12 16:05:23

Question: I've installed HBase 0.92.1-cdh4.0.1 on Ubuntu 12.04 in pseudo-distributed mode. hbase-master, hbase-regionserver and zookeeper-server are running on this machine; HDFS is running on another machine (the property hbase.rootdir is set accordingly). Now I have a problem with the HBase shell: whenever I submit a create-table statement such as

    create 'tbl1', {NAME => 'd', COMPRESSION => 'GZ'}

the shell hangs (it never returns and waits forever) and I have to kill it with Ctrl+C. However the …
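A frequent cause of this particular hang (a create with COMPRESSION => 'GZ') is a compression or region-assignment problem on the RegionServer that the shell never surfaces. A minimal diagnostic sketch, assuming a standard CDH log layout; the HDFS scratch path is purely illustrative:

    # Verify the server can actually read/write GZ-compressed HFiles;
    # adjust hdfs://namenode:8020 to match your hbase.rootdir cluster.
    hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://namenode:8020/tmp/gztest gz

    # While the shell is stuck, watch the master log to see where table
    # creation or region assignment stalls (log path assumed for CDH):
    tail -f /var/log/hbase/hbase-hbase-master-*.log

If CompressionTest fails, table creation will stall until the codec issue is fixed; if it passes, the master log usually names the stuck step.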

Spark : check your cluster UI to ensure that workers are registered

Submitted by 被刻印的时光 ゝ on 2019-12-12 07:25:09

Question: I have a simple program in Spark:

    /* SimpleApp.scala */
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf

    object SimpleApp {
      def main(args: Array[String]) {
        val conf = new SparkConf()
          .setMaster("spark://10.250.7.117:7077")
          .setAppName("Simple Application")
          .set("spark.cores.max", "2")
        val sc = new SparkContext(conf)
        val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")
        // first get the …
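The "check your cluster UI to ensure that workers are registered" warning means the driver reached the master but never received executors, usually because no worker is alive or the resource request cannot be satisfied. A hedged first check from the shell, taking the master host from the code above and assuming the standalone web UI is on its default port 8080:

    # Confirm the master UI is up and at least one worker shows as ALIVE:
    curl -s http://10.250.7.117:8080 | grep -i -E 'worker|alive'

    # Confirm the master RPC port is reachable from the driver machine:
    nc -zv 10.250.7.117 7077

If workers are registered, compare spark.cores.max and executor memory against what the UI reports as free; an over-sized request leaves the application waiting forever with exactly this warning.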

How to resolve the "load main class MahoutDriver" error in the Twenty Newsgroups classification example

Submitted by 社会主义新天地 on 2019-12-12 04:49:16

Question: I am trying to run the 20 newsgroups classification example in Mahout. I have set MAHOUT_LOCAL=true; the classifier doesn't display the confusion matrix and gives the following warnings:

    ok. You chose 2 and we'll use naivebayes
    creating work directory at /tmp/mahout-work-cloudera
    + echo 'Preparing 20newsgroups data'
    Preparing 20newsgroups data
    + rm -rf /tmp/mahout-work-cloudera/20news-all
    + mkdir /tmp/mahout-work-cloudera/20news-all
    + cp -R /tmp/mahout-work-cloudera/20news-bydate/20news-bydate …
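An error about loading the main class MahoutDriver usually means the mahout launcher script cannot locate the Mahout job jars on its classpath. A minimal sketch of how to check, where the install path is an assumption for a Cloudera/CDH layout and must be adjusted to the actual system:

    # Point the launcher at the Mahout install (path is an assumption):
    export MAHOUT_HOME=/usr/lib/mahout
    export PATH=$MAHOUT_HOME/bin:$PATH

    # Sanity check: this should print the list of available Mahout programs
    # rather than an error about the MahoutDriver main class.
    mahout

    # Then re-run the bundled example script:
    $MAHOUT_HOME/examples/bin/classify-20newsgroups.sh

Note that with MAHOUT_LOCAL=true everything runs in-process against the local filesystem; unset it if the intent is to run on the cluster.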

Listing MS SQL Server table in OOZIE via SQOOP Action

Submitted by 放肆的年华 on 2019-12-12 04:38:28

Question: I am able to execute the following Sqoop command from the CLI perfectly:

    sqoop list-tables --connect 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db' --connection-manager org.apache.sqoop.manager.SQLServerManager --driver com.microsoft.sqlserver.jdbc.SQLServerDriver -- --schema schma

But I get errors when trying the same in an Oozie (Hue) Sqoop action:

    2055 [main] ERROR org.apache.sqoop.manager.CatalogQueryManager - Failed to list tables
    java.sql.SQLException: No …
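A java.sql.SQLException from CatalogQueryManager inside Oozie, when the same command works on the CLI, typically means the Oozie launcher cannot see the SQL Server JDBC driver that the local Sqoop install has. A hedged sketch of the usual fix, where every HDFS path and the jar name are illustrative and the sharelib directory name contains an install-specific timestamp:

    # Option 1: ship the driver with the workflow itself.
    hdfs dfs -mkdir -p /user/hue/oozie/workspaces/my_workflow/lib
    hdfs dfs -put sqljdbc4.jar /user/hue/oozie/workspaces/my_workflow/lib/

    # Option 2: add it to the Oozie sharelib for Sqoop and refresh.
    hdfs dfs -put sqljdbc4.jar /user/oozie/share/lib/lib_<timestamp>/sqoop/
    oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate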

Hue 500 server error

Submitted by 限于喜欢 on 2019-12-12 04:11:51

Question: I am creating a simple MapReduce job. After submitting it, it gives the error below. Please suggest how to fix this issue.

Answer 1: I know I am too late to answer, but I have noticed that this usually gets solved if you clear your cookies.

Source: https://stackoverflow.com/questions/37207387/hue-500-server-error

Result of hdfs dfs -ls command

Submitted by 只谈情不闲聊 on 2019-12-12 03:48:09

Question: When executing the hdfs dfs -ls command, I would like to know whether the result lists all the files stored in the cluster or just the partitions on the node where the command is executed. I'm a newbie in Hadoop and I'm having some problems searching for the partitions on each node. Thank you.

Answer 1: Question: "...if the result are all the files stored in the cluster or..." What you see from the ls command are all the files stored in the cluster. More specifically, what you see is a bunch of file paths and names. These …
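In other words, ls operates on HDFS's cluster-wide namespace, not on the local node. To see how one file's blocks are physically spread over the datanodes, fsck is the usual tool; a small sketch where the file path is illustrative:

    # Same listing no matter which node it runs from:
    hdfs dfs -ls /user/hduser

    # Show the blocks of one file and which datanodes hold each replica:
    hdfs fsck /user/hduser/somefile.txt -files -blocks -locations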

Error message while copying a file from local to HDFS

Submitted by Deadly on 2019-12-12 03:15:20

Question: I tried to copy a file from local to HDFS using the command:

    hadoop dfs -copyFromLocal in/ /user/hduser/hadoop

The following error message was shown. Please help me find the problem.

    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.
    15/02/02 19:22:23 WARN hdfs.DFSClient: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hduser/hadoop._COPYING_ could only be replicated to 0 nodes instead of …
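"could only be replicated to 0 nodes" means the namenode has no live datanodes (or none with free space) on which to place the block. A hedged first check, with the log path an assumption for a tarball install:

    # How many datanodes does the namenode actually see, and how much
    # capacity do they report?
    hdfs dfsadmin -report

    # If the report shows 0 live datanodes, the datanode log names the cause
    # (commonly a clusterID mismatch after reformatting the namenode):
    tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log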

wordcount not running in Cloudera

Submitted by 主宰稳场 on 2019-12-12 02:14:15

Question: I have installed Cloudera 5.8 on a Linux RHEL 7.2 instance on Amazon EC2. I have logged in with SSH and I am trying to run the wordcount example to test MapReduce operation with the following command:

    hadoop jar /opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount archivo.txt output

The problem is that the wordcount program blocks and does not produce the output. Only the following is printed:

    16/08/11 13:10:02 INFO client …
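A MapReduce job that prints a few INFO client lines and then sits still is usually waiting for YARN to grant containers, most often because no NodeManager is running or the single node lacks the memory the job requests. A sketch of the usual checks, where the application id is illustrative:

    # Are any NodeManagers registered and RUNNING?
    yarn node -list

    # Is the job stuck in ACCEPTED (queued, no resources) rather than RUNNING?
    yarn application -list
    yarn application -status application_1470920000000_0001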

spark-submit to Cloudera cluster cannot find any dependent jars

Submitted by 北慕城南 on 2019-12-12 02:04:09

Question: I am able to do a spark-submit to my Cloudera cluster. The job dies after a few minutes with exceptions complaining that it cannot find various classes. These are classes that are on the Spark dependency path. I keep adding the jars one at a time using the command-line argument --jars, and the YARN log keeps dumping out the next jar it can't find. What setting allows the Spark/YARN job to find all the dependent jars? I already set the "spark.home" attribute to the correct path: /opt/cloudera/parcels/CDH/lib …
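Rather than listing jars one by one, --jars takes a single comma-separated list, and on YARN the whole Spark dependency set can be shipped via configuration. A hedged sketch, where the jar paths and class name are illustrative, and spark.yarn.jars is the Spark 2.x property (older releases use the singular spark.yarn.jar):

    # One comma-separated list, not repeated flags:
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --jars /opt/libs/dep1.jar,/opt/libs/dep2.jar \
      --class com.example.MyApp \
      myapp.jar

    # Or point executors at the parcel's own Spark jars so nothing
    # needs to be uploaded per job (Spark 2.x):
    #   --conf spark.yarn.jars=local:/opt/cloudera/parcels/CDH/lib/spark/jars/*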

Cloudera - JAVA_HOME not set

Submitted by 假装没事ソ on 2019-12-12 00:20:05

Question: I am pretty novice when it comes to Ubuntu, but I am trying to follow along with the install instructions for Cloudera located here. At step 1, I am getting the following error:

    brock@brock-hpserver:~$ sudo -u hdfs hdfs namenode -format
    Error: JAVA_HOME is not set and could not be found.

However, although I could be wrong, I believe I have everything set up properly:

    brock@brock-hpserver:~$ echo $JAVA_HOME
    /usr/lib/jvm/java-6-openjdk-amd64
    brock@brock-hpserver:~$ echo $PATH
    /usr/lib/lightdm …
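The catch is that sudo starts a fresh environment, so a JAVA_HOME exported in the invoking user's shell is invisible to the hdfs user. Two hedged ways around it, where the hadoop-env.sh path is an assumption for a CDH package install:

    # Pass JAVA_HOME through on the command line:
    sudo -u hdfs JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64 hdfs namenode -format

    # Or set it once for every Hadoop command and daemon:
    echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64' | \
      sudo tee -a /etc/hadoop/conf/hadoop-env.sh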