Cloudera

HBase Shell hangs / freezes

Submitted by 喜夏-厌秋 on 2019-12-12 16:05:23

Question: I've installed HBase 0.92.1-cdh4.0.1 on Ubuntu 12.04 in pseudo-distributed mode. hbase-master, hbase-regionserver and zookeeper-server are running on this machine; HDFS is running on another machine (the property hbase.rootdir is set accordingly). Now I have a problem with the HBase shell: whenever I submit a create-table statement such as

    create 'tbl1', {NAME => 'd', COMPRESSION => 'GZ'}

the shell hangs (it never returns and waits forever) and I have to kill it with Ctrl+C. However the …
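A frequent cause of this particular hang (a create with COMPRESSION => 'GZ') is a compression or region-assignment problem on the RegionServer that the shell never surfaces. A minimal diagnostic sketch, assuming a standard CDH log layout; the HDFS scratch path is purely illustrative:

    # Verify the server can actually read/write GZ-compressed HFiles;
    # adjust hdfs://namenode:8020 to match your hbase.rootdir cluster.
    hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://namenode:8020/tmp/gztest gz

    # While the shell is stuck, watch the master log to see where table
    # creation or region assignment stalls (log path assumed for CDH):
    tail -f /var/log/hbase/hbase-hbase-master-*.log

If CompressionTest fails, table creation will stall until the codec issue is fixed; if it passes, the master log usually names the stuck step.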

Spark : check your cluster UI to ensure that workers are registered

Submitted by 被刻印的时光 ゝ on 2019-12-12 07:25:09

Question: I have a simple program in Spark:

    /* SimpleApp.scala */
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf

    object SimpleApp {
      def main(args: Array[String]) {
        val conf = new SparkConf()
          .setMaster("spark://10.250.7.117:7077")
          .setAppName("Simple Application")
          .set("spark.cores.max", "2")
        val sc = new SparkContext(conf)
        val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")
        // first get the …
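The "check your cluster UI to ensure that workers are registered" warning means the driver reached the master but never received executors, usually because no worker is alive or the resource request cannot be satisfied. A hedged first check from the shell, taking the master host from the code above and assuming the standalone web UI is on its default port 8080:

    # Confirm the master UI is up and at least one worker shows as ALIVE:
    curl -s http://10.250.7.117:8080 | grep -i -E 'worker|alive'

    # Confirm the master RPC port is reachable from the driver machine:
    nc -zv 10.250.7.117 7077

If workers are registered, compare spark.cores.max and executor memory against what the UI reports as free; an over-sized request leaves the application waiting forever with exactly this warning.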

How to resolve the "load main class MahoutDriver" error in the Twenty Newsgroups classification example

Submitted by 社会主义新天地 on 2019-12-12 04:49:16

Question: I am trying to run the 20 newsgroups classification example in Mahout. I have set MAHOUT_LOCAL=true; the classifier doesn't display the confusion matrix and gives the following warnings:

    ok. You chose 2 and we'll use naivebayes
    creating work directory at /tmp/mahout-work-cloudera
    + echo 'Preparing 20newsgroups data'
    Preparing 20newsgroups data
    + rm -rf /tmp/mahout-work-cloudera/20news-all
    + mkdir /tmp/mahout-work-cloudera/20news-all
    + cp -R /tmp/mahout-work-cloudera/20news-bydate/20news-bydate …
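An error about loading the main class MahoutDriver usually means the mahout launcher script cannot locate the Mahout job jars on its classpath. A minimal sketch of how to check, where the install path is an assumption for a Cloudera/CDH layout and must be adjusted to the actual system:

    # Point the launcher at the Mahout install (path is an assumption):
    export MAHOUT_HOME=/usr/lib/mahout
    export PATH=$MAHOUT_HOME/bin:$PATH

    # Sanity check: this should print the list of available Mahout programs
    # rather than an error about the MahoutDriver main class.
    mahout

    # Then re-run the bundled example script:
    $MAHOUT_HOME/examples/bin/classify-20newsgroups.sh

Note that with MAHOUT_LOCAL=true everything runs in-process against the local filesystem; unset it if the intent is to run on the cluster.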

Listing MS SQL Server table in OOZIE via SQOOP Action

Submitted by 放肆的年华 on 2019-12-12 04:38:28

Question: I am able to execute the following Sqoop command from the CLI perfectly:

    sqoop list-tables --connect 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db' --connection-manager org.apache.sqoop.manager.SQLServerManager --driver com.microsoft.sqlserver.jdbc.SQLServerDriver -- --schema schma

But I get errors when trying the same in an Oozie (Hue) Sqoop action:

    2055 [main] ERROR org.apache.sqoop.manager.CatalogQueryManager - Failed to list tables
    java.sql.SQLException: No …
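A java.sql.SQLException from CatalogQueryManager inside Oozie, when the same command works on the CLI, typically means the Oozie launcher cannot see the SQL Server JDBC driver that the local Sqoop install has. A hedged sketch of the usual fix, where every HDFS path and the jar name are illustrative and the sharelib directory name contains an install-specific timestamp:

    # Option 1: ship the driver with the workflow itself.
    hdfs dfs -mkdir -p /user/hue/oozie/workspaces/my_workflow/lib
    hdfs dfs -put sqljdbc4.jar /user/hue/oozie/workspaces/my_workflow/lib/

    # Option 2: add it to the Oozie sharelib for Sqoop and refresh.
    hdfs dfs -put sqljdbc4.jar /user/oozie/share/lib/lib_<timestamp>/sqoop/
    oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate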

Hue 500 server error

Submitted by 限于喜欢 on 2019-12-12 04:11:51

Question: I am creating a simple MapReduce job. After submitting it, it gives the error below. Please suggest how to fix this issue.

Answer 1: I know I am too late to answer, but I have noticed that this usually gets solved if you clear your cookies.

Source: https://stackoverflow.com/questions/37207387/hue-500-server-error

Result of hdfs dfs -ls command

Submitted by 只谈情不闲聊 on 2019-12-12 03:48:09

Question: When executing the hdfs dfs -ls command, I would like to know whether the result lists all the files stored in the cluster or just the partitions on the node where the command is executed. I'm a newbie in Hadoop and I'm having some problems searching for the partitions on each node. Thank you.

Answer 1: Question: "...if the result are all the files stored in the cluster or..." What you see from the ls command are all the files stored in the cluster. More specifically, what you see is a bunch of file paths and names. These …
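In other words, ls operates on HDFS's cluster-wide namespace, not on the local node. To see how one file's blocks are physically spread over the datanodes, fsck is the usual tool; a small sketch where the file path is illustrative:

    # Same listing no matter which node it runs from:
    hdfs dfs -ls /user/hduser

    # Show the blocks of one file and which datanodes hold each replica:
    hdfs fsck /user/hduser/somefile.txt -files -blocks -locations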

Error message while copying a file from local to HDFS

Submitted by Deadly on 2019-12-12 03:15:20

Question: I tried to copy a file from local to HDFS using the command:

    hadoop dfs -copyFromLocal in/ /user/hduser/hadoop

The following error message was shown. Please help me find the problem.

    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.
    15/02/02 19:22:23 WARN hdfs.DFSClient: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hduser/hadoop._COPYING_ could only be replicated to 0 nodes instead of …
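"could only be replicated to 0 nodes" means the namenode has no live datanodes (or none with free space) on which to place the block. A hedged first check, with the log path an assumption for a tarball install:

    # How many datanodes does the namenode actually see, and how much
    # capacity do they report?
    hdfs dfsadmin -report

    # If the report shows 0 live datanodes, the datanode log names the cause
    # (commonly a clusterID mismatch after reformatting the namenode):
    tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log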

wordcount not running in Cloudera

Submitted by 主宰稳场 on 2019-12-12 02:14:15

Question: I have installed Cloudera 5.8 on a Linux RHEL 7.2 instance on Amazon EC2. I have logged in with SSH and I am trying to run the wordcount example to test MapReduce operation with the following command:

    hadoop jar /opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar wordcount archivo.txt output

The problem is that the wordcount program blocks and does not produce the output. Only the following is printed:

    16/08/11 13:10:02 INFO client …
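A MapReduce job that prints a few INFO client lines and then sits still is usually waiting for YARN to grant containers, most often because no NodeManager is running or the single node lacks the memory the job requests. A sketch of the usual checks, where the application id is illustrative:

    # Are any NodeManagers registered and RUNNING?
    yarn node -list

    # Is the job stuck in ACCEPTED (queued, no resources) rather than RUNNING?
    yarn application -list
    yarn application -status application_1470920000000_0001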

spark-submit to Cloudera cluster cannot find any dependent jars

Submitted by 北慕城南 on 2019-12-12 02:04:09

Question: I am able to do a spark-submit to my Cloudera cluster. The job dies after a few minutes with exceptions complaining that it cannot find various classes. These are classes that are on the Spark dependency path. I keep adding the jars one at a time using the command-line argument --jars, and the YARN log keeps dumping out the next jar it can't find. What setting allows the Spark/YARN job to find all the dependent jars? I already set the "spark.home" attribute to the correct path: /opt/cloudera/parcels/CDH/lib …
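Rather than listing jars one by one, --jars takes a single comma-separated list, and on YARN the whole Spark dependency set can be shipped via configuration. A hedged sketch, where the jar paths and class name are illustrative, and spark.yarn.jars is the Spark 2.x property (older releases use the singular spark.yarn.jar):

    # One comma-separated list, not repeated flags:
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --jars /opt/libs/dep1.jar,/opt/libs/dep2.jar \
      --class com.example.MyApp \
      myapp.jar

    # Or point executors at the parcel's own Spark jars so nothing
    # needs to be uploaded per job (Spark 2.x):
    #   --conf spark.yarn.jars=local:/opt/cloudera/parcels/CDH/lib/spark/jars/*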

Cloudera - JAVA_HOME not set

Submitted by 假装没事ソ on 2019-12-12 00:20:05

Question: I am pretty novice when it comes to Ubuntu, but I am trying to follow along with the install instructions for Cloudera located here. At step 1, I am getting the following error:

    brock@brock-hpserver:~$ sudo -u hdfs hdfs namenode -format
    Error: JAVA_HOME is not set and could not be found.

However, although I could be wrong, I believe I have everything set up properly:

    brock@brock-hpserver:~$ echo $JAVA_HOME
    /usr/lib/jvm/java-6-openjdk-amd64
    brock@brock-hpserver:~$ echo $PATH
    /usr/lib/lightdm …
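The catch is that sudo starts a fresh environment, so a JAVA_HOME exported in the invoking user's shell is invisible to the hdfs user. Two hedged ways around it, where the hadoop-env.sh path is an assumption for a CDH package install:

    # Pass JAVA_HOME through on the command line:
    sudo -u hdfs JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64 hdfs namenode -format

    # Or set it once for every Hadoop command and daemon:
    echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64' | \
      sudo tee -a /etc/hadoop/conf/hadoop-env.sh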