cloudera-cdh

How to run MapReduce tasks in parallel with Hadoop 2.x?

Submitted by 喜夏-厌秋 on 2021-02-07 19:09:58

Question: I would like my map and reduce tasks to run in parallel, but despite trying every trick in the bag, they are still running sequentially. From How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce, I read that the number of tasks running in parallel can be set with the following formula: min(yarn.nodemanager.resource.memory-mb / mapreduce.[map|reduce].memory.mb, yarn.nodemanager.resource.cpu-vcores / mapreduce.[map|reduce].cpu
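The formula in that excerpt can be sketched as a small calculation. A minimal Java sketch, where the resource figures (8192 MB of node memory, 8 vcores, 2048 MB and 1 vcore per map task) are hypothetical example values, not Hadoop defaults:

```java
// Sketch of the per-node concurrency formula quoted above:
// min(node memory / task memory, node vcores / task vcores).
public class ContainerMath {
    public static int concurrentTasks(long nodeMemMb, long taskMemMb,
                                      int nodeVcores, int taskVcores) {
        // Whichever resource runs out first caps the number of containers.
        long byMemory = nodeMemMb / taskMemMb;
        long byVcores = (long) nodeVcores / taskVcores;
        return (int) Math.min(byMemory, byVcores);
    }

    public static void main(String[] args) {
        // Example: an 8 GB / 8 vcore node, map tasks asking 2 GB and 1 vcore.
        System.out.println(concurrentTasks(8192, 2048, 8, 1)); // 4 (memory-bound)
    }
}
```

If this evaluates to 1 on your nodes, tasks will necessarily run sequentially, which is usually the first thing to check before suspecting the job itself.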

Impala: Show tables like query

Submitted by 南楼画角 on 2021-02-07 14:45:47

Question: I am working with Impala and fetching the list of tables from the database with a pattern like the one below. Assume I have a database bank, and the tables under this database are: cust_profile cust_quarter1_transaction cust_quarter2_transaction product_cust_xyz .... .... etc. Now I am filtering with show tables in bank like '*cust*' and it returns the expected results: the tables that have the word cust in their name. Now my requirement is that I want all the tables which will have cust
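Impala's SHOW TABLES patterns treat * as a match-anything wildcard (and | as alternation), matched against the whole table name. A rough, self-contained Java sketch of that matching, using the table names from the question; the translation of the pattern to a regex is an approximation of Impala's behavior, not Impala code:

```java
import java.util.List;
import java.util.stream.Collectors;

public class TablePattern {
    // Approximate Impala's SHOW TABLES LIKE matching: '*' becomes the
    // regex '.*', and the pattern must match the entire table name.
    static List<String> filter(List<String> tables, String pattern) {
        String regex = pattern.replace("*", ".*");
        return tables.stream()
                     .filter(t -> t.matches(regex))
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tables = List.of("cust_profile", "cust_quarter1_transaction",
                                      "cust_quarter2_transaction", "product_cust_xyz");
        System.out.println(filter(tables, "*cust*")); // all four contain "cust"
        System.out.println(filter(tables, "cust*"));  // only the names starting with "cust"
    }
}
```

The key distinction the sketch surfaces: '*cust*' matches cust anywhere in the name, while 'cust*' anchors the match at the start, which is why product_cust_xyz appears in the first result set but not the second.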

How to use Scala implicit class in Java

Submitted by 一个人想着一个人 on 2021-01-27 04:49:57

Question: I have a Scala implicit class from the RecordService API which I want to use in a Java file. package object spark { implicit class RecordServiceContext(ctx: SparkContext) { def recordServiceTextFile(path: String): RDD[String] = { new RecordServiceRDD(ctx).setPath(path) .map(v => v(0).asInstanceOf[Text].toString) } } } Now I am trying to import this in a Java file using the import below. import com.cloudera.recordservice.spark.*; But I am not able to use recordServiceTextFile("path") from
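Scala implicit conversions are resolved by the Scala compiler, so Java code never sees them; what Java can see is the ordinary class the implicit class compiles down to, which can be constructed explicitly (for a class defined in a package object, the compiled name is mangled, e.g. package$RecordServiceContext). A self-contained Java sketch of the pattern, with hypothetical stand-ins for SparkContext and the RDD result rather than the real RecordService types:

```java
// Stand-in for SparkContext (hypothetical, for illustration only).
class Ctx { }

// What a Scala `implicit class RecordServiceContext(ctx: SparkContext)`
// compiles to, roughly: a plain wrapper class with a one-arg constructor.
class RecordServiceContext {
    private final Ctx ctx;
    RecordServiceContext(Ctx ctx) { this.ctx = ctx; }
    // Stand-in for recordServiceTextFile(path): RDD[String].
    String recordServiceTextFile(String path) { return "rdd:" + path; }
}

public class Demo {
    public static void main(String[] args) {
        // From Java, wrap the context explicitly instead of relying on the
        // implicit conversion the Scala compiler would insert for you.
        String rdd = new RecordServiceContext(new Ctx())
                         .recordServiceTextFile("/data/in");
        System.out.println(rdd); // rdd:/data/in
    }
}
```

In practice the least fragile option is often to add a plain (non-implicit) helper method on the Scala side and call that from Java, avoiding the mangled name entirely.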

webhdfs rest api throwing file not found exception

Submitted by 旧城冷巷雨未停 on 2020-07-24 03:49:29

Question: I am trying to open an HDFS file that is present on a CDH4 cluster, from a CDH5 machine, using WebHDFS from the command line as below: curl -i -L "http://namenodeIpofCDH4:50070/webhdfs/v1/user/quad/source/JSONML.java?user.name=quad&op=OPEN" I am getting a "File Not Found Exception" even though the file JSONML.java is present at the mentioned path on the namenode as well as the datanode, and its trace is as follows: HTTP/1.1 307 TEMPORARY_REDIRECT Cache-Control: no-cache Expires: Thu, 01-Jan-1970 00:00:00 GMT Date:

Is there any alternative for Cloudera Manager? (CDH) [closed]

Submitted by 荒凉一梦 on 2020-07-22 13:03:49

Question: Closed. This question does not meet Stack Overflow guidelines and is not currently accepting answers. Closed 4 days ago. As the official Cloudera blog said, there is no free version of CDH from 6.3.3 onwards; they said they would make Cloudera Manager open source, but have not yet. Is there any other project like Cloudera Manager, one which can manage Hadoop components through a Web UI, especially belongs to

FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. com/yammer/metrics/core/MetricsRegistry

Submitted by 妖精的绣舞 on 2020-04-17 22:12:08

Question: We are facing an issue in Beeline when connecting to an HBase table. We have two HiveServer2 instances, and on one of the nodes we get this error: INFO : Query ID = hive_20190719154444_babd2ce5-4d41-400b-9be5-313acaffc9bf INFO : Total jobs = 1 INFO : Launching Job 1 out of 1 INFO : Starting task [Stage-0:MAPRED] in serial mode INFO : Number of reduce tasks is set to 0 since there's no reduce operator ERROR : FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr