Cloudera

Invalid URI for NameNode address

Submitted by 雨燕双飞 on 2019-12-07 03:24:04
Question: I'm trying to set up a Cloudera Hadoop cluster, with a master node containing the namenode, secondarynamenode and jobtracker, and two other nodes containing the datanode and tasktracker. The Cloudera version is 4.6, the OS is Ubuntu precise x64, and the cluster is being created from an AWS instance. Passwordless SSH has been set up, and the Java installation is Oracle 7. Whenever I execute sudo service hadoop-hdfs-namenode start I get: 2014-05-14 05:08:38,023 FATAL org.apache.hadoop.hdfs
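This FATAL is typically raised when fs.defaultFS (fs.default.name in older configs) is unset or malformed, so the NameNode cannot parse its own RPC address. A minimal core-site.xml sketch; the hostname `master` and port 8020 are assumptions, not values from the question:

```xml
<configuration>
  <property>
    <!-- Must be a full URI: scheme, host, and port. Host/port here are examples. -->
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
</configuration>
```

A bare host:port without the hdfs:// scheme, or a stale hostname after an AWS instance restart changes its DNS name, produces exactly this kind of "Invalid URI for NameNode address" failure.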

Hadoop

Submitted by 跟風遠走 on 2019-12-07 00:37:22
Links: 喵了个咪's blog: w-blog.cn; Cloudera website: https://www.cloudera.com/; official documentation: https://www.cloudera.com/documentation/enterprise/latest.html

1. Monitoring: The default monitoring can be seen on the management page. Clicking into a specific component shows the monitoring metrics that correspond to it.

2. Custom monitoring: Dashboards can be found under the Charts entry at the top of the management menu. On that page dashboards can be added and imported, and by dragging charts around you can build your own monitoring wall. To add more monitoring metrics, click "Add from Chart Builder" in the action menu. Charts can be searched with simple query statements, and both the chart type and the metrics can be freely customized.

Source: oschina Link: https://my.oschina.net/u/2394822/blog/1942314
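The "query statements" accepted by the Chart Builder are Cloudera Manager tsquery expressions. A small example of the general shape (the metric and role names are illustrative, not taken from the post):

```
SELECT cpu_user_rate, cpu_system_rate WHERE roleType = DATANODE
```

A query like this plots the selected metrics for every role instance matching the predicate, which is the building block for the custom dashboards described above.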

Integrating the tez-0.9.1 execution engine with CDH 6.0.1

Submitted by 此生再无相见时 on 2019-12-06 19:24:59
Reference articles:
- https://www.jianshu.com/p/9fb9f32e1f0f
- https://www.baidu.com/link?url=OgpwasnZi7H1dySN2T111sseEWDBaCCTC3DFV61G7756YbrkJCA8Y3UFaueyqnfN&wd=&eqid=daeb8b3500049cf3000000065d82fcbc
- http://tez.apache.org/releases/apache-tez-0-9-1.html

Preparation:
- Hadoop version: 3.0.0-cdh6.0.1
- Hive version: hive-2.1.1
- Linux environment: jdk1.8, maven-3.6, protobuf-2.5.0.tar.gz (https://github.com/protocolbuffers/protobuf/releases), tez-0.9.1 source (http://www.apache.org/dyn/closer.lua/tez/0.9.1/)
- Windows environment: jdk1.8, maven-3.3.9, protoc-2.5.0-win32.zip (https://github.com/protocolbuffers/protobuf/releases), tez-0.9.1 source (http://www.apache.org/dyn
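Once Tez is built, integrating it with a CDH cluster usually comes down to publishing the Tez tarball on HDFS and pointing Hive at the new engine. A sketch of the two key settings; the HDFS path is an assumption, adjust it to wherever you upload the tarball:

```xml
<!-- tez-site.xml -->
<property>
  <name>tez.lib.uris</name>
  <value>${fs.defaultFS}/apps/tez/tez-0.9.1.tar.gz</value>
</property>

<!-- hive-site.xml -->
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
```

tez.lib.uris tells each container where to fetch the Tez runtime from; hive.execution.engine switches Hive's query plans from MapReduce to Tez DAGs.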

Installing Cloudera Manager 5.15.1 on Ubuntu 16.04

Submitted by 旧时模样 on 2019-12-06 16:35:00
https://www.jianshu.com/p/012739e132f6

Three errors encountered:
1. The JDK was installed under the /usr directory.
2. Starting the agent fails with /lib/x86_64-linux-gnu/libcrypto.so.1.0.0: version `OPENSSL_1.0.2' not found. Upgrade openssl to 1.0.2 (https://blog.csdn.net/swjtu100/article/details/52093261), locate libcrypto.so.1.0.2, copy it into /lib/x86_64-linux-gnu, and rename the file to libcrypto.so.1.0.0.
3. An error is reported when creating the scm tables (https://blog.csdn.net/chenli195/article/details/71404355?utm_source=blogxgwz1).

Source: https://www.cnblogs.com/grow1016/p/11994624.html

hadoop -libjars and ClassNotFoundException

Submitted by 浪子不回头ぞ on 2019-12-06 13:08:50
Question: Please help, I'm stuck. Here is the command I use to run the job: hadoop jar mrjob.jar ru.package.Main -files hdfs://0.0.0.0:8020/MyCatalog/jars/metadata.csv -libjars hdfs://0.0.0.0:8020/MyCatalog/jars/opencsv.jar,hdfs://0.0.0.0:8020/MyCatalog/jars/gson.jar,hdfs://0.0.0.0:8020/MyCatalog/jars/my-utils.jar /MyCatalog/http_requests.seq-r-00000 /MyCatalog/output/result_file I get these WARNs: 12/10/26 18:35:50 WARN util.GenericOptionsParser: The libjars file hdfs://0.0.0.0:8020/MyCatalog/jars/opencsv.jar is
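The WARN is GenericOptionsParser telling you it ignores -libjars entries that are not on the local filesystem: -files and -libjars expect local paths, and the framework itself ships them to the cluster. A hedged rewrite of the command above (the local /opt/jars paths are assumptions, not from the question):

```
hadoop jar mrjob.jar ru.package.Main \
  -files /opt/jars/metadata.csv \
  -libjars /opt/jars/opencsv.jar,/opt/jars/gson.jar,/opt/jars/my-utils.jar \
  /MyCatalog/http_requests.seq-r-00000 /MyCatalog/output/result_file
```

Note also that -files/-libjars are only honored when the main class runs through ToolRunner/GenericOptionsParser; a plain main() that ignores the parsed configuration will still throw ClassNotFoundException.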

Get Line number in map method using FileInputFormat

Submitted by ∥☆過路亽.° on 2019-12-06 10:24:40
I was wondering whether it is possible to get the line number in my map method. My input file is just a single column of values, like: Apple Orange Banana. Is it possible to get Key: 1, Value: Apple; Key: 2, Value: Orange ... in my map method? I'm using CDH3/CDH4. Changing the input data so as to use KeyValueInputFormat is not an option. Thanks in advance. The default behaviour of InputFormats such as TextInputFormat is to give the byte offset of the record rather than the actual line number; this is mainly because the true line number cannot be determined when an input file is splittable and being
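The keys TextInputFormat would emit for the three-line sample above can be sketched locally with a one-liner (an illustration of the offset arithmetic, not Hadoop itself):

```shell
# Each key is the byte offset of the line start (lengths of previous lines
# plus their newlines), not a 1,2,3 line number.
printf 'Apple\nOrange\nBanana\n' |
  awk 'BEGIN { offset = 0 } { print offset "\t" $0; offset += length($0) + 1 }'
```

This yields keys 0, 6 and 13: each split can compute offsets from its own start position without reading the rest of the file, which is exactly why line numbers are not available.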

ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly

Submitted by 大憨熊 on 2019-12-06 07:59:28
Question: I am trying to import data from MySQL into Hive using Sqoop. In MySQL: use sample; create table forhive(id int auto_increment, firstname varchar(36), lastname varchar(36), primary key(id)); insert into forhive(firstname, lastname) values("sample","singh"); select * from forhive; returns: 1 abhay agrawal, 2 vijay sharma, 3 sample singh. This is the Sqoop command I'm using (version 1.4.7): sqoop import --connect jdbc:mysql://********:3306/sample --table forhive --split-by id --columns id,firstname,lastname --target-dir
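The HiveConf error in the section title usually means the Sqoop launcher cannot see Hive's jars on its classpath. One commonly suggested workaround is to put Hive's libraries on the Hadoop classpath before running the import; the HIVE_HOME path below is an assumption, adjust it to your installation:

```shell
# Make org.apache.hadoop.hive.conf.HiveConf visible to the Sqoop-launched JVM.
export HIVE_HOME=/usr/lib/hive
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$HIVE_HOME/lib/*"
echo "$HADOOP_CLASSPATH"
```

Alternatively, setting HIVE_CONF_DIR to the directory containing hive-site.xml, as the error message itself suggests, addresses the configuration half of the same problem.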

Yarn: How to utilize full cluster resources?

Submitted by 删除回忆录丶 on 2019-12-06 06:11:21
Question: I have a Cloudera cluster with 7 worker nodes, each with 30 GB RAM and 4 vCPUs. Here are some of the configurations I found important (from Google) for tuning the performance of my cluster. I am running with: yarn.nodemanager.resource.cpu-vcores => 4; yarn.nodemanager.resource.memory-mb => 17 GB (the rest is reserved for the OS and other processes); mapreduce.map.memory.mb => 2 GB; mapreduce.reduce.memory.mb => 2 GB; nproc reports 4 (number of processing units available). Now my concern is, when I look at my
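One thing worth checking against these numbers: with the CapacityScheduler's default DefaultResourceCalculator, only memory is considered when sizing containers, so the achievable parallelism follows from the memory settings alone. A quick sanity-check of the figures above:

```shell
node_mem_mb=17408   # yarn.nodemanager.resource.memory-mb (17 GB)
map_mb=2048         # mapreduce.map.memory.mb (2 GB)
nodes=7
echo "containers per node (memory-bound): $((node_mem_mb / map_mb))"
echo "cluster-wide concurrent 2GB containers: $((nodes * node_mem_mb / map_mb))"
```

That works out to 8 containers per node and 59 across the cluster; if observed utilization is far below that, the bottleneck is usually elsewhere (too few input splits, queue capacity limits), since the 4-vCPU setting only constrains scheduling once DominantResourceCalculator is enabled.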

How do I view my Hadoop job history and logs using CDH4 and Yarn?

Submitted by ↘锁芯ラ on 2019-12-06 05:51:28
Question: I downloaded the CDH4 tar for Hadoop with Yarn, and jobs are running fine, but I can't figure out where to view the logs from my job. In MRv1, I simply went to the JobTracker web app, which had the job history; individual jobs' logs were accessible from there as well, or by going to the logs/userlogs directory. In my new Yarn setup (just running on a single computer), I have the logs directory, but no logs/userlogs folder. When I go to the ResourceManager web page, localhost:8088, there is an "All
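In YARN (MRv2), the JobTracker's history role is taken over by the separate MapReduce JobHistory Server (web UI on port 19888 by default), and per-container logs are only collected centrally once log aggregation is turned on. The relevant yarn-site.xml switch, as a sketch:

```xml
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```

With aggregation enabled, logs for a finished job can be fetched with yarn logs -applicationId <application id>; without it, they remain under each node's local userlogs directory.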

How to set configuration in Hive-Site.xml file for hive metastore connection?

Submitted by 醉酒当歌 on 2019-12-06 05:01:16
Question: I want to connect to the Hive metastore from Java code. I have no idea how to set the configuration in the hive-site.xml file, or where to place that file. Please help. import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.Statement; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hive.conf.HiveConf; import org.apache.hadoop.hive.conf.HiveConf.ConfVars; public class HiveMetastoreJDBCTest { public static void main(String[] args)
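For a remote metastore, the usual minimum in hive-site.xml is the Thrift URI; the hostname below is a placeholder, while 9083 is the conventional metastore port:

```xml
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>
```

HiveConf loads hive-site.xml from the classpath, so the file's directory must either be on the Java classpath of the program or be pointed to via HIVE_CONF_DIR.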