yarn

Yarn mini-cluster container log directories don't contain syslog files

醉酒当歌 submitted on 2020-01-04 15:28:20
Question: I have set up a YARN MapReduce mini-cluster with 1 node manager, 4 local and 4 log directories, and so on, based on Hadoop 2.3.0 from CDH 5.1.0. It looks more or less working. What I have failed to achieve is syslog logging from the containers. I see the container log directories with stdout and stderr files, but no syslog with the MapReduce container logging. The corresponding stderr warns that I have no log4j configuration and contains nothing else: log4j:WARN No appenders could be found for logger (org.apache.hadoop
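In stock Hadoop 2.x, container syslog output is produced by org.apache.hadoop.yarn.ContainerLogAppender, wired up through a container-log4j.properties on the container classpath. For reference, a minimal sketch of such a file, assuming the property names used by the stock appender (verify against your CDH build):

    log4j.rootLogger=INFO,CLA
    # CLA writes syslog under the directory YARN injects at container launch
    log4j.appender.CLA=org.apache.hadoop.yarn.ContainerLogAppender
    log4j.appender.CLA.containerLogDir=${yarn.app.container.log.dir}
    log4j.appender.CLA.totalLogFileSize=${yarn.app.container.log.filesize}
    log4j.appender.CLA.layout=org.apache.log4j.PatternLayout
    log4j.appender.CLA.layout.ConversionPattern=%d{ISO8601} %p [%t] %c: %m%n

If the mini-cluster launches containers without this file (or without a root logger pointing at CLA), stderr shows exactly the "No appenders" warning above and no syslog file is written.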

Using Theia: Build Your Own IDE

与世无争的帅哥 submitted on 2020-01-04 10:34:29
Previous post: Theia Architecture. Build your own IDE. This guide will show you how to build your own Theia application. Prerequisites: you need Node 10 installed (translator's note: in practice any recent stable Node works): curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.5/install.sh | bash nvm install 10 You also need yarn: npm install -g yarn Make sure Python 2.x is installed as well; check with python --version. Installation: first create an empty directory, then switch into it: mkdir my-app cd my-app In this directory, create package.json: { "private": true, "dependencies": { "typescript": "latest", "@theia/typescript": "next", "@theia/navigator": "next", "@theia/terminal": "next", "@theia/outline-view": "next", "@theia/preferences": "next", "@theia/messages": "next", "@theia/git": "next", "
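The excerpt cuts off inside package.json, so the remaining steps are not shown. Assuming the standard Theia workflow where @theia/cli is available as a dev dependency (an assumption here, since the truncated dependency list does not reach that far), the guide continues roughly like this:

    yarn              # install the dependencies declared in package.json
    yarn theia build  # bundle the application frontend
    yarn theia start  # serve the IDE, then open http://localhost:3000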

spark-submit through Java code

六月ゝ 毕业季﹏ submitted on 2020-01-04 09:38:35
Question: I am trying to run spark-submit from Java code, following this example: https://github.com/mahmoudparsian/data-algorithms-book/blob/master/misc/how-to-submit-spark-job-to-yarn-from-java-code.md But I am getting: The constructor ClientArguments(String[], SparkConf) is undefined. This is my code: import org.apache.spark.deploy.yarn.Client; import org.apache.spark.deploy.yarn.ClientArguments; import org.apache.hadoop.conf.Configuration; import org.apache.spark.SparkConf; public class
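A side note on version drift: ClientArguments is not a public API and its constructor signature has changed between Spark releases, which is why examples like the one linked stop compiling. A more stable route is the public SparkLauncher API; a minimal sketch, with the jar path and class name as placeholders:

    import org.apache.spark.launcher.SparkLauncher;

    public class SubmitToYarn {
        public static void main(String[] args) throws Exception {
            // spawns a spark-submit child process configured for YARN
            Process spark = new SparkLauncher()
                    .setAppResource("/path/to/app.jar")    // placeholder jar
                    .setMainClass("com.example.MyApp")     // placeholder class
                    .setMaster("yarn")
                    .setDeployMode("cluster")
                    .launch();
            System.out.println("exit code: " + spark.waitFor());
        }
    }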

Hadoop in LXC container error: YARN: 1/1 local-dirs are bad

我怕爱的太早我们不能终老 submitted on 2020-01-04 04:21:10
Question: I have two Hadoop instances running inside two LXC containers on the same host, hadoop-master and hadoop-slave1. When starting YARN and DFS on the master, hadoop-slave1 ends up in this UNHEALTHY state. From what I've found on the web it must be one of these two possibilities: not enough disk space, or a permission issue. a. df -h says otherwise: Filesystem Size Used Avail Use% Mounted on /dev/sda5 91G 68G 19G 79% / none 4,0K 0 4,0K 0% /sys/fs/cgroup udev 3,8G 4,0K 3,8G 1% /dev tmpfs 769M 1,3M 768M 1%
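A third thing worth checking: the node manager's disk health checker marks a local-dir bad not only when the disk is full but also when the directory is not writable by the yarn user, and (in later 2.x releases) when utilization crosses a 90% default threshold. A sketch of the threshold override in yarn-site.xml, with the property name as documented for Hadoop 2.x:

    <property>
      <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
      <value>95.0</value>
    </property>

Checking ownership of the configured yarn.nodemanager.local-dirs path inside the container (e.g. with ls -ld) covers the permission case.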

Spark (Part 2): The spark-submit command in detail

安稳与你 submitted on 2020-01-04 00:14:12
The spark-submit command lets you write scripts as reusable modules and submit jobs to Spark programmatically. It provides a single, unified API for deploying applications onto any cluster manager Spark supports, so each application does not have to be configured separately. The command-line parameters, one by one: --master sets the master URL. local runs the code on the local machine; Spark runs a single thread, and on a multi-core machine local[n] specifies using n cores while local[*] runs with as many worker threads as the machine has cores. spark://host:port is the URL and port of a Spark standalone cluster. mesos://host:port is the URL and port of a Spark cluster deployed on Mesos. yarn submits the job from a head node running YARN as the resource manager. --deploy-mode decides whether the Spark driver is launched locally (client) or on one of the worker machines inside the cluster (cluster); the default is client. --name sets the application name; note that if the application name is specified programmatically when the SparkSession is created, that overrides the command-line parameter. --py-files: a comma-separated list of .py, .egg, or .zip files
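Pulling the parameters above into one invocation, a representative submission might look like this (file names are placeholders):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --name my-app \
      --py-files deps.zip \
      my_job.py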

Hadoop log4j not working

妖精的绣舞 submitted on 2020-01-03 05:27:08
Question: My jobs run successfully on Hadoop 2.6.0, but the logger is not working at all. I always see: log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. yarn-site.xml lists the directory containing the log4j.properties file. I also tried passing it manually via the -Dlog4j.configuration option. The file is here:
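For comparison, a minimal log4j.properties that silences the warning by logging everything to the console; this is plain log4j 1.2 (the same pattern Hadoop's bundled config uses), nothing Hadoop-specific. Note also that -Dlog4j.configuration expects a URL, so a bare path needs a file: prefix:

    log4j.rootLogger=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n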

How to delete yarn logs

徘徊边缘 submitted on 2020-01-03 03:33:11
Question: I'm pretty new to YARN. I ran my Oozie jobs, which create logs. I can see the YARN logs with yarn logs -applicationId application_123456789_12345678 I want to know how I can delete those logs. Can I just delete the files to remove the logs? Answer 1: There are no yarn CLI commands to delete YARN logs. You can delete them with Linux rm by going to the YARN log directory, yarn.nodemanager.log-dirs/application_${appid}. Individual containers' log directories will be below this, in directories named
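Concretely, the cleanup could look like the following, with placeholder paths; note that if log aggregation is enabled (yarn.log-aggregation-enable), the logs live in HDFS under yarn.nodemanager.remote-app-log-dir (default /tmp/logs) rather than on the node's local disk:

    # local node-manager logs, under a directory from yarn.nodemanager.log-dirs
    rm -rf /var/log/hadoop-yarn/containers/application_123456789_12345678

    # aggregated logs in HDFS (replace <user> with the submitting user)
    hdfs dfs -rm -r /tmp/logs/<user>/logs/application_123456789_12345678

With aggregation enabled, yarn.log-aggregation.retain-seconds can also expire old logs automatically instead of requiring manual deletion.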

Hadoop 2.2 - datanode doesn't start up

a 夏天 submitted on 2020-01-03 02:58:29
Question: I had Hadoop 2.4 this morning (see my previous two questions). I have now removed it and installed 2.2, since I had issues with 2.4 and since I believe 2.2 is the latest stable release. I followed the tutorial here: http://codesfusion.blogspot.com/2013/10/setup-hadoop-2x-220-on-ubuntu.html?m=1 I am pretty sure I did everything right, but I am facing similar issues again. When I run jps it is obvious that the data node is not starting up. What am I doing wrong again? Any help would be much much
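The question is cut off before any log output, but in single-node tutorial setups the most frequent cause of a DataNode that never appears in jps is an "Incompatible clusterIDs" error after the NameNode has been re-formatted. A hedged, dev-only recovery sketch (the storage path is a placeholder; take the real one from dfs.datanode.data.dir):

    # read the actual error first
    tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log

    # if it shows Incompatible clusterIDs: wipe the datanode storage
    # (destroys local block data, acceptable only on a fresh dev cluster)
    rm -rf /path/to/hdfs/datanode/*
    start-dfs.sh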

3. Hadoop: Testing YARN and MapReduce

丶灬走出姿态 submitted on 2020-01-02 19:03:32
Testing YARN and MapReduce on Hadoop. 1. Configure YARN. (1) Configure the ResourceManager. In a production environment, a separate machine is usually brought up as the ResourceManager; here we use the Master machine in its place. Edit yarn-site.xml: <?xml version="1.0"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific
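Since the excerpt stops inside the XML license header, for orientation: the two properties such a setup typically defines are the ResourceManager address and the MapReduce shuffle auxiliary service. A sketch, with the hostname as a placeholder; verify property names against your Hadoop version:

    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>master</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>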

Flink TaskManagers do not start until job is submitted in YARN cluster

偶尔善良 submitted on 2020-01-02 18:24:30
Question: I am using Amazon EMR to run a Flink cluster on YARN. My setup consists of m4.large instances, 1 master and 2 core nodes. I started the Flink cluster on YARN with the command: flink-yarn-session -n 2 -d -tm 4096 -s 4. The Flink Job Manager and Application Master start, but no Task Managers are running; the Flink web interface shows 0 for task managers, task slots, and available slots. However, when I submit a job to the Flink cluster, Task Managers do get allocated, and the job runs and
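This behavior is expected on recent Flink releases: since the FLIP-6 resource management rework (Flink 1.5+), a YARN session allocates TaskManagers lazily when a job actually needs slots, and the -n container count is effectively ignored, so an idle session shows zero task managers. Submitting any job against the session triggers the allocation, for example (jar path is a placeholder):

    # flink run discovers the detached session via the hidden YARN properties file
    flink run ./examples/streaming/WordCount.jar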