Mesos

Running Spark driver program in Docker container - no connection back from executor to the driver?

Submitted by 邮差的信 on 2020-01-22 10:16:48
Question: UPDATE: the problem is resolved; the Docker image is here: docker-spark-submit. I run spark-submit with a fat jar inside a Docker container. My standalone Spark cluster runs on 3 virtual machines: one master and two workers. From an executor log on a worker machine, I see that the executor has the following driver URL: "--driver-url" "spark://CoarseGrainedScheduler@172.17.0.2:5001". 172.17.0.2 is actually the address of the container with the driver program, not the host machine where the …
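The executors are dialing back to the container's internal bridge address (172.17.0.2), which is unreachable from the worker VMs. One common fix is to run the driver container on the host network and advertise an address the workers can route to. A minimal sketch, where spark-master:7077, 192.168.1.10, the image, and the jar/class names are all placeholders:

```shell
# Run the driver container on the host network so executors can dial back in,
# and advertise the host's routable IP instead of the bridge address.
docker run --net=host example/spark-submit \
  spark-submit \
    --master spark://spark-master:7077 \
    --conf spark.driver.host=192.168.1.10 \
    --conf spark.driver.port=5001 \
    --conf spark.driver.bindAddress=0.0.0.0 \
    --class com.example.Main \
    /app/app.jar
```

spark.driver.host is what gets baked into the --driver-url the executors see, while spark.driver.bindAddress keeps the driver listening on all interfaces inside the container.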

Can't access websocket (ws://) url using marathon-lb

Submitted by 家住魔仙堡 on 2020-01-16 18:05:50
Question: I have a container running a Jupyter gateway, which requires two URLs to be accessible: an HTTP URL and a WebSocket URL. On localhost, for example, these URLs are http://127.0.0.1:8888 and ws://127.0.0.1:8888. When I start my app with Marathon and SSH into the Mesos slave running the container, I can access both of these URLs successfully using my client that makes requests to the Jupyter gateway, which tells me the Jupyter gateway in the container is working fine. However, when I am trying to access …
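marathon-lb is an HAProxy, and HAProxy forwards WebSocket upgrades on the same frontend as plain HTTP, so no ws-specific configuration is needed; what usually trips this up is the client targeting the container port instead of the load balancer's service port, or the app missing the label marathon-lb watches for. A sketch of an app definition posted to the Marathon API (marathon.example, the image, and the port numbers are placeholders):

```shell
# Deploy the gateway; HAPROXY_GROUP tells marathon-lb to expose servicePort 10010.
curl -X POST http://marathon.example:8080/v2/apps \
     -H 'Content-Type: application/json' \
     -d '{
  "id": "/jupyter-gateway",
  "container": {
    "type": "DOCKER",
    "docker": { "image": "example/jupyter-gateway" },
    "portMappings": [ { "containerPort": 8888, "servicePort": 10010 } ]
  },
  "networks": [ { "mode": "container/bridge" } ],
  "labels": { "HAPROXY_GROUP": "external" },
  "instances": 1, "cpus": 0.5, "mem": 512
}'
# Clients then use http://<lb-host>:10010 and ws://<lb-host>:10010,
# not the container port 8888.
```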

An introduction to Mesos and Docker for continuous delivery

Submitted by 纵饮孤独 on 2020-01-16 03:45:45
The word "change" is commonplace in today's digital era; IT goes through a wave of innovation every so often, from Web 2.0, virtualization, cloud computing, big data, microservice architecture, and DevOps to today's containers with Docker and Mesos.

Docker makes it convenient to test, deploy, and upgrade applications: it packages an application together with the runtime environment it depends on into a standard container/image, which can then be shipped to and run on different platforms. Docker's light weight, fast deployment, and easy migration have helped DevOps take hold; with containers, developers can be integrated smoothly into the product delivery pipeline.

Mesos is a best practice of the software-defined data center. In plain terms, its idea is to let operators manage a data center the way they would a single server, scheduling and managing the data center's CPU, memory, storage, and other resources as if they all lived inside one machine. That sounds lofty; in more technical language, Mesos is a unified cluster resource management and scheduling platform: the various service frameworks of a production environment are deployed on one shared cluster, share the cluster's resources, and Mesos schedules those resources uniformly and offers them to the frameworks. Mesos greatly simplifies management for IaaS, PaaS, and operations.

In practice, Mesos and Docker are ideal companions: the former provides unified resource management, the latter provides resource isolation; together or apart, they are effective at different layers. Each also has its own technology ecosystem, and the two ecosystems promote and reinforce each other.

Does Apache Mesos recognize GPU cores?

Submitted by 偶尔善良 on 2020-01-13 02:42:10
Question: In slide 25 of this talk by Twitter's Head of Open Source office, the presenter says that Mesos allows one to track and manage even GPU (I assume he meant GPGPU) resources. But I can't find any information on this anywhere else. Can someone please help? Besides Mesos, are there other cluster managers that support GPGPU? Answer 1: Mesos does not yet provide direct support for (GP)GPUs, but does support custom resource types. If you specify --resources="gpu(*):8" when starting the mesos-slave, then …
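The flag from the answer goes on the agent's command line. A sketch (the ZooKeeper address is a placeholder); note that with a custom resource Mesos only offers and accounts for the 8 "gpu" units — it does not isolate or assign physical devices, so mapping units to actual GPUs is left to the framework:

```shell
# Advertise 8 units of a custom "gpu" resource on this agent.
# Mesos treats these as opaque scalar units, like cpus or mem.
mesos-slave --master=zk://zk.example:2181/mesos \
            --resources="gpu(*):8"
```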

[Original] Experience notes (98): mesos-slave fails to start

Submitted by 允我心安 on 2020-01-10 23:42:55
The mesos slave fails to start; the status looks like this:

# systemctl status mesos-slave
● mesos-slave.service - Mesos Slave
   Loaded: loaded (/usr/lib/systemd/system/mesos-slave.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Sat 2019-12-28 21:41:50 CST; 13s ago
  Process: 15627 ExecStart=/usr/bin/mesos-init-wrapper slave (code=exited, status=1/FAILURE)
 Main PID: 15627 (code=exited, status=1/FAILURE)
Dec 28 21:41:50 test-003 systemd[1]: Unit mesos-slave.service entered failed state.
Dec 28 21:41:50 test-003 systemd[1]: mesos-slave.service failed.

The mesos-slave log looks like this: # …
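The systemd status only reports exit code 1; the real cause has to come from the logs. A sketch of the usual next steps (the log path and work-dir location are package defaults and may differ on this host):

```shell
# Pull the actual error out of the systemd journal and the Mesos log directory.
journalctl -u mesos-slave --since "10 min ago"
tail -n 50 /var/log/mesos/mesos-slave.ERROR

# A common cause after changing agent flags is stale recovery state in the
# work dir; removing the "latest" symlink discards it so the agent registers
# fresh. This loses recoverable task state -- a last resort.
rm -f /var/lib/mesos/meta/slaves/latest
```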

A comparison of the three mainstream resource management and scheduling systems: Borg, Mesos, and YARN

Submitted by 筅森魡賤 on 2020-01-10 08:18:45
Reposted from: A comparison of the three mainstream resource management and scheduling systems: Borg, Mesos, and YARN

0. Preface

Mesos (Twitter), YARN (Apache), and Borg (Google) can fairly be called the pioneers among resource management and scheduling systems; most existing systems of this kind borrow design ideas from these three. Comparing and summarizing them helps in understanding the current state of resource management and scheduling systems and where they are heading.

It is worth noting in particular that the ideas put forward by Borg directly shaped the evolution of this field: for example, its co-location of online and offline (batch) workloads and its resource oversubscription were more than a decade ahead of the industry. Only today are some industry systems beginning to explore co-location, while Borg proposed it over ten years ago and has long used it maturely inside Google.

Of the three, Borg appeared first. It is Google's internal resource management system and was never made public until a paper describing it was published in 2015. Mesos came next, with its paper published around 2012, and the YARN paper followed in 2013. The three systems differ considerably in architecture, target scenarios, and implementation approach, as described below along several dimensions.

1. Architecture

Borg architecture / Mesos architecture / YARN architecture

What they share: all three are built on a master/slave architecture, where the master is responsible for global resource allocation and the slave nodes collect and report information about their own node. The implementation details, however, differ substantially.

Error when building Mesos

Submitted by 守給你的承諾、 on 2020-01-05 04:10:05
Question: I've been trying to build Apache Mesos on CentOS 7. When I run make I get the following error:

Downloading: https://repo.maven.apache.org/maven2/org/apache/apache/11/apache-11.pom
[ERROR]
[ERROR] Some problems were encountered while processing the POMs:
[FATAL] Non-resolvable parent POM for org.apache.mesos:mesos:0.28.2: Could not transfer artifact org.apache:apache:pom:11 from/to central (https://repo.maven.apache.org/maven2): repo.maven.apache.org: unknown error and 'parent.relativePath' …
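"unknown error" while transferring the parent POM usually means the build host cannot reach repo.maven.apache.org at all — typically DNS failure or a mandatory HTTP proxy. A sketch of the usual checks and the proxy fix (proxy.example:3128 is a placeholder):

```shell
# First confirm the host can resolve and reach Maven Central.
getent hosts repo.maven.apache.org
curl -sI https://repo.maven.apache.org/maven2/ | head -1

# If the host must go through a proxy, tell Maven about it in settings.xml.
cat > ~/.m2/settings.xml <<'EOF'
<settings>
  <proxies>
    <proxy>
      <active>true</active>
      <protocol>https</protocol>
      <host>proxy.example</host>
      <port>3128</port>
    </proxy>
  </proxies>
</settings>
EOF
```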

Mesos task - Failed to accept socket: future discarded

Submitted by 跟風遠走 on 2020-01-04 14:14:06
Question: I am trying to upgrade the Mesos version from 1.0.3 to 1.3.1. The Chronos scheduler is able to schedule the job through Mesos. The job runs fine and I can see the Mesos stdout logs. But I still see the following in the Mesos stderr logs. The Docker job runs fine, but the status still shows as failed, with the logs below:

I0905 22:05:00.824811 456 exec.cpp:162] Version: 1.3.1
I0905 22:05:00.829165 459 exec.cpp:237] Executor registered on agent c63c93dc-3d9f-4322-9f82-0553fd1324fe-S0
E0905 22:05:11 …
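"Failed to accept socket: future discarded" in executor stderr generally points at a connection torn down by a timeout rather than at the job itself. Whether that is the cause here is an assumption, but after an upgrade the agent's timeout flags are worth checking. A sketch (the ZooKeeper address is a placeholder and the values are examples, not recommendations):

```shell
# Give the executor longer to register and containers longer to stop before
# the agent gives up on them.
mesos-slave --master=zk://zk.example:2181/mesos \
            --executor_registration_timeout=5mins \
            --docker_stop_timeout=30secs
```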

How to remove orphaned tasks in Apache Mesos?

Submitted by 老子叫甜甜 on 2020-01-04 01:36:18
Question: The problem may be caused by Mesos and Marathon being out of sync, but the solution mentioned on GitHub doesn't work for me. When I found the orphaned tasks, what I did was restart Marathon. Marathon does not re-sync the orphaned tasks, but starts new tasks instead. The orphaned tasks still hold their resources, so I have to delete them. I found all the orphaned tasks under framework ef169d8a-24fc-41d1-8b0d-c67718937a48-0000; curl -XGET http://c196:5050/master/frameworks shows that framework under unregistered_frameworks: { …
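The Mesos master exposes a teardown endpoint for exactly this situation: POSTing a framework ID kills all of that framework's tasks and releases their resources. Using the master and framework ID from the question:

```shell
# Tear down the unregistered framework and free the resources its
# orphaned tasks are holding.
curl -X POST http://c196:5050/master/teardown \
     -d 'frameworkId=ef169d8a-24fc-41d1-8b0d-c67718937a48-0000'
```

On a cluster with authentication enabled, the request additionally needs the operator's credentials.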