Mesos

Learning Mesos: cgroups

99封情书 submitted on 2019-12-04 12:32:42
I've recently been digging into the Mesos source code. Mesos uses cgroups for resource isolation, and since I had no prior exposure to Linux containers, I wrote some small programs and ran a few experiments to learn how cgroups work. The /proc/mounts file lists the file systems currently mounted on the system, in the same format as /etc/mtab; it also reflects file systems mounted by hand that do not appear in /etc/mtab. When cgroups are mounted, their mount points show up in /proc/mounts as well. On my machine the /proc/mounts entries read, from left to right: the file system name, the absolute path of the mount point, the file system type, the mount options, the dump frequency, and the fsck pass number. I mounted the cpu and memory cgroup subsystems under /home/test_dir/cgroups, and the net_cls subsystem under /home/test_dir/cgroups2, by entering the following two commands, after which both directories appear in /proc/mounts. /proc/cgroups records the state of every cgroup subsystem: from left to right, the columns are the subsystem name, the hierarchy ID, the number of cgroups in that hierarchy, and whether the subsystem is enabled (1 enabled, 0 disabled). So what are these two files good for? We can check whether /proc/cgroups exists to tell whether cgroups are available on a machine, and we can parse
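The excerpt above refers to two mount commands without showing them. A minimal sketch of what they would likely look like, assuming cgroup v1 and the directories named in the text (the original commands are not included in this excerpt):

    # Hypothetical reconstruction: cgroup v1 subsystems are mounted as
    # filesystems of type "cgroup", with -o naming the subsystems attached
    # to that hierarchy.
    mount -t cgroup -o cpu,memory cgroup /home/test_dir/cgroups
    mount -t cgroup -o net_cls cgroup /home/test_dir/cgroups2
    # Both mount points then show up as "cgroup" entries in /proc/mounts:
    grep cgroup /proc/mounts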

Accessing HDFS HA from spark job (UnknownHostException error)

烈酒焚心 submitted on 2019-12-04 12:17:31
Question: I have an Apache Mesos 0.22.1 cluster (3 masters and 5 slaves), running Cloudera HDFS (2.5.0-cdh5.3.1) in an HA configuration and the Spark 1.5.1 framework. When I try to spark-submit the compiled HdfsTest.scala example app (from the Spark 1.5.1 sources), it fails with a java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfs error in the executor logs. The error is only observed when I pass the HDFS HA path as an argument, hdfs://hdfs/<file>; when I pass hdfs://namenode1.hdfs.mesos:50071/testfile -
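The logical nameservice "hdfs" in the HA path can only be resolved if the executors see the HA client configuration. A minimal sketch of one common approach, where the config directory and master address are assumptions:

    # Hypothetical sketch: point Mesos executors at a Hadoop config directory
    # whose hdfs-site.xml defines the "hdfs" nameservice (dfs.nameservices,
    # dfs.ha.namenodes.hdfs, dfs.namenode.rpc-address.hdfs.*, and a failover
    # proxy provider such as ConfiguredFailoverProxyProvider).
    spark-submit --master mesos://<master>:5050 \
      --conf spark.executorEnv.HADOOP_CONF_DIR=/etc/hadoop/conf \
      --class org.apache.spark.examples.HdfsTest \
      spark-examples.jar hdfs://hdfs/<file>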

Does Apache Mesos recognize GPU cores?

佐手、 submitted on 2019-12-04 07:58:08
In slide 25 of this talk by Twitter's Head of Open Source office, the presenter says that Mesos allows one to track and manage even GPU (I assume he meant GPGPU) resources. But I can't find any information on this anywhere else. Can someone please help? Besides Mesos, are there other cluster managers that support GPGPU? Mesos does not yet provide direct support for (GP)GPUs, but it does support custom resource types. If you specify --resources="gpu(*):8" when starting the mesos-slave, then this will become part of the resource offer to frameworks, which can launch tasks that claim to use these
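A minimal sketch of the custom-resource approach described in the answer, with an assumed master address:

    # Hypothetical sketch: advertise 8 GPUs as a custom scalar resource.
    # Mesos offers "gpu" to frameworks like any other resource, but does not
    # itself isolate or verify actual GPU usage.
    mesos-slave --master=<master-ip>:5050 --resources="cpus(*):16;gpu(*):8"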

How should a .dockercfg file be hosted in a Mesosphere-on-AWS setup so that only Mesosphere can use it?

百般思念 submitted on 2019-12-04 07:00:20
We have set up a test cluster with Mesosphere on AWS, in a private VPC. We have some Docker images which are public, and those are easy enough to deploy. However, most of our services are private images, hosted on the Docker Hub private plan, and require authentication to access. Mesosphere is capable of private registry authentication, but it achieves this in a not-exactly-ideal way: an HTTPS URI to a .dockercfg file needs to be specified in all Mesos/Marathon task definitions. As the title suggests, the question is basically: how should the .dockercfg file be hosted within AWS so that access may
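A minimal sketch of the mechanism the question describes, with made-up bucket and app names: Marathon fetches the URIs listed in an app definition into the task sandbox before pulling the image.

    # Hypothetical sketch: app definition whose "uris" entry points at the
    # hosted .dockercfg, posted to Marathon's app-creation endpoint.
    curl -X POST http://marathon.mesos:8080/v2/apps \
      -H 'Content-Type: application/json' \
      -d '{
        "id": "/private-image-app",
        "container": {"type": "DOCKER", "docker": {"image": "myorg/private-service"}},
        "uris": ["https://s3.amazonaws.com/my-bucket/.dockercfg"]
      }'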

How to run a one-off task with Apache Mesos/Marathon?

三世轮回 submitted on 2019-12-04 02:01:50
I'm trying to run a one-off task with Marathon. I'm able to get the task container running, but after the task command completes, Marathon runs another task, and so on. How can I prevent Marathon from running more than one task/command? Or, if this is not possible with Marathon, how can I achieve the desired behaviour? Mik: As a hack you can kill the Marathon task at the end, as suggested here: https://github.com/mesosphere/marathon/issues/344#issuecomment-86697361 As rukletsov already mentioned, Marathon is designed for long-running tasks: https://stackoverflow.com/a/26647789/1047843 If Chronos
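A minimal sketch of the kill-at-the-end hack from the linked issue, where the app id and Marathon address are assumptions:

    # Hypothetical sketch: run the one-off command, then have the task delete
    # its own Marathon app so Marathon does not restart it.
    ./run-one-off-job.sh; curl -X DELETE http://marathon.mesos:8080/v2/apps/one-off-job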

Understanding resource allocation for Spark jobs on Mesos

淺唱寂寞╮ submitted on 2019-12-03 14:52:47
I'm working on a project in Spark, and recently switched from Spark Standalone to Mesos for cluster management. I now find myself confused about how to allocate resources when submitting a job under the new system. In standalone mode, I was using something like this (following some recommendations from this Cloudera blog post):

    /opt/spark/bin/spark-submit --executor-memory 16G --executor-cores 8 --total-executor-cores 240 myscript.py

This is on a cluster where each machine has 16 cores and ~32 GB of RAM. What was nice about this is that I had fine-grained control over the number of executors
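For comparison, a minimal sketch of the rough Mesos equivalent under Spark 1.x coarse-grained mode, where the total core cap is set via spark.cores.max and per-executor core counts are not directly controllable (the master address is an assumption):

    # Hypothetical sketch: cap the job at 240 cores total across the cluster.
    # In Spark 1.x coarse-grained Mesos mode there is one executor per node,
    # so executor count and sizing follow from the node count and this cap.
    /opt/spark/bin/spark-submit \
      --master mesos://<master>:5050 \
      --conf spark.cores.max=240 \
      --executor-memory 16G \
      myscript.py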

Transport Endpoint Not Connected - Mesos Slave / Master

Anonymous (unverified) submitted on 2019-12-03 01:49:02
Question: I'm trying to connect a Mesos slave to its master. Whenever the slave tries to connect to the master, I get the following message:

    I0806 16:39:59.090845   935 hierarchical.hpp:528] Added slave 20150806-163941-1027506442-5050-921-S3 (debian) with cpus(*):1; mem(*):1938; disk(*):3777; ports(*):[31000-32000] (allocated: )
    E0806 16:39:59.091384   940 socket.hpp:107] Shutdown failed on fd=25: Transport endpoint is not connected [107]
    I0806 16:39:59.091508   940 master.cpp:3395] Registered slave 20150806-163941-1027506442-5050-921-S3 at slave(1)@127.0.1
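The truncated address in the last log line begins with a loopback prefix (127.0.1...), which is a common cause of this error: the slave registers at an address the master cannot reach back. A minimal sketch of one usual remedy, assuming the slave has a routable address:

    # Hypothetical sketch: force the slave to advertise a reachable IP instead
    # of the loopback entry it picked up from /etc/hosts.
    mesos-slave --master=<master-ip>:5050 --ip=<slave-routable-ip>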

Apache Chronos Architecture Explanation

心不动则不痛 submitted on 2019-12-02 15:10:49
Question: I was trying to see what makes Chronos better than cron, but I am not able to completely understand its job scheduling and execution architecture. Specifically, these are the questions around the Chronos architecture that are not clear to me. In one of the Chronos documents I read that since cron has a SPoF, cron is bad and Chronos is better. How does Chronos avoid a SPoF? Where are job schedules saved in Chronos? Does it maintain some sort of DB for that? How are scheduled jobs triggered, and who sends
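For context on how Chronos replaces a crontab entry, a minimal sketch of defining a job over its REST API, where the host, port, and job fields are assumptions:

    # Hypothetical sketch: Chronos jobs are submitted as JSON with an ISO 8601
    # repeating-interval schedule; Chronos persists the job and launches the
    # command on the Mesos cluster rather than on a single cron host.
    curl -X POST http://chronos.mesos:4400/scheduler/iso8601 \
      -H 'Content-Type: application/json' \
      -d '{"name": "nightly-report", "command": "./generate_report.sh", "schedule": "R/2019-12-03T02:00:00Z/P1D"}'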

Linked Docker Containers with Mesos/Marathon

笑着哭i submitted on 2019-12-02 14:31:58
I'm having great success so far using Mesos, Marathon, and Docker to manage a fleet of servers and the containers I'm placing on them. However, I'd now like to go a bit further and start doing things like automatically linking an haproxy container to each main Docker service that starts, or providing other daemon-based, containerized services that are linked and available only to the single parent container. Normally, I'd start the helper service first with some name, then when I started the real service, I'd link it to the helper and everything would be fine. How does this model fit in
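Docker-style --link semantics don't map directly onto Marathon's model; the usual substitute is service discovery, for example resolving a helper's host and port from Marathon's task API. A minimal sketch, with a made-up app id and Marathon address:

    # Hypothetical sketch: look up where Marathon placed the helper service and
    # feed that endpoint to the dependent container instead of a docker link.
    curl -s -H 'Accept: application/json' \
      http://marathon.mesos:8080/v2/apps/haproxy-helper/tasks \
      | python -c 'import json,sys; t = json.load(sys.stdin)["tasks"][0]; print("%s:%d" % (t["host"], t["ports"][0]))'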