Is Hadoop in Docker container faster/worth it? [closed]

Submitted by 感情迁移 on 2019-12-06 04:06:07

Is it maybe faster, or why is it worth it?

It sounds like you already have a Hadoop cluster. So you have to ask yourself, how long does it take to reproduce this environment? How often do you need to reproduce this environment?

If you don't need a way to reproduce the environment repeatedly, or to contain dependencies that may conflict with other applications on the host, then I don't yet see a use case for you.

What are the advantages?

If you are running Hadoop in an environment where you may need mixed Java versions, then running it as a container could isolate the dependencies (in this case, Java) from the host system. In some cases, it would also give you a more easily reproducible artifact to move around and set up. But Java apps are already fairly simple to ship, since their dependencies can be bundled in the JAR.
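To make the isolation point a bit more concrete, here is a minimal sketch that only builds (and prints) two `docker run` command lines, each pinned to a different Java runtime image, so neither depends on the Java installed on the host. The image tags, the `/opt/jobs` path, and `job.jar` are assumptions for illustration, and `docker_run_command` is a hypothetical helper, not part of Docker or Hadoop.

```python
import shlex


def docker_run_command(image, command, mounts=None):
    """Build the argv for a throwaway `docker run`, with optional bind mounts.

    Hypothetical helper for illustration; it does not invoke Docker itself.
    """
    argv = ["docker", "run", "--rm"]
    for host_path, container_path in (mounts or {}).items():
        # -v host:container bind-mounts a host directory into the container
        argv += ["-v", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += list(command)
    return argv


# Assumed image tags; substitute whichever Java images you actually use.
JAVA8_IMAGE = "eclipse-temurin:8-jre"
JAVA11_IMAGE = "eclipse-temurin:11-jre"

# Two workloads on the same host, each with its own Java version.
legacy = docker_run_command(JAVA8_IMAGE, ["java", "-version"])
newer = docker_run_command(
    JAVA11_IMAGE,
    ["java", "-jar", "/app/job.jar"],  # job.jar is a placeholder name
    mounts={"/opt/jobs": "/app"},      # assumed host path holding the JAR
)

print(shlex.join(legacy))
print(shlex.join(newer))
```

The Java version travels with the image rather than with the host, which is the isolation benefit described above. Whether that is worth the extra moving parts still depends on how often you actually hit version conflicts.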

Maybe only a multi-node Cassandra cluster should be dockerized?

I don't think it really comes down to whether it is a multi-node environment or not. It comes down to the problems it solves. It doesn't sound like you have any pain point in deploying or reproducing Hadoop environments (yet), so I don't see the need to "dockerize" something just because it is the hot new thing on the block.

When you do have the need to reproduce the Hadoop environment easily, you might look at Docker for some of the orchestration and management tools (Kubernetes, Rancher, etc.) which make deploying and managing clusters of applications on an overlay network much more appetizing than just regular Docker. Docker is just the tool in my eyes. It really starts to shine when you can leverage some of the neat overlay multi-host networking, discovery, and orchestration that other packages are building on top of it.
