Multiple RUN vs. single chained RUN in Dockerfile, what is better?

轻奢々 2020-11-29 16:16

Dockerfile.1 executes multiple RUN instructions:

FROM busybox
RUN echo This is the A > a
RUN echo This is the B > b
RUN echo This is the C > c


        
4 answers
  •  北荒 (original poster)
    2020-11-29 17:04

    The official answer is listed in Docker's best practices (official images MUST adhere to these):

    Minimize the number of layers

    You need to find the balance between readability (and thus long-term maintainability) of the Dockerfile and minimizing the number of layers it uses. Be strategic and cautious about the number of layers you use.

    Since Docker 1.10, the COPY, ADD and RUN instructions each add a new layer to your image. Be cautious when using them. Try to combine commands into a single RUN instruction, and split them up only if readability requires it.
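
    As a minimal sketch of that advice, the three-file example from the question could be collapsed into one RUN instruction. The output file names follow the question's pattern (the name c is an assumption), and the line continuations are only there for readability.

    ```dockerfile
    FROM busybox
    # One RUN instruction, so all three files land in a single layer.
    RUN echo This is the A > a && \
        echo This is the B > b && \
        echo This is the C > c
    ```

    Chaining with && rather than ; makes the whole instruction fail as soon as any one command fails.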

    More info: https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#/minimize-the-number-of-layers

    Update: multi-stage builds in Docker >= 17.05

    With multi-stage builds you can use multiple FROM statements in your Dockerfile. Each FROM statement is a stage and can have its own base image. In the final stage you use a minimal base image like alpine, copy the build artefacts from previous stages and install runtime requirements. The end result of this stage is your image. So this is where you worry about the layers as described earlier.

    As usual, Docker has great docs on multi-stage builds. Here's a quick excerpt:

    With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.

    A great blog post about this can be found here: https://blog.alexellis.io/mutli-stage-docker-builds/
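
    The multi-stage pattern described above might look roughly like this. It is only a sketch: it assumes the build context holds a Go module with its main package at the root, and the image tags, the stage name builder and the paths /src, /out/app and /usr/local/bin/app are illustrative choices, not something the docs prescribe.

    ```dockerfile
    # Stage 1: compile with the full toolchain. Nothing from this stage
    # ends up in the final image unless it is copied explicitly.
    FROM golang:1.21 AS builder
    WORKDIR /src
    COPY . .
    # Static binary so it runs on the minimal base image below.
    RUN CGO_ENABLED=0 go build -o /out/app .

    # Stage 2: minimal runtime image. Only these layers are shipped.
    FROM alpine:3.19
    COPY --from=builder /out/app /usr/local/bin/app
    ENTRYPOINT ["/usr/local/bin/app"]
    ```

    Because only the last stage becomes the image, the builder stage can use as many separate RUN instructions as readability calls for.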

    To answer your points:

    1. Yes, layers are sort of like diffs. I don't think a layer is added if there are absolutely zero changes. The problem is that once you install or download something in layer #2, you cannot remove it in layer #3. Once something has been written to a layer, removing it later no longer reduces the image size.

    2. Although layers can be pulled in parallel, which can make pulls faster, every layer adds to the image size, even one that only removes files.

    3. Yes, caching is useful if you're updating your Dockerfile. But it works in one direction only. If you have 10 layers and you change layer #6, you'll still have to rebuild everything from layer #6 through #10. So caching won't speed up the build all that often, while the extra layers are guaranteed to increase the size of your image unnecessarily.
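
    Points 1 and 2 can be checked with a small experiment. This is a sketch, and the file name /big and its 10 MB size are arbitrary choices for illustration.

    ```dockerfile
    FROM busybox
    # Layer 1: writes a 10 MB file into the image.
    RUN dd if=/dev/zero of=/big bs=1024 count=10240
    # Layer 2: hides the file from the final filesystem, but layer 1
    # still stores its 10 MB, so the image does not get smaller.
    RUN rm /big

    # Alternative with one layer: the file is created and deleted before
    # the layer is committed, so it never reaches the image.
    # RUN dd if=/dev/zero of=/big bs=1024 count=10240 && rm /big
    ```

    After building, docker history on the resulting image lists the size contributed by each layer, which makes the difference between the two variants visible.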


    Thanks to @Mohan for reminding me to update this answer.
