Why are Docker container images so large?

被撕碎了的回忆 2020-11-30 17:32

I built a simple image from a Dockerfile based on Fedora (initially 320 MB).

I then added Nano (a tiny editor, about 1 MB), and the image size rose to 530 MB. [...]

8 Answers
  • 2020-11-30 17:41

    Docker Squash is a really nice solution to this. You can run $packagemanager clean in the last step instead of on every line, and then run docker-squash to collapse all of the layers into one.

    https://github.com/jwilder/docker-squash

  • 2020-11-30 17:50

    Here are some more things you can do:

    • Avoid multiple RUN commands where you can. Put as much as possible into one RUN command (chained with &&).
    • Clean up tools like wget or git that you only need for downloading or building, not for running your process.

    With both of these and the recommendations from @Andy and @michau, I was able to shrink my nodejs image from 1.062 GB to 542 MB.

    Edit: One more important thing: "It took me a while to really understand that each Dockerfile command creates a new container with the deltas. [...] It doesn't matter if you rm -rf the files in a later command; they continue exist in some intermediate layer container." So I put apt-get install, wget, npm install (with git dependencies) and apt-get remove into a single RUN command, and my image is now only 438 MB.
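
    A minimal sketch of that single-RUN pattern, assuming a Debian-based slim Node image and a package.json in the build context. The image tag, the /app path and the package list are illustrative assumptions, not taken from the project described above:

    ```dockerfile
    # Sketch only: base image tag, paths and packages are assumptions.
    FROM node:18-slim
    WORKDIR /app
    COPY package.json ./
    # Install the build-only tools, use them, and remove them again in the
    # same RUN, so they never end up in a committed layer.
    RUN apt-get update && \
        apt-get install -y --no-install-recommends git wget ca-certificates && \
        npm install && \
        apt-get purge -y git wget && \
        apt-get autoremove -y && \
        rm -rf /var/lib/apt/lists/*
    ```

    Splitting the purge into its own RUN would leave git and wget in the earlier layer, so the image would not get any smaller.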

    Edit 29/06/17

    Docker v17.06 introduced a new feature for Dockerfiles: you can have multiple FROM statements inside one Dockerfile, and only the contents of the last FROM stage end up in your final image. This is useful for reducing image size, for example:

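    # Build stage: holds the toolchain that is only needed at build time
    FROM nodejs AS builder
    WORKDIR /var/my-project
    RUN apt-get update && \
        apt-get install -y ruby python git openssh gcc && \
        git clone my-project . && \
        npm install
    
    # Final stage: only the project directory is copied over
    FROM nodejs
    COPY --from=builder /var/my-project /var/my-project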
    

    This results in an image that contains only the nodejs base image plus the contents of /var/my-project from the first stage, but without ruby, python, git, openssh and gcc!

  • 2020-11-30 17:52

    Docker images are not large; you are just building large images.

    The scratch image is 0B and you can use that to package up your code if you can compile your code into a static binary. For example, you can compile your Go program and package it on top of scratch to make a fully usable image that is less than 5MB.
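
    A minimal sketch of the scratch approach, assuming a Go module with a main package in the build context. The golang tag and the /app path are assumptions, and it relies on a multi-stage build (Docker 17.05 or later):

    ```dockerfile
    # Sketch only: image tag and paths are assumptions.
    FROM golang:1.22 AS build
    WORKDIR /src
    COPY . .
    # CGO_ENABLED=0 produces a statically linked binary that needs no libc.
    RUN CGO_ENABLED=0 go build -o /app .

    # The final image contains nothing but the binary.
    FROM scratch
    COPY --from=build /app /app
    ENTRYPOINT ["/app"]
    ```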

    The key is to not use the official Docker images, because they are too big. Scratch isn't all that practical either, so I'd recommend using Alpine Linux as your base image. It is ~5MB; then add only what your app requires. This post about Microcontainers shows you how to build very small images based on Alpine.
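
    For the Nano and Git case from the question, an Alpine-based image could look roughly like this (a sketch; the Alpine tag is an assumption, pin whichever release you actually use):

    ```dockerfile
    # Sketch only: the tag is an assumption.
    FROM alpine:3.19
    # --no-cache fetches the package index on the fly and leaves no
    # package cache behind in the layer.
    RUN apk add --no-cache nano git
    ```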

    UPDATE: many official Docker images now offer Alpine-based variants, so they are fine to use.

  • 2020-11-30 17:55

    As a best practice, do the work in a single RUN command, because every RUN instruction in the Dockerfile writes a new layer into the image and every layer takes extra space on disk. To keep the number of layers to a minimum, any file manipulation such as installing, moving, extracting or removing should ideally happen within a single RUN instruction:

    FROM fedora:latest
    RUN yum -y install nano git && yum -y clean all
    
  • 2020-11-30 17:57

    Yes, the layer system is quite surprising. If you take a base image and extend it as follows:

    # Test
    #
    # VERSION       1
    
    # Start from the centos7/wildfly base image
    FROM centos7/wildfly
    MAINTAINER JohnDo
    
    # Build it with: docker build -t "centos7/test" test/
    
    # Change user into root
    USER root
    
    # Delete files that were added by the base image
    RUN rm -rf /tmp/* \
        && rm -rf /wildfly/*
    

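    The resulting image has exactly the same size as the base image, even though files were deleted: the new layer only hides them, and the data still lives in the layers below. In practice, this means you have to pack a lot of extract, install and cleanup magic into your RUN steps to get images that are only as large as the software installed.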

    This makes life much harder...

    What docker build is missing is a way to run steps without committing a layer.
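
    The effect is easy to reproduce. A minimal sketch, assuming fedora:latest as the base and a throwaway 100 MB file (the file names are made up for illustration):

    ```dockerfile
    # Sketch only: file names and sizes are arbitrary.
    FROM fedora:latest
    # Layer A: adds a 100 MB file.
    RUN dd if=/dev/zero of=/tmp/big bs=1M count=100
    # Layer B: only records that the file is gone; layer A still holds the data.
    RUN rm /tmp/big
    # Same work inside one layer: the file is removed before the layer is committed.
    RUN dd if=/dev/zero of=/tmp/big2 bs=1M count=100 && rm /tmp/big2
    ```

    Running docker history on the resulting image should show roughly 100 MB for the first RUN and next to nothing for the other two.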

  • 2020-11-30 18:00

    As @rexposadas said, images include all the layers, and each layer includes all the dependencies of what you installed. It is also important to note that base images (like fedora:latest) tend to be very bare-bones. You may be surprised by the number of dependencies your installed software has.

    I was able to make your installation significantly smaller by adding yum -y clean all to each line:

    FROM fedora:latest
    RUN yum -y install nano && yum -y clean all
    RUN yum -y install git && yum -y clean all
    

    It is important to do that for each RUN, before the layer gets committed, or else deletes don't actually remove data. That is, in a union/copy-on-write file system, cleaning at the end doesn't really reduce file system usage because the real data is already committed to lower layers. To get around this you must clean at each layer.

    $ docker history bf5260c6651d
    IMAGE               CREATED             CREATED BY                                      SIZE
    bf5260c6651d        4 days ago          /bin/sh -c yum -y install git; yum -y clean a   260.7 MB
    172743bd5d60        4 days ago          /bin/sh -c yum -y install nano; yum -y clean    12.39 MB
    3f2fed40e4b0        2 weeks ago         /bin/sh -c #(nop) ADD file:cee1a4fcfcd00d18da   372.7 MB
    fd241224e9cf        2 weeks ago         /bin/sh -c #(nop) MAINTAINER Lokesh Mandvekar   0 B
    511136ea3c5a        12 months ago                                                       0 B
    