How does Docker know when to use the cache during a build and when not?

温柔的废话 2020-12-24 01:41

I'm amazed at how well Docker's layer caching works, but I'm also wondering how it determines whether it may use a cached layer or not.

Let's take these build

2 Answers
  • 2020-12-24 02:27

    It's because your package.json file has been modified; see the "Removing intermediate container" lines in the build output.

    That is also why the package manager's manifest files (vendor/third-party dependency info) are usually COPY'ed first during docker build. After that you run the package-manager installation, and only then do you add the rest of your application, i.e. src.

    If your dependencies have not changed, these steps are served from the build cache.
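
    As a rough illustration of that ordering, here is a minimal sketch for a Node.js project. The base image tag, the file names (package-lock.json, src/index.js) and the /app directory are assumptions made for the example, not details taken from the question.

    ```dockerfile
    # Assumed layout: package.json, package-lock.json and src/ in the build context.
    FROM node:20-alpine
    WORKDIR /app

    # Dependency manifests first: this layer is rebuilt only when these files change.
    COPY package.json package-lock.json ./

    # Served from the cache as long as the COPY above was a cache hit.
    RUN npm ci

    # Application source last: edits here invalidate only this layer and the ones after it.
    COPY src/ ./src/

    CMD ["node", "src/index.js"]
    ```

    With this layout, editing a file under src/ should leave the npm ci layer cached, while touching package.json forces the install to run again.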

  • 2020-12-24 02:28

    The build cache process is explained fairly thoroughly in the Dockerfile best practices build cache section.

    • Starting with a base image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.

    • In most cases simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require a little more examination and explanation.

    • For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.

    • Aside from the ADD and COPY commands, cache checking will not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command the files updated in the container will not be examined to determine if a cache hit exists. In that case just the command string itself will be used to find a match.

    Once the cache is invalidated, all subsequent Dockerfile commands will generate new images and the cache will not be used.
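
    To make those rules concrete, the following annotated sketch marks how each instruction would be matched against the cache. The base image, the package being installed and the config.yml file are hypothetical and only serve the illustration.

    ```dockerfile
    # A cached base image is the starting point of the lookup.
    FROM debian:bookworm-slim

    # Matched on the command string alone; the files it writes are never inspected,
    # so newer packages upstream do not cause a cache miss here.
    RUN apt-get update && apt-get install -y --no-install-recommends curl

    # Matched on a checksum of the copied file; last-modified and last-accessed
    # times are ignored.
    COPY config.yml /etc/myapp/config.yml

    # Reused only if every instruction above was a cache hit; a changed config.yml
    # forces this step to run again even though its command string is identical.
    RUN echo "configured" > /etc/myapp/.ready
    ```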

    You will run into situations where OS packages, NPM packages or a Git repo have been updated to newer versions (say, a ~2.3 semver range in package.json), but because neither your Dockerfile nor package.json has changed, Docker will continue using the cache.

    It's possible to programmatically generate a Dockerfile that busts the cache by changing specific lines based on smarter checks (e.g. retrieving the latest git branch SHA from a repo and using it in the clone instruction). You can also periodically run the build with --no-cache=true to force updates.
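
    One way to get a similar effect without regenerating the Dockerfile is to pass the commit SHA in as a build argument. This is a swapped-in variant of the idea above, not the generation approach itself, and the REPO_SHA name and the example.com repository URL are placeholders invented for the sketch.

    ```dockerfile
    FROM alpine:3.20
    RUN apk add --no-cache git

    # Hypothetical build argument; it must be supplied at build time, for example:
    #   docker build --build-arg REPO_SHA="$(git ls-remote https://example.com/repo.git refs/heads/main | cut -f1)" .
    ARG REPO_SHA

    # A RUN instruction that uses a build argument misses the cache when the
    # argument's value differs from the previous build, so a new SHA triggers
    # a fresh clone while an unchanged SHA keeps the cached layer.
    RUN git clone https://example.com/repo.git /src \
        && git -C /src checkout "$REPO_SHA"
    ```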
