I'm amazed at how well Docker's layer caching works, but I'm also wondering how it determines whether it may use a cached layer or not.
Let's take these build
It's because your package.json file has been modified; see the "Removing intermediate container" line in your build output.
That's also why the package manager's manifest files (the vendor/third-party dependency lists) are usually COPY'ed first during docker build. After that you run the package manager's install step, and only then do you add the rest of your application, i.e. src.
If your dependencies haven't changed, these steps are served from the build cache.
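A minimal sketch of that ordering for a Node.js project. The base image tag, the paths and the start command are assumptions for illustration, not taken from the question:

```dockerfile
# Hypothetical base image; use whatever your project targets.
FROM node:20
WORKDIR /app

# Copy only the dependency manifest first. The cache for this layer is
# invalidated only when package.json itself changes.
COPY package.json ./

# Served from the cache as long as the COPY above was a cache hit.
RUN npm install

# Application source changes often, so it comes last; editing files
# under src/ rebuilds only from this instruction onwards.
COPY src/ ./src/

# Hypothetical entry point.
CMD ["node", "src/index.js"]
```

With this layout, a change to a source file leaves the npm install layer cached, while a change to package.json reruns it.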
The build cache process is explained fairly thoroughly in the Dockerfile best practices build cache section.
Starting with a base image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.
In most cases simply comparing the instruction in the Dockerfile with one of the child images is sufficient. However, certain instructions require a little more examination and explanation.
For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. The last-modified and last-accessed times of the file(s) are not considered in these checksums. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.
Aside from the ADD and COPY commands, cache checking will not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command the files updated in the container will not be examined to determine if a cache hit exists. In that case just the command string itself will be used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands will generate new images and the cache will not be used.
You will run into situations where OS packages, NPM packages or a Git repo have been updated to newer versions (say, within a ~2.3 semver range in package.json), but because neither your Dockerfile nor your package.json has changed, Docker will continue using the cache.
It's possible to programmatically generate a Dockerfile that busts the cache by changing specific lines based on smarter checks (e.g. retrieving the latest commit SHA of a Git branch from a repo and using it in the clone instruction). You can also periodically run the build with --no-cache=true to force updates.
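One way to get a similar effect without regenerating the Dockerfile is a build argument, sketched below. This is a swapped-in variant of the idea, not the approach described above; the argument name REPO_SHA, the repository URL and the branch name are hypothetical placeholders:

```dockerfile
FROM alpine:3.19

# Placed before the ARG so this layer stays cached across commits.
RUN apk add --no-cache git

# Hypothetical build argument carrying the current branch head. Pass it with e.g.
#   docker build --build-arg REPO_SHA="$(git ls-remote https://example.com/repo.git refs/heads/main | cut -f1)" .
ARG REPO_SHA

# Using the argument ties this layer to its value: a new commit hash
# causes a cache miss here and for every instruction after it.
RUN echo "building at ${REPO_SHA}" \
    && git clone --branch main https://example.com/repo.git /src

# To ignore the cache for the whole build instead:
#   docker build --no-cache .
```

As long as the branch head is unchanged, the clone layer is still served from the cache; a new commit changes the argument value and forces a fresh clone.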