How are ETags generated and configured?

無奈伤痛 2020-12-22 05:58

I recently came across the concept of the ETag HTTP header. But I still have a problem: for a particular HTTP resource, who is responsible to generate ET

2 answers
  •  余生分开走
    2020-12-22 06:33

    An overview of typical algorithms used in web servers. Consider a file with:

    • Size 1047 bytes, i.e. 417 in hex.
    • MTime, i.e. last modification time, of Mon, 06 Jan 2020 12:54:56 GMT, which is 1578315296 seconds in Unix time, or 1578315296666771 microseconds (1578315296666771000 nanoseconds).
    • Inode, the file's index number on the filesystem, of 66, i.e. 42 in hex.

    Different web servers return ETags like these:

    • Nginx: "5e132e20-417", i.e. "hex(MTime)-hex(Size)". Not configurable.
    • BusyBox httpd: the same as Nginx.
    • Apache/2.2: "42-417-59b782a99f493", i.e. "hex(INode)-hex(Size)-hex(MTime in microseconds)" (59b782a99f493 is 1578315296666771). Can be configured, but MTime will be in microseconds regardless.
    • Apache/2.4: "417-59b782a99f493", i.e. "hex(Size)-hex(MTime in microseconds)", without the INode. This is friendlier to load balancing, where identical files have different INodes on different servers.
    • OpenWrt uhttpd: "42-417-5e132e20", i.e. "hex(INode)-hex(Size)-hex(MTime)". Not configurable.
    • Tomcat 9: W/"1047-1578315296666", i.e. a weak ETag of the form "Size-MTime in milliseconds". Marking it weak is incorrect: for a static file the ETag should be strong, because the content is byte-for-byte (octet) identical.
    • lighttpd: the weirdest one: "hashcode(42-1047-1578315296666771000)", i.e. INode-Size-MTime reduced to a single integer by a hash. Can be configured, but you can only disable individual parts (e.g. etag.use-inode = "disable").
    • MS IIS: has the form Filetimestamp:ChangeNumber, e.g. "53dbd5819f62d61:0". Not documented and not configurable, but can be disabled.
    • Jetty: based on last-modified time and size, hashed. See Resource.getWeakETag().
    • Kitura (Swift): "W/hex(Size)-hex(MTime)". See StaticFileServer.calculateETag.
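
    The simpler formats in the list above can be reproduced with a few snprintf calls. This is only a sketch: the function names are hypothetical and are not taken from any of the servers' source code. It assumes the Apache value is the modification time in microseconds, since 59b782a99f493 decodes to 1578315296666771. The last function checks the helpers against the values quoted in the list.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers: each writes a quoted ETag into buf (of n bytes)
 * and returns whatever snprintf returns. */

/* Nginx / BusyBox httpd style: "hex(MTime)-hex(Size)", MTime in seconds. */
int etag_nginx(char *buf, size_t n, uint64_t mtime_sec, uint64_t size) {
    return snprintf(buf, n, "\"%" PRIx64 "-%" PRIx64 "\"", mtime_sec, size);
}

/* Apache 2.4 style: "hex(Size)-hex(MTime)", MTime in microseconds. */
int etag_apache24(char *buf, size_t n, uint64_t size, uint64_t mtime_usec) {
    return snprintf(buf, n, "\"%" PRIx64 "-%" PRIx64 "\"", size, mtime_usec);
}

/* uhttpd style: "hex(INode)-hex(Size)-hex(MTime)", MTime in seconds. */
int etag_uhttpd(char *buf, size_t n, uint64_t inode, uint64_t size,
                uint64_t mtime_sec) {
    return snprintf(buf, n, "\"%" PRIx64 "-%" PRIx64 "-%" PRIx64 "\"",
                    inode, size, mtime_sec);
}

/* Tomcat 9 style: W/"Size-MTime", decimal, MTime in milliseconds. */
int etag_tomcat(char *buf, size_t n, uint64_t size, uint64_t mtime_msec) {
    return snprintf(buf, n, "W/\"%" PRIu64 "-%" PRIu64 "\"", size, mtime_msec);
}

/* Compares the helpers with the example values quoted in the list. */
void check_listed_examples(void) {
    char buf[64];
    const uint64_t inode = 66, size = 1047;
    const uint64_t sec = UINT64_C(1578315296);
    const uint64_t usec = UINT64_C(1578315296666771);

    etag_nginx(buf, sizeof buf, sec, size);
    assert(strcmp(buf, "\"5e132e20-417\"") == 0);
    etag_apache24(buf, sizeof buf, size, usec);
    assert(strcmp(buf, "\"417-59b782a99f493\"") == 0);
    etag_uhttpd(buf, sizeof buf, inode, size, sec);
    assert(strcmp(buf, "\"42-417-5e132e20\"") == 0);
    etag_tomcat(buf, sizeof buf, size, usec / 1000);
    assert(strcmp(buf, "W/\"1047-1578315296666\"") == 0);
}
```

    The lighttpd, IIS and Jetty variants are left out because their hashing or change-number details are not described here.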

    A few thoughts:

    • Hex numbers are used so often because converting a number to a hex string is cheap and the result is shorter than decimal.
    • The inode adds more guarantees, but it makes load balancing impossible, and it is fragile: the ETag changes if you simply copy the file during an application redeploy. MTime with sub-second precision is not available on all platforms, and such granularity is not needed.
    • Apache has a bug report about this: https://bz.apache.org/bugzilla/show_bug.cgi?id=55573
    • The order (MTime-Size vs. Size-MTime) also matters: MTime is more likely to change, so putting it first lets a string comparison of two ETags fail earlier, saving perhaps a dozen CPU cycles.
    • Even though this is not a full checksum, it is definitely not a weak ETag. It is enough to signal that we expect byte-for-byte (octet) equality, which Range requests need.
    • Apache and Nginx together serve almost all traffic on the Internet, but most static files are served by Nginx, where the ETag is not configurable.
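
    To illustrate why the weak/strong distinction matters for Range requests: HTTP defines two ways to compare entity-tags. Strong comparison requires that neither tag is weak and that both match character by character. Weak comparison ignores the W/ prefix. If-Range uses strong comparison, so a tag marked W/ never matches there. A minimal sketch follows, with hypothetical function names and assuming well-formed, NUL-terminated tags:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if the entity-tag carries the weak prefix W/. */
static bool etag_is_weak(const char *etag) {
    assert(etag != NULL);
    return strncmp(etag, "W/", 2) == 0;
}

/* Skips the W/ prefix, if present, and returns the quoted opaque-tag. */
static const char *etag_opaque(const char *etag) {
    return etag_is_weak(etag) ? etag + 2 : etag;
}

/* Strong comparison: neither tag is weak and they are identical. */
bool etag_strong_match(const char *a, const char *b) {
    return !etag_is_weak(a) && !etag_is_weak(b) && strcmp(a, b) == 0;
}

/* Weak comparison: the opaque-tags are identical; weak flags are ignored. */
bool etag_weak_match(const char *a, const char *b) {
    return strcmp(etag_opaque(a), etag_opaque(b)) == 0;
}
```

    With these definitions, a Tomcat-style tag such as W/"1047-1578315296666" fails strong comparison even against itself, while an Nginx-style "5e132e20-417" passes.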

    It looks like Nginx uses the most reasonable scheme, so if you are implementing your own, try to make it the same. The whole ETag can be generated in C with one line:

    /* needs <inttypes.h> and <stdio.h>; last_mod and file_size are uint64_t */
    printf("\"%" PRIx64 "-%" PRIx64 "\"", last_mod, file_size);
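
    The one-liner assumes last_mod and file_size are already known. As a rough sketch of where they could come from on a POSIX system, the function below reads them with stat() and writes the same "hex(MTime)-hex(Size)" form into a caller-supplied buffer. The name file_etag and its return convention are made up for illustration. This is not Nginx's actual code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Writes an Nginx-style ETag for the file at path into buf (n bytes).
 * Returns 0 on success, -1 if stat() fails or buf is too small. */
int file_etag(const char *path, char *buf, size_t n) {
    struct stat st;

    assert(path != NULL && buf != NULL);
    if (stat(path, &st) != 0)
        return -1;

    /* st_mtime is the modification time in whole seconds. */
    int len = snprintf(buf, n, "\"%" PRIx64 "-%" PRIx64 "\"",
                       (uint64_t)st.st_mtime, (uint64_t)st.st_size);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}
```

    A 64-byte buffer is more than enough here: two 64-bit values take at most 16 hex digits each, plus the dash, two quotes and the terminating NUL.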
    

    My proposal is to take the Nginx scheme and make it the ETag algorithm recommended by the W3C.
