Image upload storage strategies

天命终不由人 2020-12-04 07:01

When a user uploads an image to my site, the image goes through this process:

  • user uploads pic
  • store pic metadata in db, giving the image a unique id
7 answers
  •  隐瞒了意图╮
    2020-12-04 07:30

    I've answered a similar question before, but I can't find it; maybe the OP deleted the question...

    Anyway, Adam's solution seems to be the best so far, yet it isn't bulletproof: images/c/cf/ (or any other dir/subdir pair) could still contain up to 16^30 unique hashes, and at least 3 times more files if we count image extensions, far more than any regular file system can handle.
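
    The dir/subdir layout discussed above can be sketched roughly as follows. This is only an illustration, not Adam's actual code: the function name `sharded_path`, the `images` root, and the choice of MD5 (suggested by the 32-character hex names in this thread) are all assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(data: bytes, ext: str, root: str = "images") -> PurePosixPath:
    """Build root/<1st hex char>/<first 2 hex chars>/<hash>.<ext> (hypothetical helper)."""
    digest = hashlib.md5(data).hexdigest()  # 32 hex characters (assumed hash function)
    return PurePosixPath(root) / digest[0] / digest[:2] / f"{digest}.{ext}"


# The directory name fixes the first 2 of the 32 hex characters,
# so each leaf directory can still hold 16^30 distinct hash names.
REMAINING_NAMES_PER_DIR = 16 ** (32 - 2)
```

    Two hex characters give only 256 leaf directories, which is why the number of files per directory is still unbounded in practice.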

    AFAIK, SourceForge.net also uses this system for project repositories; for instance, the "fatfree" project would be placed at projects/f/fa/fatfree/. However, I believe they limit project names to 8 chars.


    I would store the image hash in the database along with a DATE / DATETIME / TIMESTAMP field indicating when the image was uploaded / processed and then place the image in a structure like this:

    images/
      2010/                                      - Year
        04/                                      - Month
          19/                                    - Day
            231c2ee287d639adda1cdb44c189ae93.png - Image Hash
    

    Or:

    images/
      2010/                                    - Year
        0419/                                  - Month & Day (12 * 31 = 372)
          231c2ee287d639adda1cdb44c189ae93.png - Image Hash
    

    Besides being more descriptive, this structure is enough to host hundreds of thousands of images per day (depending on your file system limits) for several thousand years. This is the way WordPress and others do it, and I think they got it right on this one.
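
    As a minimal sketch, both date-based layouts could be built from the stored hash and upload timestamp like this. The function name `dated_path`, its parameters, and the `images` root are assumptions made for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def dated_path(image_hash: str, ext: str, uploaded_at: datetime,
               root: str = "images", compact: bool = False) -> PurePosixPath:
    """Return images/YYYY/MM/DD/<hash>.<ext>, or images/YYYY/MMDD/<hash>.<ext> if compact."""
    year = f"{uploaded_at:%Y}"
    if compact:
        # Second layout: month and day share one directory level.
        date_part = PurePosixPath(year) / f"{uploaded_at:%m%d}"
    else:
        # First layout: one directory level each for year, month and day.
        date_part = PurePosixPath(year) / f"{uploaded_at:%m}" / f"{uploaded_at:%d}"
    return PurePosixPath(root) / date_part / f"{image_hash}.{ext}"


when = datetime(2010, 4, 19)
print(dated_path("231c2ee287d639adda1cdb44c189ae93", "png", when))
print(dated_path("231c2ee287d639adda1cdb44c189ae93", "png", when, compact=True))
```

    Because the path is derived entirely from two database columns, it never has to be stored separately and can be recomputed whenever the image is served.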

    Duplicate images can easily be found by querying the database for the hash, and then you'd just have to create symlinks to the existing file.
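
    One way the duplicate check might look, assuming a SQLite table named `images` with `hash` and `path` columns and a file system that supports symlinks. The table layout and the helper names `create_schema` and `store_image` are hypothetical.

```python
import os
import sqlite3


def create_schema(db: sqlite3.Connection) -> None:
    """Create the hypothetical metadata table used by this sketch."""
    db.execute("CREATE TABLE IF NOT EXISTS images ("
               "id INTEGER PRIMARY KEY, hash TEXT NOT NULL, path TEXT NOT NULL)")


def store_image(db: sqlite3.Connection, root: str, image_hash: str,
                ext: str, day_dir: str, data: bytes) -> str:
    """Write the file under root/day_dir, or symlink to the first stored copy of the same hash."""
    target_dir = os.path.join(root, day_dir)
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, f"{image_hash}.{ext}")

    # The row with the lowest id is the first upload, i.e. the real file.
    row = db.execute("SELECT path FROM images WHERE hash = ? ORDER BY id LIMIT 1",
                     (image_hash,)).fetchone()
    if row is None:
        with open(path, "wb") as fh:  # first time this hash is seen: store the bytes
            fh.write(data)
    elif not os.path.lexists(path):
        # Duplicate uploaded on another day: link to the original instead of copying.
        os.symlink(os.path.relpath(row[0], target_dir), path)

    db.execute("INSERT INTO images (hash, path) VALUES (?, ?)", (image_hash, path))
    db.commit()
    return path
```

    The symlink is relative so that the whole images/ tree can be moved without breaking links; an index on the `hash` column would keep the lookup cheap as the table grows.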

    Of course, if this is not enough for you, you can always add more subdirs (hours, minutes, ...).

    Personally I wouldn't use user IDs unless you don't have that info available in your database, because:

    1. Disclosure of usernames in the URL
    2. Usernames are volatile (you may be able to rename folders, but still...)
    3. A user can hypothetically upload a large number of images
    4. Serves no purpose (?)

    Regarding the CDN, I don't see any reason this scheme (or any other) wouldn't work...
