Images in database vs file system

甜味超标 2020-11-29 23:23

We have a project coming up where we will be building a whole backend CMS system that will power our entire extranet and intranet with one package. The question I have been

10 Answers
  • 2020-11-29 23:29

    Replication of static files, especially across a number of servers, can be difficult to manage. It really comes down to a tradeoff between managing, monitoring and debugging replication problems vs. the database size and load.

    I think I'd probably pick the database approach, and if load became an issue look at putting up some sort of cache layer around the image calls.

    Suggestions to store a path in the db are missing the real problem, which is replicating this across multiple machines.

  • 2020-11-29 23:30

    There are valid concerns on either side of the debate, so always give your requirements. How much data, how many images, how large?

    Inline / BLOB storage

    Upside: simplifies architecture and implementation, and simplifies backup, recovery and migration of the system; just do a dump, backup or export (whatever the term is for your flavor of DB) and move it to the new database. Version control and consistency are handled by the DB, which allows for point-in-time recovery. Security and access control are also cleaner, since access to an image BLOB is intrinsic to access to the overall row. Moving the image outside the DB and letting the HTTP server serve it, while better for concurrency and scalability, makes it harder to ensure people cannot hack URLs and request images they don't own. If you do house them outside the DB, make sure your security policy covers access control of images between users: either your HTTP server authentication has to integrate with the overall system's authentication, or the HTTP server program that serves up the images has to use some sort of session mechanism to ensure the HTTP request is valid. This is a very big concern in multi-tenant databases, and less of a concern in single-purpose, single-tenant systems with simple authentication.
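To make the access-control point concrete, here is a minimal sketch of the check an image-serving program would need when files live outside the DB. The `Session` class, the `IMAGE_INDEX` mapping and the paths are all hypothetical; in practice the ownership data would come from the database and the session from your authentication system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    user_id: int
    tenant_id: int


# Hypothetical metadata that would normally live in the database:
# image id -> (owning tenant id, path on disk)
IMAGE_INDEX = {
    101: (1, "/srv/images/t1/logo.png"),
    202: (2, "/srv/images/t2/invoice.png"),
}


def resolve_image_path(session: Optional[Session], image_id: int) -> Optional[str]:
    """Return the file path only if the session's tenant owns the image."""
    if session is None:
        return None  # unauthenticated request
    entry = IMAGE_INDEX.get(image_id)
    if entry is None:
        return None  # unknown image
    owner_tenant, path = entry
    if owner_tenant != session.tenant_id:
        # Same answer as "not found", so a guessed URL reveals nothing.
        return None
    return path
```

The caller would turn a `None` result into a 404 and otherwise stream the file, so a guessed URL for another tenant's image gets the same response as a missing one.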

    Downside: For really large databases, backup and recovery get frustrating, or even problematic and costly, because where you might otherwise have a small core dataset, you may now have many GB or TB of image data. Treating it all as one consistent database is good from an integrity point of view but bad for backups, unless you use a DBMS with enterprise-quality, data-warehouse-tuned backup and recovery (for example, Oracle RMAN and rolling backups).

    Always consider time to recovery in any system. If your storage requirements are modest (a few gigabytes, or even 50-100 GB) and you have plenty of backup space planned, inline storage is cleaner. Above that, separation of concerns and letting the filesystem do its job becomes a key advantage. Nothing is worse than trying to restore, recover and open a huge database for the sake of a small data error. Recovery time would be my biggest concern.
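A back-of-the-envelope way to compare recovery times is size divided by sustained restore throughput. The figures below (5 GB of core data, 500 GB of images, 100 MB/s) are assumptions picked purely for illustration, and the estimate ignores log replay, index rebuilds and verification.

```python
def restore_minutes(size_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate restore time in minutes for size_gb at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return (size_gb * 1024) / throughput_mb_per_s / 60


# Assumed figures, for illustration only.
CORE_GB = 5        # rows, indexes, everything except images
IMAGES_GB = 500    # image BLOBs stored inline
THROUGHPUT = 100   # MB/s sustained restore speed

core_only = restore_minutes(CORE_GB, THROUGHPUT)
with_images = restore_minutes(CORE_GB + IMAGES_GB, THROUGHPUT)

print(f"core data only:       {core_only:.1f} min")
print(f"core + inline images: {with_images:.1f} min")
```

Plug in your own sizes and a throughput you have actually measured; the point is only that inline images scale the restore of every small data error with the size of the image store.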

  • 2020-11-29 23:31

    Generally, persisting image data in the DB might not be as efficient as the FileSystem, as far as a CMS is concerned. At one time you probably just want to display the image statically, at other times you want that image to be available to your graphic designers for updates etc.

    Consider the processing overhead associated with retrieving the image each time you want to work with it.

    A few points why you should consider the FileSystem

    1. The browser does all the work, and you benefit from proxy caching of images etc
    2. As an offshoot of the above, you get to easily use Content Delivery Networks (CDN)
    3. Replication of image data is easy with tools like rsync etc
    4. Processing (i.e. CPU) time is drastically reduced
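The browser and proxy caching in point 1 rests on HTTP validators and cache headers, which a static file server normally handles for you. As a rough sketch of the exchange, the hypothetical `respond` function below answers a repeat request with 304 and no body; the one-day `max-age` is an assumed value, and real `If-None-Match` handling (lists, weak validators) is more involved.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(data: bytes) -> str:
    """Derive a quoted ETag from the file contents."""
    return '"' + hashlib.sha256(data).hexdigest()[:16] + '"'


def respond(data: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for an image request.

    Simplified: compares the If-None-Match header to a single strong ETag.
    """
    etag = make_etag(data)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=86400"}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is still valid, send no body
    return 200, headers, data


image = b"fake-image-bytes"
first = respond(image, None)                 # first visit: full body
second = respond(image, first[1]["ETag"])    # revalidation: 304, empty body
```

Serving the same bytes out of a database means your application code has to reproduce this behaviour itself.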
  • 2020-11-29 23:34

    Well, if your top two needs are integrity and replication, then the answer is definitely DB.

    Your other points, though:

    • Integrity - DB, that's why databases exist vs. flat file systems.

    • Replication - Not sure if you mean image replication, but if so, then obviously DB as you won't be load balancing this, surely.

    • Multiple resolutions can be generated from the DB image on demand; however, this adds processing costs. Also, the higher the resolution, the greater the size and the longer the network wait. Storing multiple resolutions trades space for speed.

    • Speed - Depending on access to the images, it could be negligible. If you are taking images across a file share, you'll have to wait on the network in any case and the network is pretty much always the bottleneck.

    • Overhead - Frankly, it depends on your definition of overhead and how you access the images.

    • Management - DB, hands down. Singular storage = one less worry, and you should always be running backups on the database in any case. File system backups over multiple servers are costly in many ways.
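On the multiple resolutions point, the space-for-speed trade could be sketched as a table of pre-rendered variants, so a request never resizes anything at read time. The `image_variants` table, the helper names and the widths are assumptions for illustration, and the byte strings stand in for real resized images.

```python
import sqlite3
from typing import Optional, Tuple

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per pre-rendered width of each image.
conn.execute(
    """CREATE TABLE image_variants (
           image_id INTEGER NOT NULL,
           width    INTEGER NOT NULL,
           data     BLOB    NOT NULL,
           PRIMARY KEY (image_id, width)
       )"""
)


def store_variant(image_id: int, width: int, data: bytes) -> None:
    """Save one pre-rendered variant (the resizing itself is out of scope here)."""
    conn.execute(
        "INSERT OR REPLACE INTO image_variants (image_id, width, data) VALUES (?, ?, ?)",
        (image_id, width, data),
    )
    conn.commit()


def best_variant(image_id: int, wanted_width: int) -> Optional[Tuple[int, bytes]]:
    """Return the smallest stored variant at least wanted_width wide,
    falling back to the largest one available."""
    row = conn.execute(
        "SELECT width, data FROM image_variants "
        "WHERE image_id = ? AND width >= ? ORDER BY width LIMIT 1",
        (image_id, wanted_width),
    ).fetchone()
    if row is None:
        row = conn.execute(
            "SELECT width, data FROM image_variants "
            "WHERE image_id = ? ORDER BY width DESC LIMIT 1",
            (image_id,),
        ).fetchone()
    return row


# Placeholder bytes stand in for real resized images.
for w in (160, 640, 1920):
    store_variant(7, w, f"image-7-at-{w}px".encode())
```

Every extra width costs storage and backup size, which is exactly the trade-off the bullet describes.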

  • 2020-11-29 23:39

    Your concerns break down into two camps. The following concerns favour storing documents in the database:

    • Data integrity
    • Data replication
    • Multiple resolutions
    • Data management and backup

    These concerns (probably) favour storing documents on the file system:

    • Speed of database vs file system
    • Overhead load of database vs file system

    So, decide what matters the most and choose accordingly.

  • 2020-11-29 23:39

    Assuming you are in a Windows environment, there is no great reason to use the file system. You may want to be careful about how you store the images in the tables to avoid unwanted page splits, but that's a performance tweak, not a huge issue.

    Downsides to filesystem

    -Not automatically replicated

    -May complicate your replication by having different physical locations for every instance

    -Slow with very large numbers of files

    Upside to the filesystem

    -If you're storing a few very large files, it will perform a bit better.
