Why randomize your file names for cloud storage/CDN?

我们两清 提交于 2019-12-03 08:35:34
Eric Hammond

One reason I cryptographically scramble identifiers in public URLs is so that the business' rate of growth is not always public.

If the current ids can be deduced simply by creating a new user account or uploading an image, then an outside person can calculate the growth rate (or an upper limit) by doing this on a regular basis and seeing how many ids were used during the elapsed time.

Whether it's stagnating or whether it's exploding exponentially, I want to be able to control the release of this information instead of letting competitors or business analysts be able to deduce it for themselves.

Offline examples of this are invoice and check numbers. If you get billed by or paid by a company on a regular basis, then you can see how many invoices or checks they write in that time period.

Here's a CPAN (Perl) module I maintain that scrambles 32-bit ids using two way encryption based on SkipJack:

http://metacpan.org/pod/Crypt::Skip32

It's a direct translation of the Skip32 algorithm written in C by Greg Rose:

http://www.qualcomm.com.au/PublicationsDocs/skip32.c

Use of this approach maps each 32-bit id into an (effectively random) corresponding 32-bit number which can be reversed back into the original id. You don't have to save anything extra in your database.

I convert the scrambled id into 8 hex digits for displaying in URLs.

Once your ids approach 4.29 billion (32-bits) you'll need to plan for extending the URL structure to support more, but I like having shorter URLs for as long as possible.

Changing URLs is a safe way to invalidate outdated assets.

It is also a necessity if you want to allow users storing private images. Using a path deductible from the users account name/id/path would render privacy settings useless as soon as you store assets on a CDN.

Mainly, it prevents name collisions. More than one person might upload "IMG_0001.JPG", for example. You also avoid limits on the number of files in one directory, and you can shard images across multiple servers - there's no way a huge site like Twitter or Facebook could store all photos on one server, no matter how large.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!