There are no standard conventions, but there are a couple of best practices:
Organizing your files into (User and/or Date) Aware Folders
Something like:
/uploads/USER/ or
/uploads/[USER/]YEAR/[MONTH/[DAY/[HOUR/[MINUTE/]]]]
This will have some benefits:
- organize files per user and/or date
- make it harder to reach the maximum number of files per directory
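As a rough illustration of the layout above, here is a minimal sketch that builds such a path. The function name upload_dir() and its parameters are hypothetical (not from any library), and the user name is reduced to a conservative set of characters as an assumption:

```php
<?php
// Hypothetical helper: builds /uploads/[USER/]YEAR/MONTH/ style paths.
function upload_dir(string $base, ?string $user, DateTimeInterface $when, string $dateFormat = 'Y/m'): string
{
    $parts = [rtrim($base, '/')];

    if ($user !== null && $user !== '') {
        // Assumption: keep only chars that are safe in a directory name.
        $parts[] = preg_replace('~[^0-9a-z_-]+~i', '_', $user);
    }

    // 'Y/m' gives YEAR/MONTH; 'Y/m/d/H/i' would go down to the minute.
    $parts[] = $when->format($dateFormat);

    return implode('/', $parts) . '/';
}

$dir = upload_dir('/uploads', 'kate', new DateTimeImmutable('2011-05-02 17:33:31'));
```

Before moving a file there, you would still need to create the folders, for example with mkdir($dir, 0755, true) when is_dir($dir) is false. How deep the date part goes depends on how many uploads you expect.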
(Not) Renaming / Sanitizing Filenames
Renaming or not is a choice you will have to make, depending on your website, user base, how obscure you would like to be and, obviously, your architecture. Would you prefer to have a file named kate_at_the_beach.jpg or 1304357611.jpg? This is really up to you to decide, but search engines (obviously) like the first one better.
One thing you should always do is sanitize and normalize the filenames. Personally, I would only allow the following chars: 0-9, a-z, A-Z, _, - and . (dot). If you choose this sanitization alphabet, normalization basically just means converting the filename to either lower or upper case (to avoid losing files if, for instance, you switch from a case-sensitive file system to a case-insensitive one, like Windows).
Here is some sample code I use in phunction (shameless plug, I know :P):
$filename = '/etc/hosts/@Álix Ãxel likes - beer?!.jpg';
$filename = Slug($filename, '_', '.'); // etc_hosts_alix_axel_likes_beer_.jpg

// replaces every run of chars outside 0-9, a-z and $extra with $slug, then lowercases
function Slug($string, $slug = '-', $extra = null)
{
    return strtolower(trim(preg_replace('~[^0-9a-z' . preg_quote((string) $extra, '~') . ']+~i', $slug, Unaccent($string)), $slug));
}

function Unaccent($string) // normalizes (romanization) accented chars
{
    if (strpos($string = htmlentities($string, ENT_QUOTES, 'UTF-8'), '&') !== false)
    {
        $string = html_entity_decode(preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|tilde|uml);~i', '$1', $string), ENT_QUOTES, 'UTF-8');
    }

    return $string;
}
Handling Duplicate Filenames
As the documentation entry on move_uploaded_file() states:
If the destination file already exists, it will be overwritten.
So, before you call move_uploaded_file(), you had better check whether the file already exists. If it does (and you don't want to lose your older file), you should rename your new file, usually by inserting a time / random / unique token before the file extension, doing something like this:
if (file_exists($output . $filename) === true)
{
    $token = '_' . time(); // see below
    $filename = substr_replace($filename, $token, strrpos($filename, '.'), 0);
}

move_uploaded_file($_FILES[$input]['tmp_name'], $output . $filename);
This will have the effect of inserting the $token before the file extension, like I stated above. As for the choice of the $token value, you have several options:
- time() - gives you a different value every second, but sucks at handling duplicate files (identical content just gets stored again under a new name)
- random - not a very good idea, since it doesn't ensure uniqueness and doesn't handle duplicates
- unique - using a hash of the file contents is my favorite approach, since identical content always gets the same token, which saves you disk space: you'll have at most 2 identical files (one with the original filename and another one with the hash appended). Sample code:
$token = '_' . md5_file($_FILES[$input]['tmp_name']);
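To show how the hash token keeps you at no more than 2 identical files, here is a minimal sketch that puts the pieces together. The function name dedupe_filename() is hypothetical, and it assumes $outputDir ends with a slash; it also appends the token when the filename has no extension, which the snippet above does not handle:

```php
<?php
// Hypothetical helper: returns the filename to store an upload under,
// or null when the same content was already saved with the hash token.
function dedupe_filename(string $tmpPath, string $outputDir, string $filename): ?string
{
    if (!file_exists($outputDir . $filename)) {
        return $filename; // the name is free, keep it as-is
    }

    $token = '_' . md5_file($tmpPath);
    $dot = strrpos($filename, '.');

    // Insert the token before the extension, or append it if there is none.
    $hashed = ($dot === false)
        ? $filename . $token
        : substr_replace($filename, $token, $dot, 0);

    // Same name + same hash already on disk: skip storing another copy.
    return file_exists($outputDir . $hashed) ? null : $hashed;
}

// Usage (inside an upload handler, using the variables from above):
// $name = dedupe_filename($_FILES[$input]['tmp_name'], $output, $filename);
// if ($name !== null) {
//     move_uploaded_file($_FILES[$input]['tmp_name'], $output . $name);
// }
```

Note that the check and the move are two separate steps, so two simultaneous uploads could still race; treat this as a starting point rather than a complete solution.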
Hope it helps! ;)