I've been reading up on PHP file upload security and a few articles have recommended renaming the files. For example, the OWASP article Unrestricted File Upload says:
When I upload files, I use PHP's uniqid() function to generate the filename that is stored on the server. I preserve the file extension, since that makes it easier for me when I am looking at the files in the storage directory via the local file system.
I save the file outside of the web root (i.e. you can never browse directly to the files).
I always use PHP's move_uploaded_file() function to save the file to the server.
I store the original filename, the path/filename where it is stored, and any other project-related information you might need (who uploaded it, etc.) in a database.
In some of my implementations I also create a hash of the file contents and save that in the database too. Then, when other files are uploaded, I look in the database to see whether I already have a copy of that exact file stored.
Some code examples:
The form:
<form method="post" enctype="multipart/form-data" action="your_form_handler.php">
<input type="file" name="file1" value="" />
<input type="submit" name="b1" value="Upload File" />
</form>
The form handler:
<?php
// pass the file input name used in the form and any other pertinent info to store in the db, username in this example
_process_uploaded_file('file1', 'jsmith');
exit;

function _process_uploaded_file($file_key, $username = 'guest') {
    if (array_key_exists($file_key, $_FILES)) {
        $file = $_FILES[$file_key];
        if ($file['size'] > 0) {
            $data_storage_path = '/path/to/file/storage/directory/';
            $original_filename = $file['name'];
            // pathinfo() also copes with names that contain no dot (strripos() returns false for those)
            $file_basename = pathinfo($original_filename, PATHINFO_FILENAME); // strip extension
            $file_ext = pathinfo($original_filename, PATHINFO_EXTENSION); // extension without the dot, '' if none
            $file_md5_hash = md5_file($file['tmp_name']);
            $stored_filename = uniqid();
            if ($file_ext !== '') {
                $stored_filename .= '.' . $file_ext;
            }
            if (!move_uploaded_file($file['tmp_name'], $data_storage_path . $stored_filename)) {
                // unable to move, check error_log for details
                return 0;
            }
            // insert a record into your db using your own mechanism ...
            // $statement = "INSERT into yourtable (original_filename, stored_filename, file_md5_hash, username, activity_date) VALUES (?, ?, ?, ?, NOW())";
            // success, all done
            return 1;
        }
    }
    return 0;
}
?>
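The duplicate check described earlier (hash the contents, then look for an existing copy) could be sketched roughly like this. The function name find_or_register_file() and the in-memory $index array are hypothetical stand-ins for a database lookup, and I've used SHA-256 via hash_file() here rather than the md5_file() call in the handler; either works for the idea.

```php
<?php
// Hypothetical in-memory index: content hash => stored filename.
// In a real application this would be a query against the database table.
function find_or_register_file(array &$index, string $tmp_path, string $stored_filename): string
{
    $hash = hash_file('sha256', $tmp_path);
    if ($hash === false) {
        throw new RuntimeException("Unable to hash $tmp_path");
    }
    if (isset($index[$hash])) {
        // The same bytes are already stored: reuse the existing copy.
        return $index[$hash];
    }
    // First time these bytes are seen: remember where they are stored.
    $index[$hash] = $stored_filename;
    return $stored_filename;
}

// Usage: two temp files with identical contents resolve to the first stored name.
$index = [];
$a = tempnam(sys_get_temp_dir(), 'up');
$b = tempnam(sys_get_temp_dir(), 'up');
file_put_contents($a, 'same bytes');
file_put_contents($b, 'same bytes');

$first  = find_or_register_file($index, $a, 'aaa.doc');
$second = find_or_register_file($index, $b, 'bbb.doc');

unlink($a);
unlink($b);
```

When the second call returns an existing name, the handler can skip move_uploaded_file() and just add a database row pointing at the copy that is already stored.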
The program to handle download requests:
<?php
// Do all necessary security checks etc. to make sure the user is allowed to download the file.
//
$file = '/path/to/your/storage/directory/' . 'the_stored_filename';
$filesize = filesize($file);
header('Content-Description: File Transfer');
// application/octet-stream is the registered generic binary type
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="filename_to_display.example"');
header('Content-Transfer-Encoding: binary');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . $filesize);
// discard any buffered output so it does not end up in the download
if (ob_get_level() > 0) {
    ob_clean();
}
flush();
readfile($file);
exit;
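Since the name shown to the user normally comes from the original filename stored in the database, it is worth cleaning it before it goes into the Content-Disposition header. Here is a minimal sketch; content_disposition_attachment() is a hypothetical helper of my own, and the fallback rules (underscore for non-ASCII bytes, "download" for an empty name) are assumptions, not anything PHP mandates.

```php
<?php
// Build a Content-Disposition value from a user-supplied display name.
// Hypothetical helper; the fallback rules are assumptions for illustration.
function content_disposition_attachment(string $display_name): string
{
    // Drop any path component, then strip control characters
    // (CR/LF in a header value would allow header injection).
    $name = basename(str_replace('\\', '/', $display_name));
    $name = preg_replace('/[\x00-\x1F\x7F]/', '', $name);
    if ($name === '') {
        $name = 'download';
    }

    // ASCII-only fallback for the quoted filename parameter.
    $ascii = preg_replace('/[^\x20-\x7E]/', '_', $name);
    $ascii = str_replace(['"', '\\'], '_', $ascii);

    // The filename* parameter carries the full name, percent-encoded as UTF-8.
    return 'attachment; filename="' . $ascii . '"; filename*=UTF-8\'\'' . rawurlencode($name);
}

$header = content_disposition_attachment('Cake Recipe.doc');
// In the download script: header('Content-Disposition: ' . $header);
```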
If you want to present the download in the same page that the user is requesting it from then look at my answer to this post: Dowloading multiple PDF files from javascript
To your primary question, whether it is good practice to rename files, the answer is a definite yes, especially if you are creating a form of file repository where users upload files (and filenames) of their choosing, for several reasons:
Cake Recipe.doc is not a URL-safe name, and on some systems (either server or browser side) and in some situations it can cause inconsistencies when the name should be a urlencoded value.
As for storing the information, you would typically do this in a database, which is no different from the need you have already, since you need a way to refer back to the file (who uploaded it, what its name is, occasionally where it is stored, the time of upload, sometimes the size). You're simply adding the actual stored name of the file alongside the user's name for the file.
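To make the URL-safety point about Cake Recipe.doc concrete, here is a small sketch using PHP's built-in rawurlencode() and urlencode(). The $stored value is just a made-up example of a generated name.

```php
<?php
$original = 'Cake Recipe.doc';

// rawurlencode() percent-encodes the space (suitable for URL paths).
$encoded = rawurlencode($original);   // "Cake%20Recipe.doc"

// urlencode() uses "+" for the space (form/query-string encoding),
// which is one source of the inconsistencies mentioned above.
$form_encoded = urlencode($original); // "Cake+Recipe.doc"

// A generated name made of digits, hex characters and an underscore
// comes through rawurlencode() unchanged, so it needs no special handling.
$stored = '20240101_' . md5('example');
$stored_is_url_safe = (rawurlencode($stored) === $stored);
```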
The OWASP recommendation isn't a bad one: using the filename and a timestamp (not just the date) would be mostly unique. I take it a step further and include the microtime with the timestamp, and often some other unique bit of information, so that duplicate uploads of a small file in the same timeframe can't collide. I also store the date of the upload, which is additional insurance against md5 clashes, which become more probable in systems that store many files over many years. It is incredibly unlikely that you would generate two identical md5s, using filename and microtime, on the same day. An example would be:
$filename = date('Ymd') . '_' . md5($uploaded_filename . microtime());
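If you would rather not depend on the clock at all, a variation on the same idea is to swap the md5 of filename plus microtime for random bytes. This is only a sketch: generate_stored_filename() is a hypothetical helper, it assumes PHP 7 or later for random_bytes(), and keeping a lowercased, alphanumeric-only extension is my own choice.

```php
<?php
// Hypothetical helper: date prefix plus 128 random bits,
// keeping a lowercased, alphanumeric-only extension if there is one.
function generate_stored_filename(string $uploaded_filename): string
{
    $ext = strtolower(pathinfo($uploaded_filename, PATHINFO_EXTENSION));
    $ext = preg_replace('/[^a-z0-9]/', '', $ext);

    // random_bytes() draws from the OS CSPRNG, so two uploads of the same
    // file in the same microsecond still get different names.
    $name = date('Ymd') . '_' . bin2hex(random_bytes(16));

    return $ext === '' ? $name : $name . '.' . $ext;
}

$one = generate_stored_filename('Cake Recipe.doc');
$two = generate_stored_filename('Cake Recipe.doc');
```

The date prefix is kept so the stored names still sort by upload day, as in the md5 version above.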
My 2 cents.
There is a good reason to rename uploaded files: if two users upload the same file, or files with the same name, the later file will replace the earlier one, which is not desirable.
You can use a hashing algorithm, for example:
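// $file_name holds the original uploaded filename (hyphens are not valid in PHP variable names)
$extensions = explode('.', $file_name);
$ext = $extensions[count($extensions) - 1];
$file_name = md5($file_name . $_SERVER['REMOTE_ADDR']) . '.' . $ext;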
Then you can save the original filename, the hashed filename, the uploader's details, and the date and time to keep track of the files.