I have a script which connects to a database and gets all records which satisfy the query. These record results are files present on a server, so now I have a text file which contains those file names, one per line. I want to get the size of each file (and the total size), the way du would report it.
In Perl, the -s filetest operator is probably what you want.
use strict;
use warnings;
use File::Copy;

my $folderpath  = 'the_path';
my $destination = 'path/to/destination/directory';

open my $IN, '<', 'path/to/infile' or die "Cannot open input file: $!";

my $total;
while (<$IN>) {
    chomp;
    my $size = -s "$folderpath/$_";
    print "$_ => $size\n";
    $total += $size;
    move("$folderpath/$_", "$destination/$_") or die "Error when moving: $!";
}
print "Total => $total\n";
Note that -s gives the size in bytes, not in blocks like du.
On further investigation, Perl's -s is equivalent to du -b. You should probably read the man page for your specific du to make sure that you are actually measuring what you intend to measure.
If you really want the du values, change the assignment to $size above to:
my ($size) = split(' ', `du "$folderpath/$_"`);
Eyeballing it, you can make your script work this way:
1) Delete the line filename=filename.replace(' ', '\ '). Escaping is more complicated than that; either quote the full path or use a Python library to escape it for the specific OS.
2) You are probably missing a delimiter between the path and the file name.
3) You need single quotes around the full path in the call to os.system.
This works for me:
#!/usr/bin/python
import os
folderpath='/Users/andrew/bin'
file=open('ft.txt','r')
for line in file:
    filename=line.strip()
    fullpath=folderpath+"/"+filename
    os.system('du -h '+"'"+fullpath+"'")
The file "ft.txt" has file names with no path and the path part is '/Users/andrew/bin'
. Some of the files have names that would need to be escaped, but that is taken care of with the single quotes around the file name.
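If a file name itself contains a single quote, the quoting above will still break the shell command. One way to sidestep shell quoting entirely is to pass the arguments as a list to subprocess, which hands them straight to du without involving a shell. This is just a sketch under the same assumptions (ft.txt in the current directory, files under /Users/andrew/bin):
#!/usr/bin/python
import subprocess
folderpath='/Users/andrew/bin'
file=open('ft.txt','r')
for line in file:
    filename=line.strip()
    fullpath=folderpath+"/"+filename
    # the arguments go directly to du, so no quoting or escaping is needed
    subprocess.call(['du','-h',fullpath])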
Either way, that runs du -h on each file in the .txt file but does not give you the total. Getting the total is fairly easy in Perl or Python. Here is a Python script (based on yours) to do that:
#!/usr/bin/python
import os
folderpath='/Users/andrew/bin/testdir'
file=open('/Users/andrew/bin/testdir/ft.txt','r')
blocks=0
i=0
template='%d total files in %d blocks using %d KB\n'
for line in file:
    i+=1
    filename=line.strip()
    fullpath=folderpath+"/"+filename
    if os.path.exists(fullpath):
        info=os.stat(fullpath)
        blocks+=info.st_blocks
        print str(info.st_blocks)+"\t"+fullpath
    else:
        print '"'+fullpath+'"'+" not found"
print str(blocks)+"\tTotal"
print " "+template % (i,blocks,blocks*512/1024)
Notice that you do not have to quote or escape the file name this time; os.stat takes the path directly, so no shell is involved and there is nothing to escape. This calculates file sizes using allocation blocks, the same way that du does it. If I run du -ahc against the same files that I have listed in ft.txt, I get the same number (well, kinda: du reports it as 25M and I report 24324 KB), but it counts the same number of blocks. (Side note: "blocks" are always reported in 512-byte units under Unix, even though the actual allocation block size on larger discs is usually bigger.)
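If you ever want both numbers side by side, os.stat exposes them on the same result; a minimal sketch (the path is only a placeholder):
import os
st = os.stat('/Users/andrew/bin/somefile')    # placeholder path
apparent = st.st_size            # size in bytes, what ls -l and du -b report
allocated = st.st_blocks * 512   # allocated bytes, what plain du counts (Unix only)
print apparent, allocated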
Finally, you may want to make your script read a list of files from the command line rather than hard-coding the file and the path in the script. Consider:
#!/usr/bin/python
import os, sys
total_blocks=0
total_files=0
template='%d total files in %d blocks using %d KB\n'
print
for arg in sys.argv[1:]:
    print "processing: "+arg
    blocks=0
    i=0
    # the directory of the list file is used as the path to the listed files
    abspath=os.path.abspath(arg)
    folderpath=os.path.dirname(abspath)
    file=open(arg,'r')
    for line in file:
        i+=1
        filename=line.strip()
        fullpath=folderpath+"/"+filename
        if os.path.exists(fullpath):
            info=os.stat(fullpath)
            blocks+=info.st_blocks
            print str(info.st_blocks)+"\t"+fullpath
        else:
            print '"'+fullpath+'"'+" not found"
    print "\t"+template % (i,blocks,blocks*512/1024)
    total_blocks+=blocks
    total_files+=i
print template % (total_files,total_blocks,total_blocks*512/1024)
You can then execute the script (after chmod +x [script_name].py) with ./script.py ft.txt; it will use the directory of the command-line file as the path to the files listed in "ft.txt". You can process multiple list files as well.
You can use the Python skeleton that you've sketched out and add os.path.getsize(fullpath) to get the size of each individual file.
For example, if you wanted a dictionary mapping file name to size you could use:
dict((f.strip(), os.path.getsize(f.strip())) for f in file)
Keep in mind that the result from os.path.getsize(...) is in bytes, so you'll have to convert it if you want other units.
In general, os.path is a key module for manipulating files and paths.
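Putting that together with the loop you already have, a sketch along those lines might look like this (the folder path and the ft.txt list file are assumptions carried over from the earlier examples; these are apparent sizes in bytes, not du-style blocks):
#!/usr/bin/python
import os
folderpath='/Users/andrew/bin'             # assumed location of the listed files
sizes={}
for line in open('ft.txt','r'):            # ft.txt: one file name per line
    name=line.strip()
    fullpath=os.path.join(folderpath,name)
    sizes[name]=os.path.getsize(fullpath)  # size in bytes
total=sum(sizes.values())
print "%d files, %d bytes (%.1f KB)" % (len(sizes),total,total/1024.0)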
You can do it in your shell script itself. You already have all the file names in your spooled file output.txt; since du does not read file names from standard input, pass them through xargs at the end of your existing script:
xargs -d '\n' du -ch < output.txt
The -c flag makes du print the size of each file and a grand total at the end, and -d '\n' (GNU xargs) keeps file names containing spaces intact.