Am trying to grep pattern from dozen files .tar.gz but its very slow
am using
tar -ztf file.tar.gz | while read FILENAME
do
if tar -zxf file
For starters, you could start more than one process:
tar -ztf file.tar.gz | while read FILENAME
do
(if tar -zxf file.tar.gz "$FILENAME" -O | grep -l "string"
then
echo "$FILENAME contains string"
fi) &
done
The ( ... ) &
creates a new detached (read: the parent shell does not wait for the child)
process.
After that, you should optimize the extracting of your archive. The read is no problem, as the OS should have cached the file access already. However, tar needs to unpack the archive every time the loop runs, which can be slow. Unpacking the archive once and iterating over the result may help here:
local tempPath=`tempfile`
mkdir $tempPath && tar -zxf file.tar.gz -C $tempPath &&
find $tempPath -type f | while read FILENAME
do
(if grep -l "string" "$FILENAME"
then
echo "$FILENAME contains string"
fi) &
done && rm -r $tempPath
find
is used here, to get a list of files in the target directory of tar
, which we're iterating over, for each file searching for a string.
Edit: Use grep -l
to speed up things, as Jim pointed out. From man grep
:
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would
normally have been printed. The scanning will stop on the first match. (-l is specified
by POSIX.)