find string inside a gzipped file in a folder

野性不改 2020-12-13 17:26

My current problem is that I have around 10 folders, each containing gzipped files (about 5 on average). That makes roughly 50 files to open and look through.

Is there

7 answers
  •  挽巷 (original poster)
    2020-12-13 17:50

    Coming in a bit late on this, but I had a similar problem and was able to resolve it using:

    zcat -r /some/dir/here | grep "blah"
    

    As detailed here:

    http://manpages.ubuntu.com/manpages/quantal/man1/gzip.1.html

    However, this does not show which file a match came from. grep reports "(standard input)" instead, because the data arrives through a pipe. zcat does not seem to support outputting a name either.

    In terms of performance, this is what we got:

    $ alias dropcache="sync && echo 3 > /proc/sys/vm/drop_caches"
    
    $ find 09/01 | wc -l
    4208
    
    $ du -chs 09/01
    24M
    
    $ dropcache; time zcat -r 09/01 > /dev/null
    real    0m3.561s
    
    $ dropcache; time find 09/01 -iname '*.txt.gz' -exec zcat '{}' \; > /dev/null
    real    0m38.041s
    

    As you can see, the find -exec zcat method is roughly ten times slower than zcat -r (about 38 s versus 3.6 s), even on a small volume of files, most likely because it starts a separate zcat process for every file. I was also unable to make zcat print the file name on every line (-v will apparently output the filename, but not on each line). It would appear that there isn't currently a tool that provides both this speed and per-match file names the way grep does with its -H option.

    If you need to identify the name of the file that the result belongs to, then you'll need to either write your own tool (could be done in 50 lines of Python code) or use the slower method. If you do not need to identify the name, then use zcat -r.
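
    As a rough idea of what such a tool might look like, here is a minimal sketch. It walks a directory in a single process, decompresses each file with the standard gzip module, and reports the file name and line number for every match. The function name grep_gz, the ".gz" suffix filter, and the UTF-8 decoding are assumptions made for illustration; it has not been benchmarked against the numbers above.

```python
import gzip
import os
import re
import sys


def grep_gz(root, pattern, suffix=".gz"):
    """Yield (path, line_number, line) for each matching line under root.

    grep_gz is a hypothetical helper name; pattern is a Python regular
    expression, not a grep-style basic regex.
    """
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue  # only look at gzipped files
            path = os.path.join(dirpath, name)
            try:
                # Assumption: files are text; undecodable bytes are replaced.
                with gzip.open(path, "rt", encoding="utf-8",
                               errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if regex.search(line):
                            yield path, lineno, line.rstrip("\n")
            except (OSError, EOFError) as exc:
                # Corrupt or truncated archives are reported and skipped.
                print(f"{path}: skipped ({exc})", file=sys.stderr)


# Hypothetical invocation: python gzgrep.py PATTERN DIRECTORY
if __name__ == "__main__" and len(sys.argv) == 3:
    for path, lineno, line in grep_gz(sys.argv[2], sys.argv[1]):
        print(f"{path}:{lineno}:{line}")
```

    Saved under a name of your choosing (gzgrep.py here is just a placeholder), it could be run as python gzgrep.py "blah" /some/dir/here, printing lines in the same path:line:text shape that grep -Hn produces.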

    Hope this helps
