How can I find all of the distinct file extensions in a folder hierarchy?

梦谈多话 2020-11-30 16:00

On a Linux machine I would like to traverse a folder hierarchy and get a list of all of the distinct file extensions within it.

What would be the best way to achieve this?

16 answers
  • 2020-11-30 16:40

    No need to pipe to sort; awk can do it all:

    find . -type f | awk -F. '!a[$NF]++{print $NF}'
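
    The `!a[$NF]++` idiom prints a value only the first time it is seen. As a rough illustration of that logic (not a replacement for the one-liner), here is a minimal Python sketch; the function name `distinct_last_fields` is made up for this example.

```python
def distinct_last_fields(paths):
    """Yield the text after the last '.' of each path, first occurrence only."""
    seen = set()
    for path in paths:
        field = path.split(".")[-1]   # like awk's $NF with -F.
        if field not in seen:         # like awk's !a[$NF]++
            seen.add(field)
            yield field
```

    For example, `list(distinct_last_fields(["./a/x.txt", "./b/y.txt", "./c/z.png"]))` gives `["txt", "png"]`. As with the awk command, a file with no dot in its name is not filtered out: `./README` yields `/README`, because the only dot is the leading one from `find .`.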
    
  • 2020-11-30 16:40

    My awk-less, sed-less, Perl-less, Python-less alternative (mostly portable; note that `rev` is not specified by POSIX, although it is widely available):

    find . -type f | rev | cut -d. -f1 | rev | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn
    

    The trick is that it reverses each line, cuts the first dot-separated field (the reversed extension), and reverses it back.
    It also converts the extensions to lower case.

    Example output:

       3689 jpg
       1036 png
        610 mp4
         90 webm
         90 mkv
         57 mov
         12 avi
         10 txt
          3 zip
          2 ogv
          1 xcf
          1 trashinfo
          1 sh
          1 m4v
          1 jpeg
          1 ini
          1 gqv
          1 gcs
          1 dv
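
    For comparison, the same counting idea can be sketched in Python. This is only an illustration under a few assumptions: the helper names `iter_files` and `count_extensions` are made up here, the code looks at basenames only, and it skips names without a dot (the pipeline above would instead count such a file under its whole path).

```python
import os
from collections import Counter

def iter_files(root):
    """Yield the path of every file that os.walk finds under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)

def count_extensions(paths):
    """Return (extension, count) pairs, lower-cased, most frequent first."""
    counts = Counter()
    for path in paths:
        name = os.path.basename(path)
        if "." in name:
            # Text after the last dot, like `rev | cut -d. -f1 | rev`.
            counts[name.rsplit(".", 1)[1].lower()] += 1
    return counts.most_common()
```

    Calling `count_extensions(iter_files("."))` would then produce pairs comparable to the listing above, with the count and the extension swapped.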
    
  • 2020-11-30 16:42

    Find everything with a dot and show only the suffix.

    find . -type f -name "*.*" | awk -F. '{print $NF}' | sort -u
    

    If you know that all suffixes have three characters, then:

    find . -type f -name "*.???" | awk -F. '{print $NF}' | sort -u
    

    Or, with sed, show all suffixes of one to four characters. Change {1,4} to the range of lengths you expect in the suffix.

    find . -type f | sed -n 's/.*\.\(.\{1,4\}\)$/\1/p'| sort -u
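
    To make the length filter easier to experiment with, here is a minimal Python sketch of the same regular expression. The function name `short_suffixes` and its `min_len`/`max_len` parameters are assumptions for this example; they stand in for the `{1,4}` range in the sed command.

```python
import re

def short_suffixes(paths, min_len=1, max_len=4):
    """Return the sorted distinct suffixes of min_len to max_len characters."""
    # Greedy ".*" makes the match prefer the last dot that leaves a short
    # enough tail, which is how the sed expression behaves.
    pattern = re.compile(r".*\.(.{%d,%d})" % (min_len, max_len))
    found = set()
    for path in paths:
        match = pattern.fullmatch(path)
        if match:
            found.add(match.group(1))
    return sorted(found)
```

    Note that the suffix pattern accepts any character, including `/`, so a short dotless name directly under `.` can slip through: `./ab` matches with the suffix `/ab`. The same holds for the sed expression.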
    
  • 2020-11-30 16:43

    Since there's already another solution which uses Perl:

    If you have Python installed you could also do (from the shell):

    python -c "import os;e=set();[[e.add(os.path.splitext(f)[-1]) for f in fn]for _,_,fn in os.walk('/home')];print('\n'.join(e))"
    
  • 2020-11-30 16:49

    None of the replies so far deal with filenames with newlines properly (except for ChristopheD's, which just came in as I was typing this). The following is not a shell one-liner, but works, and is reasonably fast.

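    import os
    import sys
    
    def names(roots):
        """Yield the basename of every file under each root directory."""
        for root in roots:
            for _dirpath, _dirnames, basenames in os.walk(root):
                for basename in basenames:
                    yield basename
    
    # A set removes duplicates; splitext returns '' for names without an extension.
    sufs = set(os.path.splitext(x)[1] for x in names(sys.argv[1:]))
    for suf in sufs:
        if suf:  # skip files that have no extension
            print(suf)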
    
  • 2020-11-30 16:50

    I don't think this one was mentioned yet:

    find . -type f -exec sh -c 'echo "${0##*.}"' {} \; | sort | uniq -c
    