How can I find all of the distinct file extensions in a folder hierarchy?

On a Linux machine I would like to traverse a folder hierarchy and get a list of all of the distinct file extensions within it.

What would be the best way to achieve this?

16 Answers

清歌不尽

2020-11-30 16:40
No need for the pipe to sort, awk can do it all:
```
find . -type f | awk -F. '!a[$NF]++{print $NF}'
```
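
The `!a[$NF]++` idiom prints the last dot-separated field only the first time it is seen. A minimal Python sketch of the same logic, for readers less familiar with awk (the function name `first_seen_suffixes` and the sample paths are made up for illustration):

```python
def first_seen_suffixes(paths):
    """Yield the text after the last '.' of each path, once, in first-seen order."""
    seen = set()
    for path in paths:
        # Equivalent of awk's $NF with -F. : everything after the last dot.
        suffix = path.rsplit(".", 1)[-1]
        # Equivalent of !a[$NF]++ : true only on the first occurrence.
        if suffix not in seen:
            seen.add(suffix)
            yield suffix


if __name__ == "__main__":
    # Hypothetical input, in the shape that `find . -type f` prints.
    sample = ["./a/x.jpg", "./a/y.png", "./b/z.jpg"]
    for suffix in first_seen_suffixes(sample):
        print(suffix)
```

As with the awk version, a path whose file name has no dot still yields whatever follows the last dot in the whole path, so extensionless files show up as odd entries.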

情歌与酒

2020-11-30 16:40

My awk-less, sed-less, Perl-less, Python-less alternative, built from standard utilities (note that `rev` is widely available on Linux but is not specified by POSIX):

find . -type f | rev | cut -d. -f1 | rev | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn

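The trick is that it reverses each line, so the extension becomes the first dot-separated field; `cut` extracts that field and the second `rev` restores it.
It also converts the extensions to lower case and counts how often each one occurs.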

Example output:

   3689 jpg
   1036 png
    610 mp4
     90 webm
     90 mkv
     57 mov
     12 avi
     10 txt
      3 zip
      2 ogv
      1 xcf
      1 trashinfo
      1 sh
      1 m4v
      1 jpeg
      1 ini
      1 gqv
      1 gcs
      1 dv
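
For comparison, the same case-insensitive count can be sketched in Python with `os.walk` and `collections.Counter`. This is an illustration rather than a drop-in replacement: the function name `count_extensions` is invented here, and because it uses `os.path.splitext`, dotfiles such as `.bashrc` and names without a dot are skipped instead of being reported.

```python
import os
from collections import Counter


def count_extensions(root):
    """Count lower-cased file extensions (without the dot) under root."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext returns e.g. ("photo", ".JPG"); drop the dot, fold case.
            ext = os.path.splitext(name)[1][1:].lower()
            if ext:  # ignore files with no extension
                counts[ext] += 1
    return counts


if __name__ == "__main__":
    # Print in the same "count extension" layout, most frequent first.
    for ext, n in count_extensions(".").most_common():
        print(f"{n:7d} {ext}")
```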

故里飘歌

2020-11-30 16:42
Find everything with a dot in its name and show only the suffix:
```
find . -type f -name "*.*" | awk -F. '{print $NF}' | sort -u
```
If you know that all suffixes have three characters, then:
```
find . -type f -name "*.???" | awk -F. '{print $NF}' | sort -u
```
Or use sed to show all suffixes of one to four characters. Change {1,4} to the range of lengths you expect in the suffix.
```
find . -type f | sed -n 's/.*\.\(.\{1,4\}\)$/\1/p' | sort -u
```
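
The length-bounded match can also be written as a Python regular expression. The sketch below assumes a list of paths is already available; `short_suffixes`, `min_len` and `max_len` are illustrative names. Unlike the sed pattern, it deliberately excludes `.` and `/` from the suffix, so a short extensionless path such as `./ab` is not reported as a suffix.

```python
import re


def short_suffixes(paths, min_len=1, max_len=4):
    """Return the sorted distinct suffixes whose length is within the bounds."""
    # A dot followed by min_len..max_len characters that are neither '.' nor '/'.
    pattern = re.compile(r".*\.([^./]{%d,%d})" % (min_len, max_len))
    found = set()
    for path in paths:
        match = pattern.fullmatch(path)
        if match:
            found.add(match.group(1))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical paths in the shape that `find . -type f` prints.
    sample = ["./a.jpg", "./b.jpeg", "./c.trashinfo", "./d/e.tar.gz", "./ab"]
    print(short_suffixes(sample))
```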
萌比男神i

2020-11-30 16:43
Since there's already another solution which uses Perl:

If you have Python installed you could also do (from the shell):
```
python -c "import os;e=set();[[e.add(os.path.splitext(f)[-1]) for f in fn]for _,_,fn in os.walk('/home')];print('\n'.join(e))"
```
眼角桃花

2020-11-30 16:49
None of the replies so far deal with filenames with newlines properly (except for ChristopheD's, which just came in as I was typing this). The following is not a shell one-liner, but works, and is reasonably fast.
```
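import os, sys

def names(roots):
    # Yield the basename of every file under each root directory.
    for root in roots:
        for _dirpath, _dirnames, basenames in os.walk(root):
            for basename in basenames:
                yield basename

# Collect the distinct suffixes (each includes its leading dot).
sufs = set(os.path.splitext(x)[1] for x in names(sys.argv[1:]))
for suf in sufs:
    if suf:  # skip files that have no extension
        print(suf)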
囚心锁ツ

2020-11-30 16:50
I don't think this one was mentioned yet:
```
find . -type f -exec sh -c 'echo "${0##*.}"' {} \; | sort | uniq -c
```