How can I convert tabs to spaces in every file of a directory?


既然无缘

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

Also, is there a way of setting the number of spaces per tab?

19 answers

天命终不由人

2020-12-02 03:59
Warning: This will break your repo.

This will corrupt binary files, including those under .svn and .git! Read the comments before using!

find . -iname '*.java' -type f -exec sed -i.orig 's/\t/ /g' {} +

The original file is saved as [filename].orig.

Replace '*.java' with the file ending of the file type you are looking for. This way you can prevent accidental corruption of binary files.

Downsides:
- Will replace tabs everywhere in a file, including tabs inside string literals, not just leading indentation.
- Will take a long time if you happen to have a 5GB SQL dump in this directory.
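
To make the first downside concrete, here is a minimal Python sketch contrasting "replace every tab" (what the sed command does) with "replace only the leading tabs". The function names and the default width of 4 are illustrative assumptions, not part of the answer above.

```python
def replace_all_tabs(line: str, width: int = 4) -> str:
    """Replace every tab in the line, as the sed substitution does."""
    return line.replace("\t", " " * width)


def replace_leading_tabs(line: str, width: int = 4) -> str:
    """Replace only the run of tabs at the very start of the line."""
    stripped = line.lstrip("\t")
    leading = len(line) - len(stripped)  # number of leading tabs removed
    return " " * (width * leading) + stripped


# A line with one indentation tab and one tab inside a string literal.
sample = '\tprint("a\tb")'
print(repr(replace_all_tabs(sample)))      # the tab inside the string is rewritten too
print(repr(replace_leading_tabs(sample)))  # the tab inside the string is left alone
```

Whether tabs inside string literals should be preserved depends on the language and the project, so treat this as an illustration of the difference rather than a recommendation.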

时光取名叫无心

2020-12-02 04:01
Nobody mentioned rpl? With rpl you can replace any string. To convert tabs to spaces recursively under the current directory:
```
rpl -R -e "\t" "    "  .
```
Very simple.

星月不相逢

2020-12-02 04:05
Collecting the best comments from Gene's answer, the best solution by far is to use sponge from moreutils.
```
sudo apt-get install moreutils
# The complete one-liner:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;
```
Explanation:
- ./ starts the recursive search from the current directory
- -iname is a case-insensitive match (so both *.java and *.JAVA are found)
- -type f finds only regular files (no directories or symlinks); it does not exclude binary files, which is what the name pattern is for
- -exec bash -c runs the following command in a new bash process for each file name ({}), which is passed in as $0
- expand -t 4 expands every TAB to the next tab stop, with stops every 4 columns
- sponge soaks up standard input (from expand) and only then writes it to a file (the same one)*.
NOTE: * A simple file redirection (> "$0") won't work here because the shell would truncate the file before expand has read it.

Advantage: All original file permissions are retained and no intermediate tmp files are used.
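
The "soak up everything, then write" idea behind sponge can be sketched in Python as follows. This is an illustration only, not how sponge itself is implemented; the function name expand_in_place and the UTF-8 encoding are assumptions made for the example.

```python
def expand_in_place(path: str, tabsize: int = 4) -> bool:
    """Expand tabs in the file at `path`; return True if it was rewritten."""
    # Read the whole file first. newline="" keeps the original line endings.
    with open(path, "r", encoding="utf-8", newline="") as fh:
        original = fh.read()

    # str.expandtabs resets its column counter at every line break,
    # so it can be applied to the whole text at once.
    expanded = original.expandtabs(tabsize)
    if expanded == original:
        return False

    # Only now is the file opened for writing (and truncated).
    with open(path, "w", encoding="utf-8", newline="") as fh:
        fh.write(expanded)
    return True
```

Because the existing file is rewritten rather than replaced, its permissions stay as they were. The trade-off is that the whole file is held in memory and an interrupted write can leave it truncated.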


暗喜

2020-12-02 04:06

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

This is usually not what you want.

Do you want to do this for png images? PDF files? The .git directory? Your Makefile (which requires tabs)? A 5GB SQL dump?

You could, in theory, pass a whole lot of exclude options to find or whatever else you're using, but this is fragile and will break as soon as you add other binary files.

What you want is, at least, to:

- Skip files over a certain size.
- Detect whether a file is binary by checking for the presence of a NUL byte.
- Only replace tabs at the start of a line (expand -i does this; a plain sed substitution doesn't).

As far as I know, there is no "standard" Unix utility that can do this, and it's not very easy to do with a shell one-liner, so a script is needed.

A while ago I created a little script called sanitize_files which does exactly that. It also fixes some other common stuff like replacing \r\n with \n, adding a trailing \n, etc.

You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the above script as it's more likely to receive bugfixes and other updates than this post.

I would also like to point out, in response to some of the other answers here, that shell globbing is not a robust way of doing this, because sooner or later you'll end up with more files than will fit in ARG_MAX (128k on older Linux kernels and typically around 2MB on modern ones, which may seem a lot, but sooner or later it's not enough).
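
To see what the limit is on a given machine, a small Python sketch can query it. It assumes a Unix-like system where os.sysconf knows the name SC_ARG_MAX. The helper fits_in_one_command is hypothetical and only a rough estimate, because the environment variables also count against the limit.

```python
import os

# SC_ARG_MAX is the POSIX name for the limit; os.sysconf is Unix-only.
arg_max = os.sysconf("SC_ARG_MAX")
print("ARG_MAX on this system: %d bytes" % arg_max)


def fits_in_one_command(paths, limit=arg_max):
    """Rough check: bytes of all arguments, each with its NUL terminator."""
    needed = sum(len(os.fsencode(p)) + 1 for p in paths)
    return needed < limit
```

Walking the tree inside the script (or letting find batch the arguments with -exec ... {} +) sidesteps the limit entirely.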

#!/usr/bin/env python3
#
# http://code.arp242.net/sanitize_files
#

import os, re, sys


def is_binary(data):
    return data.find(b'\000') >= 0


def should_ignore(path):
    keep = [
        # VCS systems
        '.git/', '.hg/', '.svn/', 'CVS/',

        # These files have significant whitespace/tabs, and cannot be edited
        # safely
        # TODO: there are probably more of these files..
        'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
    ]

    for k in keep:
        if '/%s' % k in path:
            return True
    return False


def run(files):
    indent_width = 4  # number of spaces per tab; adjust as needed
    indent_find = b'\t'
    indent_replace = b' ' * indent_width

    for f in files:
        if should_ignore(f):
            print('Ignoring %s' % f)
            continue

        try:
            size = os.stat(f).st_size
        # Unresolvable symlink, just ignore those
        except FileNotFoundError as exc:
            print('%s is unresolvable, skipping (%s)' % (f, exc))
            continue

        if size == 0: continue
        if size > 1024 ** 2:
            print("Skipping `%s' because it's over 1MiB" % f)
            continue

        try:
            data = open(f, 'rb').read()
        except (OSError, PermissionError) as exc:
            print("Error: Unable to read `%s': %s" % (f, exc))
            continue

        if is_binary(data):
            print("Skipping `%s' because it looks binary" % f)
            continue

        data = data.split(b'\n')

        fixed_indent = False
        for i, line in enumerate(data):
            # Fix indentation
            repl_count = 0
            while line.startswith(indent_find):
                fixed_indent = True
                repl_count += 1
                line = line.replace(indent_find, b'', 1)

            if repl_count > 0:
                # Store the result back; rebinding `line` alone leaves `data` unchanged
                data[i] = indent_replace * repl_count + line

        # Nothing to rewrite if no leading tabs were found
        if not fixed_indent:
            continue

        try:
            open(f, 'wb').write(b'\n'.join(data))
        except (OSError, PermissionError) as exc:
            print("Error: Unable to write to `%s': %s" % (f, exc))


if __name__ == '__main__':
    allfiles = []
    for root, dirs, files in os.walk(os.getcwd()):
        for f in files:
            p = '%s/%s' % (root, f)
            allfiles.append(p)

    run(allfiles)


不思量自难忘°

2020-12-02 04:08
Converting tabs to spaces in ".lua" files only [tab -> 2 spaces]:
```
find . -iname "*.lua" -exec sed -i "s#\t#  #g" '{}' \;
```

野性不改

2020-12-02 04:09
Try the command line tool expand.
```
expand -i -t 4 input | sponge output
```
where
- -i expands only the leading tabs on each line;
- -t 4 sets a tab stop every 4 columns, so each leading tab becomes up to 4 spaces (the default is 8);
- sponge is from the moreutils package and avoids truncating the input file before it has been read.
Finally, on OS X you can use gexpand after installing coreutils with Homebrew (brew install coreutils).
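
For readers who want the same behaviour from a script, the effect of -i together with -t can be approximated in Python. This is a sketch under the assumption that "leading" means the initial run of spaces and tabs; the function name expand_initial is made up for the example.

```python
def expand_initial(line: str, tabsize: int = 4) -> str:
    """Expand tabs in the leading whitespace only, honouring tab stops."""
    body = line.lstrip(" \t")
    indent = line[: len(line) - len(body)]  # leading spaces and tabs
    # expandtabs advances to the next multiple of tabsize, so a tab that
    # follows two spaces becomes two spaces, not four.
    return indent.expandtabs(tabsize) + body


for raw in ["\tx\ty", "  \tx", "\t\tx"]:
    print(repr(expand_initial(raw)))
```

Tabs after the first non-blank character are left untouched, which is the point of -i.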