How can I convert tabs to spaces in every file of a directory (possibly recursively)?
Also, is there a way of setting the number of spaces per tab?
Warning: This will break your repo.
This will corrupt binary files, including those under
svn, .git! Read the comments before using!
find . -iname '*.java' -type f -exec sed -i.orig 's/\t/ /g' {} +
The original file is saved as [filename].orig.
Replace '*.java' with the file ending of the file type you are looking for. This way you can prevent accidental corruption of binary files.
Downsides:
Nobody mentioned rpl? Using rpl you can replace any string.
To convert tabs to spaces:
rpl -R -e "\t" " " .
Very simple.
Collecting the best comments from Gene's answer, the best solution by far is to use sponge from moreutils.
sudo apt-get install moreutils
# The complete one-liner:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;
Explanation:
./ searches recursively from the current directory
-iname is a case-insensitive match (for both *.java and *.JAVA)
-type f finds only regular files (no directories or symlinks)
-exec bash -c executes the following commands in a subshell for each file name, {}
expand -t 4 expands all tabs to spaces, with tab stops every 4 columns
sponge soaks up standard input (from expand) and writes it to a file (the same one)
NOTE: A simple file redirection (> "$0") won't work here because it would overwrite the file too soon.
Advantage: All original file permissions are retained and no intermediate tmp files are used.
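For readers who would rather not install moreutils, the same "read everything first, then overwrite" idea can be sketched in Python. This is only an illustration, not part of the answer above: the function name expand_tabs_in_place is made up here, and it assumes each file fits comfortably in memory.

```python
from pathlib import Path


def expand_tabs_in_place(path, tabsize=4):
    """Rewrite `path` with every tab expanded to the next tab stop.

    Hypothetical helper for illustration. Like `expand -t 4 | sponge`,
    the whole file is read before the same file is written, so the
    input is never truncated too early.
    """
    p = Path(path)
    data = p.read_bytes()  # the complete content is in memory from here on
    # bytes.expandtabs pads to tab stops and resets the column at newlines
    p.write_bytes(data.expandtabs(tabsize))
```

Note that, like expand, this pads to tab stops rather than inserting a fixed number of spaces: a tab after "ab" becomes two spaces with a tab size of 4.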
How can I convert tabs to spaces in every file of a directory (possibly recursively)?
This is usually not what you want.
Do you want to do this for png images? PDF files? The .git directory? Your
Makefile (which requires tabs)? A 5GB SQL dump?
You could, in theory, pass a whole lot of exclude options to find or whatever
else you're using; but this is fragile, and will break as soon as you add other
binary files.
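What you want is at least:
Skip files over a certain size.
Detect whether a file is binary by checking for the presence of a NUL byte.
Only replace tabs at the start of a line (expand does this, sed
doesn't).
As far as I know, there is no "standard" Unix utility that can do this, and it's not very easy to do with a shell one-liner, so a script is needed.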
A while ago I created a little script called
sanitize_files which does exactly
that. It also fixes some other common stuff like replacing \r\n with \n,
adding a trailing \n, etc.
You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the above script as it's more likely to receive bugfixes and other updates than this post.
I would also like to point out, in response to some of the other answers here,
that using shell globbing is not a robust way of doing this, because sooner
or later you'll end up with more files than will fit in ARG_MAX (on modern
Linux systems it's 128k, which may seem a lot, but sooner or later it's not
enough).
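As a small aside on the ARG_MAX point, here is a sketch of how one might check the limit on a given system and avoid it altogether by walking the tree inside the process, so that no file names ever pass through a command line. The helper names arg_max_bytes and iter_files are invented for this illustration, and os.sysconf is only available on Unix-like systems.

```python
import os


def arg_max_bytes():
    """Return the system's ARG_MAX as reported by sysconf (Unix only)."""
    return os.sysconf("SC_ARG_MAX")


def iter_files(root="."):
    """Yield every file path under `root`, one at a time.

    Because paths are produced lazily inside the process, the number
    of files is not bounded by the command-line length limit.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)
```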
#!/usr/bin/env python
#
# http://code.arp242.net/sanitize_files
#
import os

# Number of spaces that replaces each leading tab
INDENT_WIDTH = 4


def is_binary(data):
    # Treat any file that contains a NUL byte as binary
    return data.find(b'\000') >= 0


def should_ignore(path):
    keep = [
        # VCS systems
        '.git/', '.hg/', '.svn/', 'CVS/',
        # These files have significant whitespace/tabs, and cannot be edited
        # safely
        # TODO: there are probably more of these files..
        'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
    ]
    for k in keep:
        if '/%s' % k in path:
            return True
    return False


def run(files):
    indent_find = b'\t'
    indent_replace = b' ' * INDENT_WIDTH
    for f in files:
        if should_ignore(f):
            print('Ignoring %s' % f)
            continue
        try:
            size = os.stat(f).st_size
        # Unresolvable symlink, just ignore those
        except FileNotFoundError as exc:
            print('%s is unresolvable, skipping (%s)' % (f, exc))
            continue
        if size == 0:
            continue
        if size > 1024 ** 2:
            print("Skipping `%s' because it's over 1MiB" % f)
            continue
        try:
            with open(f, 'rb') as fp:
                data = fp.read()
        except (OSError, PermissionError) as exc:
            print("Error: Unable to read `%s': %s" % (f, exc))
            continue
        if is_binary(data):
            print("Skipping `%s' because it looks binary" % f)
            continue
        data = data.split(b'\n')
        fixed_indent = False
        for i, line in enumerate(data):
            # Fix indentation: count and strip the leading tabs
            repl_count = 0
            while line.startswith(indent_find):
                fixed_indent = True
                repl_count += 1
                line = line.replace(indent_find, b'', 1)
            if repl_count > 0:
                # Store the fixed line back into the list
                data[i] = indent_replace * repl_count + line
        # Nothing was changed, so leave the file untouched
        if not fixed_indent:
            continue
        try:
            with open(f, 'wb') as fp:
                fp.write(b'\n'.join(data))
        except (OSError, PermissionError) as exc:
            print("Error: Unable to write to `%s': %s" % (f, exc))


if __name__ == '__main__':
    allfiles = []
    for root, dirs, files in os.walk(os.getcwd()):
        for f in files:
            allfiles.append('%s/%s' % (root, f))
    run(allfiles)
Converting tabs to spaces in just the ".lua" files [tabs -> 2 spaces]:
find . -iname "*.lua" -exec sed -i "s#\t#  #g" '{}' \;
Try the command line tool expand.
expand -i -t 4 input | sponge output
where
-i is used to expand only leading tabs on each line;
-t 4 means that each tab will be converted to 4 whitespace chars (8 by default).
Finally, you can use gexpand on OSX, after installing coreutils with Homebrew (brew install coreutils).
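To make the effect of -i concrete, here is a rough Python approximation of "expand only the leading tabs". It is a sketch of the behaviour described above, not the actual implementation of expand, and the function name expand_leading_tabs is invented for this example.

```python
def expand_leading_tabs(line, tabsize=4):
    """Expand tabs in the leading whitespace of `line` only.

    Tabs that appear after the first non-blank character are kept,
    which roughly mirrors what `expand -i -t 4` does for one line.
    """
    rest = line.lstrip(" \t")
    leading = line[: len(line) - len(rest)]
    # str.expandtabs pads each tab up to the next multiple of tabsize
    return leading.expandtabs(tabsize) + rest
```

For example, a line that starts with one tab and also contains a tab before a trailing comment gets four leading spaces, while the tab before the comment stays a tab.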