Question
Given a file with data like this (i.e. a stores.dat file):
id storeNo type
2ttfgdhdfgh 1gfdkl-28 kgdl
9dhfdhfdfh 2t-33gdm dgjkfndkgf
Desired output:
id |storeNo |type
2ttfgdhdfgh |1gfdkl-28 |kgdl
9dhfdhfdfh |2t-33gdm |dgjkfndkgf
Would like to add a "|" delimiter between each of these 3 cut ranges:
cut -c1-18,19-30,31-40 stores.dat
What is the syntax to insert a delimiter between each cut?
BONUS pts (if you can provide the option to trim the values like so):
id|storeNo|type
2ttfgdhdfgh|1gfdkl-28|kgdl
9dhfdhfdfh|2t-33gdm|dgjkfndkgf
UPDATE (thanks to Mat's answer): I ended up succeeding with this solution. It is a bit messy, but my bash version on SunOS doesn't seem to support more elegant arithmetic:
#!/bin/bash
# Build a perl unpack template (e.g. A17A12A10) from start-end ranges.
unpack=""
filename="$1"
while [ $# -gt 0 ] ; do
    arg="$1"
    # Skip the file name; every other argument is a start-end range.
    if [ "$arg" != "$filename" ]
    then
        firstcharpos=`echo $arg | awk -F"-" '{print $1}'`
        secondcharpos=`echo $arg | awk -F"-" '{print $2}'`
        # Width = end - start + 1, computed as -(start - end) + 1.
        compute=`(expr $firstcharpos - $secondcharpos)`
        compute=`(expr $compute \* -1 + 1)`
        unpack=$unpack"A"$compute
    fi
    shift
done
perl -ne 'print join("|",unpack("'$unpack'", $_)), "\n";' $filename
Usage: sh test.sh input_file 1-17 18-29 30-39
Answer 1:
If you're not afraid of using perl, here's a one-liner:
$ perl -ne 'print join("|",unpack("A17A12A10", $_)), "\n";' input
The unpack call will extract one 17 char string, then a 12 char one, then a 10 char one from the input line, and return them in an array (stripping spaces). join adds the |s.
If you want the input columns to be in x-y format, without writing a "real" script, you could hack it like this (but it's ugly):
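#!/bin/bash
unpack=""
# All arguments except the last are ranges; the last is the input file.
while [ $# -gt 1 ] ; do
    # Arithmetic expansion evaluates "1-17" as a subtraction, giving -16.
    arg=$(($1))
    shift
    # Negate and add 1 to get the field width: -1*-16+1 = 17.
    unpack=$unpack"A"$((-1*$arg+1))
done
perl -ne 'print join("|",unpack("'$unpack'", $_)), "\n";' $1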
Usage: t.sh 1-17 18-29 30-39 input_file.
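The range-to-width conversion can also be written without treating the range as a subtraction. The following is only a sketch, assuming a shell with POSIX parameter expansion and $((...)) arithmetic; the function name ranges_to_template is invented for this example.

```shell
# Hypothetical helper: turn "start-end" ranges into an unpack template
# such as A17A12A10.
ranges_to_template() {
    template=""
    for range in "$@"; do
        start=${range%-*}   # text before the dash
        end=${range#*-}     # text after the dash
        template="${template}A$(( $end - $start + 1 ))"
    done
    printf '%s\n' "$template"
}

ranges_to_template 1-17 18-29 30-39
```

The printed template can then be passed to the perl one-liner in place of the hard-coded one.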
Answer 2:
Since you used cut in your example, and assuming each field is separated by a tab (--output-delimiter is a GNU cut option):
$ cut --output-delimiter='|' -f1-3 input
id|storeNo|type
2ttfgdhdfgh|1gfdkl-28|kgdl
9dhfdhfdfh|2t-33gdm|dgjkfndkgf
If that is not the case, specify the input delimiter with the -d switch.
Answer 3:
I'd use awk:
awk '{print $1 "|" $2 "|" $3}'
Like some of the other suggestions, it assumes columns are whitespace separated, and doesn't care about the column numbers. If you have spaces in one of the fields, it won't work.
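A quick way to see that limitation is to feed awk a fixed-width line whose first column contains a space. The sample value "New York" and the 17/12/10 widths are assumptions made for this sketch.

```shell
# A fixed-width line whose first column contains a space.
line=$(printf '%-17s%-12s%-10s' 'New York' '1gfdkl-28' 'kgdl')

# awk splits on runs of whitespace, so "New York" becomes two fields
# and the real third column never reaches $3.
split_result=$(printf '%s\n' "$line" | awk '{print $1 "|" $2 "|" $3}')
printf '%s\n' "$split_result"
```

Here the first column is split in two and the type column is dropped from the output.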
Answer 4:
A better awk solution, based on character position rather than whitespace (FIELDWIDTHS is a GNU awk feature):
$ awk -v FIELDWIDTHS='17 12 10' -v OFS='|' '{ $1=$1 ""; print }' stores.dat | tr -d ' '
id|storeNo|type
2ttfgdhdfgh|1gfdkl-28|kgdl
9dhfdhfdfh|2t-33gdm|dgjkfndkgf
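Where GNU awk is not available, the same position-based idea can be sketched with substr(). This assumes a POSIX-compatible awk and the 17/12/10 widths used above; the function name fixed_to_pipe is made up for the example. Unlike piping through tr -d ' ', it trims only leading and trailing spaces, so spaces inside a value survive.

```shell
# Hypothetical helper: read fixed-width lines on stdin, slice them by
# character position, trim each slice and join the pieces with "|".
fixed_to_pipe() {
    awk '
    function trim(s) { gsub(/^ +| +$/, "", s); return s }
    {
        print trim(substr($0, 1, 17)) "|" trim(substr($0, 18, 12)) "|" trim(substr($0, 30, 10))
    }'
}

printf '%-17s%-12s%-10s\n' 'New York' '1gfdkl-28' 'kgdl' | fixed_to_pipe
```

Usage would be along the lines of fixed_to_pipe < stores.dat.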
Answer 5:
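Use sed to search and replace parts of a file based on regular expressions.
To replace whitespace with '|' in infile1 (note that each whitespace character is replaced individually, so a run of spaces becomes a run of '|'):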
sed -e 's/[ \t\r]/|/g' infile1 > outfile3
Answer 6:
You can't do that with cut as far as I am aware, but you can do it easily with sed as long as the values in each column never have internal spaces:
sed -e 's/  */|/g'
EDIT: If the file format is a true fixed-column format, and you don't want to use perl as shown by Mat, this can be done with sed, but it's not pretty. Basic sed regular expressions don't accept the .{17} form of repetition (POSIX writes intervals with escaped braces, as in .\{17\}), so the version here simply types out the right number of dots:
sed -e 's/^\(.................\)\(............\)\(..........\)$/\1|\2|\3/; s/ *|/|/g'
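POSIX basic regular expressions define interval expressions written with escaped braces, so on a sed that supports them the dots can be counted instead of typed. This is a sketch under that assumption, with the 17/12 widths from above; the function name fixed_to_pipe_sed is invented here. It differs from the command above in two ways: the last field is matched with .* so shorter lines still match, and trailing spaces at the end of the line are removed as well.

```shell
# Hypothetical helper: split fixed-width lines from stdin after columns
# 17 and 29, then trim the padding before each "|" and at end of line.
fixed_to_pipe_sed() {
    sed -e 's/^\(.\{17\}\)\(.\{12\}\)\(.*\)$/\1|\2|\3/' \
        -e 's/ *|/|/g' \
        -e 's/ *$//'
}

printf '%-17s%-12s%-10s\n' id storeNo type | fixed_to_pipe_sed
```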
Answer 7:
How about using just the tr command?
tr -s " " "|" < stores.dat
From the man page:
-s Squeeze multiple occurrences of the characters listed in the last
operand (either string1 or string2) in the input into a single
instance of the character. This occurs after all deletion and
translation is completed.
Test:
[jaypal:~/Temp] cat stores.dat
id storeNo type
2ttfgdhdfgh 1gfdkl-28 kgdl
9dhfdhfdfh 2t-33gdm dgjkfndkgf
[jaypal:~/Temp] tr -s " " "|" < stores.dat
id|storeNo|type
2ttfgdhdfgh|1gfdkl-28|kgdl
9dhfdhfdfh|2t-33gdm|dgjkfndkgf
You can easily redirect this to a new file like this -
[jaypal:~/Temp] tr -s " " "|" < stores.dat > new.stores.dat
Note: As Mat pointed out in the comments, this solution assumes the columns are separated by one or more whitespace characters rather than laid out at fixed widths.
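The difference shows up as soon as a fixed-width column is empty. In this sketch the middle column is left blank (an assumed case, using the 17/12/10 widths from the other answers):

```shell
# A fixed-width line with an empty middle column, padded to 17/12/10.
line=$(printf '%-17s%-12s%-10s' '2ttfgdhdfgh' '' 'kgdl')

# Every run of spaces collapses into a single "|", so the empty column
# disappears and the trailing padding leaves a stray "|" at the end.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' '|')
printf '%s\n' "$squeezed"
```

The result has two fields and a trailing delimiter instead of three fields.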
Source: https://stackoverflow.com/questions/8630053/unix-cut-command-adding-own-delimiter