Given: One big text-data file (e.g. CSV format) with a \'special\' first line (e.g., field names).
Wanted: An equivalent of the cor
I really liked Rob and Dennis' versions, so much so that I wanted to improve them.
Here's my version:
in_file=$1
awk '{if (NR!=1) {print}}' $in_file | split -d -a 5 -l 100000 - $in_file"_" # Get all lines except the first, split into 100,000 line chunks
for file in $in_file"_"*
do
tmp_file=$(mktemp $in_file.XXXXXX) # Create a safer temp file
head -n 1 $in_file | cat - $file > $tmp_file # Get header from main file, cat that header with split file contents to temp file
mv -f $tmp_file $file # Overwrite non-header containing file with header-containing file
done
Differences:
awk instead of tail due to awk having better performancehead | cat line instead of two lines