How to split a file and keep the first line in each of the pieces?

-上瘾入骨i 2020-12-07 18:28

Given: One big text-data file (e.g. CSV format) with a 'special' first line (e.g., field names).

Wanted: An equivalent of the cor

12 answers
  •  忘掉有多难
    2020-12-07 18:59

    I really liked Rob and Dennis' versions, so much so that I wanted to improve them.

    Here's my version:

    in_file=$1
    # Get all lines except the first, split into 100,000 line chunks
    awk '{if (NR!=1) {print}}' "$in_file" | split -d -a 5 -l 100000 - "${in_file}_"
    for file in "${in_file}_"*
    do
        tmp_file=$(mktemp "$in_file.XXXXXX") # Create a safer temp file
        # Get header from main file, cat that header with split file contents to temp file
        head -n 1 "$in_file" | cat - "$file" > "$tmp_file"
        mv -f "$tmp_file" "$file" # Overwrite non-header containing file with header-containing file
    done
    

    Differences:

    1. in_file is the file argument you want to split while keeping the header in each piece
    2. Uses awk instead of tail to skip the header line, for better performance
    3. Splits into 100,000-line files instead of 4-line files
    4. Each split file is named after the input file, followed by an underscore and a five-digit number (00000 up to 99999, from the "-d -a 5" split arguments)
    5. Uses mktemp to handle temporary files safely
    6. Uses a single head | cat line instead of two lines
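
To see the naming scheme and the header handling on a small input, here is a rough sketch of the same approach wrapped in a function. The function name split_with_header, the optional chunk-size argument, and the sample data are assumptions made for illustration and are not part of the script above. It also assumes a split that supports -d (GNU coreutils does).

```shell
#!/bin/sh
# split_with_header is a hypothetical helper name used only for this sketch.
# Usage: split_with_header FILE [LINES_PER_CHUNK]
split_with_header() {
    in_file=$1
    lines=${2:-100000}   # chunk size; defaults to the 100,000 used above

    # Everything after the header goes into numbered pieces: FILE_00000, FILE_00001, ...
    awk '{if (NR!=1) {print}}' "$in_file" | split -d -a 5 -l "$lines" - "${in_file}_"

    for piece in "${in_file}_"*
    do
        # If the input had no body lines, the glob matches nothing; skip it
        [ -e "$piece" ] || continue
        tmp_file=$(mktemp "$in_file.XXXXXX")
        # Prepend the header from the original file to this piece
        head -n 1 "$in_file" | cat - "$piece" > "$tmp_file"
        mv -f "$tmp_file" "$piece"
    done
}

# Demo on made-up data: one header line plus five body lines, two body lines per piece
workdir=$(mktemp -d)
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > "$workdir/data.csv"
split_with_header "$workdir/data.csv" 2
head -n 1 "$workdir"/data.csv_*
```

With this sample input the demo should leave three pieces next to the original file, data.csv_00000, data.csv_00001 and data.csv_00002, each starting with the id,name header line.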
