Finding gaps in sequential numbers

粉色の甜心 2020-12-05 09:31

I don’t do this stuff for a living so forgive me if it’s a simple question (or more complicated than I think). I’ve been digging through the archives and found a lot of tip

6 Answers
  • 2020-12-05 10:14

    Interesting question.

    sputnick's awk one-liner is nice, and I cannot write a simpler one. Here is another way, using diff:

     seq $(tail -1 file)|diff - file|grep -Po '.*(?=d)'
    

    With your example, the output would be:

    1,4
    9,14
    18,24
    

    I know the output has a comma in it instead of -. grep cannot change the text it matches, but you could replace the grep with sed to get -; the idea is the same.
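
    The sed replacement mentioned above could be sketched as follows. The function name gaps_diff is made up for illustration, and the sketch assumes the file holds sorted, unique positive integers, one per line, with the largest number on the last line. It also prints a one-number gap as a single number, which is a choice made here rather than something the answer specifies.

```shell
# Usage: gaps_diff FILE
# Compares the full sequence 1..max against FILE and rewrites the
# deletion headers that diff emits ("1,4d0" or "28d10") as ranges.
# diff exits with status 1 when its inputs differ, so that status is
# treated as success; any other non-zero status is still an error.
gaps_diff() {
    seq "$(tail -n 1 "$1")" |
        { diff - "$1" || [ "$?" -eq 1 ]; } |
        sed -n -e 's/^\([0-9][0-9]*\),\([0-9][0-9]*\)d.*/\1-\2/p' \
               -e 's/^\([0-9][0-9]*\)d.*/\1/p'
}
```

    Called as gaps_diff file on the example data, this should print 1-4, 9-14 and 18-24 on separate lines.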

    Hope it helps.

  • 2020-12-05 10:14

    A Perl solution, similar to the awk solution from StardustOne:

    perl -ane 'if ($F[0] != $p+1) {printf "%d-%d\n",$p+1,$F[0]-1}; $p=$F[0]' file.txt
    

    These command-line options are used:

    • -n loop around every line of the input file, do not automatically print every line

    • -a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace. Fields are indexed starting with 0.

    • -e execute the perl code

  • 2020-12-05 10:25

    With awk:

    awk '$1!=p+1{print p+1"-"$1-1}{p=$1}' file.txt
    

    Explanation:

    • $1 is the first column of the current input line
    • p holds the value of $1 from the previous line (on the first line it is unset, which awk treats as 0)
    • so $1!=p+1 is a condition: if $1 differs from the previous value plus 1, then
    • {print p+1"-"$1-1} is executed: it prints the previous value plus 1, the - character, and the first column minus 1
    • {p=$1} is executed for every line: p is assigned the current first column
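
    The same logic, spread over several lines with comments, might look like the sketch below. The wrapper name gaps_awk is hypothetical, and printf is swapped in for print only to make the two arithmetic expressions explicit; the behavior is meant to match the one-liner.

```shell
# Usage: gaps_awk [FILE...]   (reads standard input when no file is given)
gaps_awk() {
    awk '
        # p is unset on the first line, so p + 1 evaluates to 1.
        # A gap exists whenever the current number is not p + 1.
        $1 != p + 1 { printf "%d-%d\n", p + 1, $1 - 1 }
        # Runs for every line: remember the current number.
                    { p = $1 }
    ' "$@"
}
```

    With the sample numbers 5 to 8, 15 to 17 and 25 to 27, the first rule fires on the lines 5, 15 and 25, giving 1-4, 9-14 and 18-24.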
  • 2020-12-05 10:26

    Just remember the previous number and check that the current one is the previous plus one. The script reads the numbers from standard input:

    #! /bin/bash
    previous=0
    while read -r n ; do
        if (( n != previous + 1 )) ; then
            echo $(( previous + 1 ))-$(( n - 1 ))
        fi
        previous=$n
    done
    

    You might need to add a check to avoid lines like 28-28 for single-number gaps.
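
    One way to add that check is sketched below. The function name find_gaps is made up for illustration, and the sketch uses POSIX test syntax instead of the bash-only (( )) so that it also runs under plain sh. It assumes plain decimal integers without leading zeros, one per line, in ascending order; a one-number gap is printed as that number alone.

```shell
# Usage: find_gaps < FILE
# The body runs in a subshell, so its variables do not leak to the caller.
find_gaps() (
    previous=0
    while read -r n; do
        [ -n "$n" ] || continue          # skip blank lines
        first=$((previous + 1))          # first missing number, if any
        last=$((n - 1))                  # last missing number, if any
        if [ "$first" -eq "$last" ]; then
            echo "$first"                # exactly one number is missing
        elif [ "$first" -lt "$last" ]; then
            echo "$first-$last"          # two or more numbers are missing
        fi                               # first > last means no gap
        previous=$n
    done
)
```

    Comparing first and last also means a repeated number produces no output instead of an inverted range.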

  • 2020-12-05 10:26

    Given an input file, use the numinterval utility and paste its output beside the file, then munge it with tr, xargs, sed and printf:

    gaps() { paste  <(echo; numinterval "$1" | tr 1 '-' | tr -d '[02-9]') "$1" | 
             tr -d '[:blank:]' | xargs echo | 
             sed 's/ -/-/g;s/-[^ ]*-/-/g' | xargs printf "%s\n" ; }
    

    Output of gaps file:

    5-8
    15-17
    25-27
    

    How it works. The output of paste <(echo; numinterval file) file looks like:

        5
    1   6
    1   7
    1   8
    7   15
    1   16
    1   17
    8   25
    1   26
    1   27
    

    From there we mainly replace things in column #1, and tweak the spacing. The 1s are replaced with -s, and the higher numbers are blanked. Remove some blanks with tr. Replace runs of hyphens like "5-6-7-8" with a single hyphen "5-8", and that's the output.
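
    On a system without numinterval, the same kind of output (each run of consecutive numbers as first-last) could be produced with awk alone. This is a substitute technique, not the pipeline above; the function name runs is hypothetical, and the sketch assumes one integer per line with no blank lines. A run of length one is printed as n-n here.

```shell
# Usage: runs [FILE...]   (reads standard input when no file is given)
runs() {
    awk '
        # First line: open the first run.
        NR == 1        { start = prev = $1; next }
        # A jump larger than 1 closes the current run and opens a new one.
        $1 != prev + 1 { print start "-" prev; start = $1 }
        # Every remaining line: remember the current number.
                       { prev = $1 }
        # Close the final run, unless the input was empty.
        END            { if (NR > 0) print start "-" prev }
    ' "$@"
}
```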

  • 2020-12-05 10:32

    A Ruby Answer

    Perhaps someone else can give you the Bash or Awk solution you asked for. However, I think any shell-based answer is likely to be extremely localized for your data set, and not very extendable. Solving the problem in Ruby is fairly simple, and provides you with flexible formatting and more options for manipulating the data set in other ways down the road. YMMV.

    #!/usr/bin/env ruby
    
    # You could read from a file if you prefer,
    # but this is your provided corpus. 
    nums = [5, 6, 7, 8, 15, 16, 17, 25, 26, 27]
    
    # Prepend 0 so the gap between zero and the first number is found too.
    nums.unshift 0
    
    # Create array of arrays containing missing numbers.
    missing_nums = nums.each_cons(2).map do |array|
                     (array.first.succ...array.last).to_a unless
                      array.first.succ == array.last
                   end.compact
    # => [[1, 2, 3, 4], [9, 10, 11, 12, 13, 14], [18, 19, 20, 21, 22, 23, 24]]
    
    # Format the results any way you want.
    puts missing_nums.map { |ary| "#{ary.first}-#{ary.last}" }
    

    Given your current corpus, this yields the following on standard output:

    1-4
    9-14
    18-24
