Perl. Using until function

故事扮演 提交于 2019-12-05 10:19:10

Perhaps the following will be helpful:

use strict;
use warnings;

my ( $last, $oneColumn );

my @file = <DATA>;

for my $line (@file) {
    my @entires = split ' ', $line;

    if ( @entires == 4 ) {
        if ($oneColumn) {
            print $line;    # Succeeding line
            $oneColumn = 0;
        }
        $last = $line;
        next;
    }

    print $last if $last;    # Preceeding line
    undef $last;
    print $line;             # One-column line
    $oneColumn = 1;

}

__DATA__
30  13387412    34.80391242 sSN_FIRST
30  13387412    34.80391242 sSN5_40
30.1             
30.2             
30.3             
30.4             
31  14740248    65.60590089 s32138223_44
31  14740248    65.60590089 s321382_LAST

Output:

30  13387412    34.80391242 sSN5_40
30.1
30.2
30.3
30.4
31  14740248    65.60590089 s32138223_44

A 'full', line should have four elements in @entries, and that's what if ( @entires == 4 ) looks for. If found, it'll print it as the succeeding line only if one-column lines have been printed. Then, it saves the line. Lines are printed outside the if only when the line doesn't have three tabs.

The following, shorter script produces the same output:

use strict;
use warnings;

my @file = <DATA>;

for ( my $i = 1 ; $i < $#file ; $i++ ) {

    if ( $file[$i] =~ /(?:\t\s){3}/ ) {
        print $file[ $i - 1 ];    # Preceeding line

        while ( $file[$i] =~ /(?:\t\s){3}/ and $i < $#file ) {
            print $file[ $i++ ]    # One-column line
        }

        print $file[$i];           # Succeeding line
    }
}

__DATA__
30  13387412    34.80391242 sSN_FIRST
30  13387412    34.80391242 sSN5_40
30.1             
30.2             
30.3             
30.4             
31  14740248    65.60590089 s32138223_44
31  14740248    65.60590089 s321382_LAST

The /(?:\t\s){3}/ matches three consecutive sets of tab and space, which would only be found on a line with just one column. When it finds that pattern, it prints the previous line, then enters a while loop that print the one-column lines until a full line is found or it at the end of the array. Finally, the succeeding line is printed.

What you want to implement is a caching algorithm: something that remembers (caches) previous values, and uses them if nothing new appears. You don't even need a regex for this. :)

In addition to caching the old values, you need to cache the lines inbetween. Since you only needed the labels, you only need to hold on to those. Then, when you reach the next full line, you can do your interpolation and emit the results.

Here's how I'd do it. It's a bit more complex than my original example, but the same principle applies: just store the intermediate lines, then emit the results when you reach your terminal.

use strict;
use warnings;
use feature 'say';


# Get start conditions, and cache those numbers.

sub read_block
{
   my $line = <DATA>;
   return 1 unless defined $line; # we're done if nothing more to read

   # Process and store data from the first line in the block.
   chomp $line;
   my ($last_label, $last_num1, $last_num2, $last_label2) = split /\t/, $line;

   # Keep reading lines until we find the end of the block.
   my @label_cache;
   my $found_last = 0;
   my ($label1, $num1, $num2, $label2);
   while (!$found_last)
   {
      $line = <DATA>;
      chomp $line;
      ($label1, $num1, $num2, $label2) = split /\t/, $line;
      if (defined $num1 && defined $num2)
      {
         $found_last = 1; # We have final numbers!  We can interpolate now.
      }
      else
      {
         push @label_cache, $label1; 
      }
   }

   # Begin display.  Show the first line of the block.
   say "$last_label\t$last_num1\t$last_num2\t$last_label2";

   # Calculate the slope for interpolation: (last - first) / difference
   my $slope1 = ($num1 - $last_num1) / (@label_cache + 1);
   my $slope2 = ($num2 - $last_num2) / (@label_cache + 1);
   my $distance = 0;

   # Display each label and the lines inside.
   foreach my $label (@label_cache)
   {
      ++$distance;
      say $label, "\t",
          $slope1 * $distance + $last_num1, "\t",
          $slope2 * $distance + $last_num2;
   }

   # Display the final line in the block.
   say "$label1\t$num1\t$num2\t$label2";

   # Not done yet, so return a 'false' value.
   return 0;
}

# Main part of the script

my $done = 0;
while (! $done)
{
   $done = read_block();
}


__DATA__
a   3   4   end
e
f
g
h
i
k   15  26  start
k   15  26  end
o
p
q
r
s   3   5   start
s   3   5   end
v
w
x
y   14  16  start

emits:

a       3       4       end
e       5       7.66666666666667
f       7       11.3333333333333
g       9       15
h       11      18.6666666666667
i       13      22.3333333333333
k       15      26      start
k       15      26      end
o       12.6    21.8
p       10.2    17.6
q       7.8     13.4
r       5.4     9.2
s       3       5       start
s       3       5       end
v       5.75    7.75
w       8.5     10.5
x       11.25   13.25
y       14      16      start

You could then, of course, do whatever kind of number rounding or formatting that you needed. :)

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!