Pick a record from file based on last updated date for a particular event

不想你离开。 submitted on 2021-01-28 06:06:11

Question


I have a list of alarm data in which each alarm is identified by a unique key called eventId.

For each unique eventId I need to pick the latest operation performed on the alarm.

The latest action can be identified from the lastUpdated field of the alarm.

In the example below, I have data for two alarms, with eventId 1001 and 1002, and the list of operations performed on each of them.

Below is the code:

use strict;
use warnings;

no warnings 'uninitialized';

use Data::Dumper;
use feature 'say';

my %selected;
{
   local $/ = "";
   while ( my $row = <DATA> ) {
      
      next if ($row =~ /Total number of Alarm/);
      
      my %row =
         map { split / : /, $_, 2 }
            split /\n/, $row;

      my $id = $row{eventId};
      $selected{$id} = \%row
         if $row gt $selected{lastUpdated}
         || $row eq $selected{lastUpdated} && $row{Severity} eq 'CLEARED';
   }
}

say "EventID,Node,Severity,State,LastUpdate";

foreach my $key (keys %selected){
    my $e_id = $selected{$key};
    say "$e_id->{eventId},$e_id->{Node},$e_id->{Severity},$e_id->{State},$e_id->{lastUpdated}";
}

__DATA__
Severity : HIGH
Node : N01
eventId : 1001
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:30:32

Severity : HIGH
Node : N01
eventId : 1001
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:35:33

Severity : CLEARED
Node : N01
eventId : 1001
State : CLEARED_ACKNOWLEDGED
lastUpdated : 2020-09-30T12:43:11

Severity : CLEARED
Node : N01
eventId : 1001
State : CLEARED_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:35:33

Severity : MEDIUM
Node : N02
eventId : 1002
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:40:00

Severity : HIGH
Node : N02
eventId : 1002
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:45:00

Total number of Alarm(s): 2

Output from the above script:

EventID,Node,Severity,State,LastUpdate
1001,N01,CLEARED,CLEARED_UNACKNOWLEDGED,2020-09-30T12:35:33
1002,N02,HIGH,ACTIVE_UNACKNOWLEDGED,2020-09-30T12:45:00

Expected Output:

EventID,Node,Severity,State,LastUpdate
1001,N01,CLEARED,CLEARED_ACKNOWLEDGED,2020-09-30T12:43:11
1002,N02,HIGH,ACTIVE_UNACKNOWLEDGED,2020-09-30T12:45:00

The condition is this: if a particular eventId has a record with the latest date in the lastUpdated field and CLEARED severity, then that record needs to be considered/printed. If no CLEARED severity exists for that eventId, the record with the latest date in the lastUpdated field should be used. And if lastUpdated is the same for records with CLEARED, HIGH, MEDIUM or LOW severity, the record with CLEARED severity should be picked.

So, for event 1001 the latest date is 2020-09-30T12:43:11, but the script picks 2020-09-30T12:35:33, which is wrong. It picks that one because it is the last record listed for that event.

How can I pick each event's record based on lastUpdated while also satisfying the severity condition described above?


Answer 1:


I think you do not quite understand what you have in your various variables. For example, when you do

if $row gt $selected{lastUpdated}

You are basically comparing a whole paragraph of text with what you intended to be a timestamp, like this:

if "Severity : HIGH
Node : N01
eventId : 1001
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:30:32" gt "2020-09-30T12:45:00";

The $row variable comes from your readline statement in the while condition:

while ( my $row = <DATA> ) {

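It reads a whole paragraph of text from the file (because you are using paragraph mode, $/ = ""). A paragraph ends with two or more consecutive newlines or at end of file. Note also that %selected is keyed by eventId, so $selected{lastUpdated} is never actually set. The right-hand side of the comparison is really undef, which is treated as an empty string (silently, because of no warnings 'uninitialized'), so the condition is always true and the last record read for each eventId wins.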
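To make the distinction between the raw paragraph and its parsed fields concrete, here is a minimal sketch. It assumes a shortened, hypothetical two-record sample read through an in-memory filehandle instead of DATA; the names $text, $paragraph, %record and @records are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical two-record sample, read through an in-memory filehandle.
my $text = "eventId : 1001\nlastUpdated : 2020-09-30T12:30:32\n\n"
         . "eventId : 1002\nlastUpdated : 2020-09-30T12:40:00\n";

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = "";    # paragraph mode: each read returns one blank-line-separated block
    while ( my $paragraph = <$fh> ) {
        # Split the block into lines, then each line into a key and a value.
        my %record = map { split / : /, $_, 2 } split /\n/, $paragraph;
        push @records, \%record;
    }
}
close $fh;

# $paragraph held the raw text; the individual fields live in each %record.
print "$_->{eventId} $_->{lastUpdated}\n" for @records;
```

Any comparison on dates should therefore use the lastUpdated field of the parsed record, not the paragraph string.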

Your best option is to read the file into memory, and then do your comparisons. This sample code will show you how to organize your data into a suitable Perl data structure. I will leave the logic of finding out how to compare the values to you.

use strict;
use warnings;
use Data::Dumper;

my %log;
local $/ = "";    # paragraph mode

while (<>) {
    next if /^Total number/;
    my (%data) = split / : |\n/;
    push @{$log{$data{eventId}}}, \%data;
}

print Dumper \%log;

With your data, this gives me the output:

$VAR1 = {
          '1001' => [
                      {
                        'eventId' => '1001',
                        'lastUpdated' => '2020-09-30T12:30:32',
                        'State' => 'ACTIVE_UNACKNOWLEDGED',
                        'Node' => 'N01',
                        'Severity' => 'HIGH'
                      },
                      {
                        'Severity' => 'HIGH',
                        'Node' => 'N01',
                        'State' => 'ACTIVE_UNACKNOWLEDGED',
                        'eventId' => '1001',
                        'lastUpdated' => '2020-09-30T12:35:33'
                      },
                      {
                        'eventId' => '1001',
                        'lastUpdated' => '2020-09-30T12:43:11',
                        'Severity' => 'CLEARED',
                        'State' => 'CLEARED_ACKNOWLEDGED',
                        'Node' => 'N01'
                      },
                      {
                        'Node' => 'N01',
                        'State' => 'CLEARED_UNACKNOWLEDGED',
                        'Severity' => 'CLEARED',
                        'eventId' => '1001',
                        'lastUpdated' => '2020-09-30T12:35:33'
                      }
                    ],
          '1002' => [
                      {
                        'lastUpdated' => '2020-09-30T12:40:00',
                        'eventId' => '1002',
                        'Severity' => 'MEDIUM',
                        'State' => 'ACTIVE_UNACKNOWLEDGED',
                        'Node' => 'N02'
                      },
                      {
                        'lastUpdated' => '2020-09-30T12:45:00',
                        'eventId' => '1002',
                        'Node' => 'N02',
                        'State' => 'ACTIVE_UNACKNOWLEDGED',
                        'Severity' => 'HIGH'
                      }
                    ]
        };

This is a hash where each unique eventId has an array of records attached to it. With this structure, you can easily loop over the eventId values to find the highest/lowest lastUpdated or Severity. For example, you can sort the records of each eventId by lastUpdated using cmp (string comparison), which works because timestamps in this fixed ISO 8601 format sort chronologically when compared as strings:

for my $key (sort { $a <=> $b } keys %log) {
    print "eventId: $key\n";
    for my $record (sort { $a->{lastUpdated} cmp $b->{lastUpdated} } @{ $log{$key} }) {
                       # this will sort the hashes inside the array based on lastUpdated
        print "\t$record->{lastUpdated}";
        if ($record->{Severity} eq 'CLEARED') {
            print "\tCLEARED";
        }
        print "\n";
    }
}

For the data you provided, this will print

eventId: 1001
        2020-09-30T12:30:32
        2020-09-30T12:35:33
        2020-09-30T12:35:33     CLEARED
        2020-09-30T12:43:11     CLEARED
eventId: 1002
        2020-09-30T12:40:00
        2020-09-30T12:45:00

Since I am not clear on what your logic is exactly, I will leave those details to you.
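One possible way to fill in that logic is sketched below. It assumes the rule is "latest lastUpdated wins, and on equal timestamps a CLEARED record wins", which is only one reading of the question. The helpers rec and by_time_then_cleared are hypothetical names introduced for this example, and the records are the sample data from the question, grouped by eventId in the same shape as %log.

```perl
use strict;
use warnings;

# Hypothetical helper: build one record hash from positional values.
sub rec {
    my %r;
    @r{qw(Severity Node eventId State lastUpdated)} = @_;
    return \%r;
}

# Sample records from the question, grouped by eventId.
my %log = (
    1001 => [
        rec( 'HIGH',    'N01', 1001, 'ACTIVE_UNACKNOWLEDGED',  '2020-09-30T12:30:32' ),
        rec( 'HIGH',    'N01', 1001, 'ACTIVE_UNACKNOWLEDGED',  '2020-09-30T12:35:33' ),
        rec( 'CLEARED', 'N01', 1001, 'CLEARED_ACKNOWLEDGED',   '2020-09-30T12:43:11' ),
        rec( 'CLEARED', 'N01', 1001, 'CLEARED_UNACKNOWLEDGED', '2020-09-30T12:35:33' ),
    ],
    1002 => [
        rec( 'MEDIUM', 'N02', 1002, 'ACTIVE_UNACKNOWLEDGED', '2020-09-30T12:40:00' ),
        rec( 'HIGH',   'N02', 1002, 'ACTIVE_UNACKNOWLEDGED', '2020-09-30T12:45:00' ),
    ],
);

# Hypothetical comparator: order by lastUpdated; on a tie, CLEARED ranks higher.
sub by_time_then_cleared {
    my ( $x, $y ) = @_;
    my $x_cleared = $x->{Severity} eq 'CLEARED' ? 1 : 0;
    my $y_cleared = $y->{Severity} eq 'CLEARED' ? 1 : 0;
    return $x->{lastUpdated} cmp $y->{lastUpdated} || $x_cleared <=> $y_cleared;
}

# Keep the highest-ranked record of each eventId.
my %selected;
for my $id ( keys %log ) {
    my @sorted = sort { by_time_then_cleared( $a, $b ) } @{ $log{$id} };
    $selected{$id} = $sorted[-1];
}

print "EventID,Node,Severity,State,LastUpdate\n";
for my $id ( sort { $a <=> $b } keys %selected ) {
    print join( ',', @{ $selected{$id} }{qw(eventId Node Severity State lastUpdated)} ), "\n";
}
```

With the sample data this selects the 2020-09-30T12:43:11 CLEARED_ACKNOWLEDGED record for 1001 and the 2020-09-30T12:45:00 HIGH record for 1002. If two CLEARED records share the latest timestamp, the question does not say which should win, and this sketch does not try to decide.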




Answer 2:


My solution:

use strict;
use warnings;
use Data::Dumper;

local $/ = "\n\n";
my $events = {};

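while( <DATA> ) {
    # Parse "key : value" lines of the current paragraph into a hash.
    my $hr = {};
    while( /(\w+) *: *(.+)\n/g ) {
        $hr->{$1} = $2;
    }

    # Skip paragraphs without an eventId (e.g. the "Total number" footer).
    next unless defined $hr->{eventId};

    if( exists $events->{ $hr->{eventId} } ) {
        # Replace the stored record only if this one is strictly newer.
        # Note: on equal timestamps the first record read is kept; Severity is not considered.
        if( $events->{ $hr->{eventId} }->{lastUpdated} lt $hr->{lastUpdated} ) {
            $events->{ $hr->{eventId} } = $hr;
        }
    }
    elsif( $hr->{eventId} )  {
        # First record seen for this eventId.
        $events->{ $hr->{eventId} } = $hr;
    }

}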

print Dumper($events), "\n";
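For completeness, here is a sketch of how the same single-pass approach could also honour the CLEARED tie-break from the question. It is not part of the solution above. The helper names is_newer and pick_latest are hypothetical, the input is read from an in-memory filehandle rather than DATA, and event 1003 is invented sample data used only to show a tie.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $new should replace $old.
sub is_newer {
    my ( $new, $old ) = @_;
    return 1 if !defined $old;                               # first record for this eventId
    return 1 if $new->{lastUpdated} gt $old->{lastUpdated};  # strictly later timestamp
    return 1 if $new->{lastUpdated} eq $old->{lastUpdated}
             && $new->{Severity} eq 'CLEARED';               # tie: prefer CLEARED
    return 0;
}

# Hypothetical helper: read paragraphs from $fh, keep one record per eventId.
sub pick_latest {
    my ($fh) = @_;
    my %events;
    local $/ = "";    # paragraph mode
    while ( my $paragraph = <$fh> ) {
        # In list context the match returns all key/value captures as a flat list.
        my %rec = $paragraph =~ /^(\w+) *: *(.+)$/mg;
        next unless defined $rec{eventId};    # skips the "Total number" footer
        my $id = $rec{eventId};
        $events{$id} = \%rec if is_newer( \%rec, $events{$id} );
    }
    return \%events;
}

# Event 1001 uses records from the question; event 1003 is invented to show a tie.
my $sample = <<'END';
Severity : HIGH
eventId : 1001
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:35:33

Severity : CLEARED
eventId : 1001
State : CLEARED_ACKNOWLEDGED
lastUpdated : 2020-09-30T12:43:11

Severity : CLEARED
eventId : 1001
State : CLEARED_UNACKNOWLEDGED
lastUpdated : 2020-09-30T12:35:33

Severity : CLEARED
eventId : 1003
State : CLEARED_UNACKNOWLEDGED
lastUpdated : 2020-09-30T13:00:00

Severity : HIGH
eventId : 1003
State : ACTIVE_UNACKNOWLEDGED
lastUpdated : 2020-09-30T13:00:00

Total number of Alarm(s): 2
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $events = pick_latest($fh);
close $fh;

for my $id ( sort { $a <=> $b } keys %$events ) {
    print join( ',', @{ $events->{$id} }{qw(eventId Severity State lastUpdated)} ), "\n";
}
```

Because only one record per eventId is kept, this variant does not need the whole file in memory, at the cost of fixing the selection rule while reading.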


Source: https://stackoverflow.com/questions/64155295/pick-a-record-from-file-based-on-last-updated-date-for-a-particular-event
