Awk date validation

Posted by ↘锁芯ラ on 2021-01-28 05:13:44

Question


I have an awk script where I need to validate a large number of lines containing dates.

I'm currently using either a regex-based solution for basic validation (without testing for leap years, for example) or calling the UNIX date command to validate more accurately. The date command works well, but calling a system command is pretty expensive in terms of performance.

I was hoping that someone here might be able to suggest a solution that is both accurate and fast.

Here's an example of my data:

20140804024614
20140803190020
20140803163320
20140803083222
20140803170321
20140803234044
20140804011857
20140803204008
20140803160026
20140803140120

Thanks.


Answer 1:


Given a whole lot of assumptions about your input file, this is probably all you need to print only the valid dates+times using GNU awk for time functions and gensub():

awk 'strftime("%Y%m%d%H%M%S",mktime(gensub(/(.{4})(..)(..)(..)(..)/,"\\1 \\2 \\3 \\4 \\5 ",1))) == $0' file

It will only work with dates since the epoch (1970-01-01 00:00:00 UTC).

If you need to print some kind of "valid/invalid" message for each date/time:

$ cat file
20140230035900
20140804024614
$
$ awk '{print (strftime("%Y%m%d%H%M%S",mktime(gensub(/(.{4})(..)(..)(..)(..)/,"\\1 \\2 \\3 \\4 \\5 ",1))) == $0 ? "" : "in") "valid:", $0}' file
invalid: 20140230035900
valid: 20140804024614

The above works by converting the date+time to seconds since the epoch, then converting those seconds back to a date+time in the original format. If the result is identical to the input, the original date was valid. An impossible date such as February 30 is normalized by mktime() to a different day, so the round trip no longer matches.
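
To make the round trip easier to follow, here is a sketch that prints each intermediate value. The function name roundtrip is made up for illustration, it assumes that awk is GNU awk (as in the answer above, since mktime() and strftime() are extensions), and it sets TZ=UTC on the assumption that you do not want local daylight-saving gaps to affect the comparison. It builds the mktime() argument with substr() rather than gensub(), which should be equivalent for 14-digit input:

```shell
# Hypothetical helper: show every step of the mktime()/strftime() round trip.
roundtrip() {
    printf '%s\n' "$1" | TZ=UTC awk '{
        # "YYYYMMDDHHMMSS" -> "YYYY MM DD HH MM SS", the layout mktime() expects
        spec = substr($0,1,4) " " substr($0,5,2) " " substr($0,7,2) " " \
               substr($0,9,2) " " substr($0,11,2) " " substr($0,13,2)
        secs = mktime(spec)                    # seconds since the epoch
        back = strftime("%Y%m%d%H%M%S", secs)  # back to the original layout
        verdict = (back == $0) ? "valid" : "invalid"
        print $0, "->", secs, "->", back, verdict
    }'
}

roundtrip 20140230035900
roundtrip 20140804024614
```

For 20140230035900 the value that comes back should be 20140302035900 (February 30 rolled forward to March 2), which is why the comparison fails and the line is reported as invalid.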




Answer 2:


Check this:

checkFormat ()
{
    dateV="${1}"

    echo "${dateV}" | gawk '{
        # Expect YYYYMMDDHH. gawk regexes have no (?:...) non-capturing groups,
        # so (19|20) is an ordinary group and takes slot a[2].
        if (match($0, /^((19|20)[0-9][0-9])(0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])([01][0-9]|2[0-3])$/, a)) {
            year = a[1] + 0
            mon  = a[3] + 0
            day  = a[4] + 0
            hour = a[5] + 0
        }
        else {
            print "KO: " $0
            exit
        }

        if (day == 31 && (mon == 4 || mon == 6 || mon == 9 || mon == 11))
            print "KO: " $0   # months with 30 days
        else if (day >= 30 && mon == 2)
            print "KO: " $0   # February never has 30 or 31 days
        else if (mon == 2 && day == 29 && !(year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)))
            print "KO: " $0   # February 29 outside a leap year
        else
            print "Correct date !:" $0
    }'
}


checkFormat 2014080417
checkFormat 20140803190035

Usage:

$ ./checker.sh 
Correct date !:2014080417
KO: 20140803190035

NOTE: MINUTES and SECONDS will be your task :)
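
One possible way to finish that task for the 14-digit YYYYMMDDHHMMSS stamps in the question is sketched below. It is not part of the answer above: the names validate_stamps and valid are hypothetical, it accepts any four-digit year instead of only 19xx/20xx, and it rejects a seconds value of 60 (no leap seconds). It uses only substr() and arithmetic, so it should run in any POSIX awk without time functions or external commands:

```shell
# Hypothetical helper: read YYYYMMDDHHMMSS lines on stdin and label each one.
validate_stamps() {
    awk '
    function valid(s,    y, m, d, H, M, S, dim) {
        # exactly 14 digits, nothing else
        if (s !~ /^[0-9]+$/ || length(s) != 14) return 0
        y = substr(s,1,4) + 0;  m = substr(s,5,2) + 0;   d = substr(s,7,2) + 0
        H = substr(s,9,2) + 0;  M = substr(s,11,2) + 0;  S = substr(s,13,2) + 0
        if (m < 1 || m > 12 || H > 23 || M > 59 || S > 59) return 0
        # number of days in month m of year y
        dim = 31
        if (m == 4 || m == 6 || m == 9 || m == 11) dim = 30
        else if (m == 2) dim = (y % 4 == 0 && (y % 100 != 0 || y % 400 == 0)) ? 29 : 28
        return (d >= 1 && d <= dim)
    }
    {
        tag = valid($0) ? "valid:" : "invalid:"
        print tag, $0
    }'
}

printf '%s\n' 20140230035900 20140804024614 | validate_stamps
```

Because everything happens inside a single awk process, this avoids the per-line cost of calling date that the question was worried about.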

Check also: http://nixtip.wordpress.com/2011/11/28/an-awk-date-format-validator/



Source: https://stackoverflow.com/questions/26761659/awk-date-validation
