TCL_REGEXP::How to grep words from a Tcl variable and put them into a text file, separated by commas?

Submitted by 家住魔仙堡 on 2019-12-04 21:09:34
Jerry

You can capture the URLs into a list and then join that list on commas. A simple approach would be:

set urls [list]
foreach {dummy item} [regexp -all -inline {Server Hello ServerName\s+(\S+)} $line] {
    lappend urls $item
}
set urls [join $urls ,]
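
The loop takes two variables at a time because, with -all -inline, regexp returns one flat list in which each match contributes the whole match followed by its capture group. A minimal sketch of that, using a made-up two-line sample (the host names are placeholders, not taken from the real log):

```tcl
# Hypothetical sample input; the host names are placeholders.
set line "x Server Hello ServerName alpha.example. Flow: 0x1
y Server Hello ServerName beta.example. Flow: 0x2"

# Flat list: whole match, capture, whole match, capture, ...
set matches [regexp -all -inline {Server Hello ServerName\s+(\S+)} $line]

set urls [list]
foreach {dummy item} $matches {
    # dummy is the whole match, item is the captured name
    lappend urls $item
}
set joined [join $urls ,]

puts [llength $matches]   ;# 4 (two matches, two elements each)
puts $joined              ;# alpha.example.,beta.example.
```

Note that \S+ also picks up the trailing dot that follows each name in the log, so you may want to trim it afterwards.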

If the URLs can contain commas, though, you might wrap each one in quotes and escape any quotes already inside it:

set urls [list]
foreach {dummy item} [regexp -all -inline {Server Hello ServerName\s+(\S+)} $line] {
    lappend urls \"[string map {{"} {\"}} $item]\"
}
set urls [join $urls ,]

The string map will escape any quotes with a backslash here.
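
As a quick check of what that quoting produces, here is a sketch on a made-up value containing both a comma and a quote. The second variant doubles the embedded quote rather than backslash-escaping it; many CSV readers expect that form, but which one is right depends on whatever consumes the file, so treat it as an assumption to verify.

```tcl
# Hypothetical value with a comma and an embedded double quote.
set item {a,b"c}

# Same quoting as above: escape " as \" and wrap the value in quotes.
set quoted \"[string map {{"} {\"}} $item]\"
puts $quoted    ;# "a,b\"c"

# Variant (assumption about the consumer): double the embedded quote.
set doubled \"[string map {{"} {""}} $item]\"
puts $doubled   ;# "a,b""c"
```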

Alternatively, you might use tabs instead of commas to avoid these issues:

set urls [list]
foreach {dummy item} [regexp -all -inline {Server Hello ServerName\s+(\S+)} $line] {
    lappend urls $item
}
set urls [join $urls \t]

EDIT: From chat, here's the full code, which handles all the other message types as well and uses a modified version of Donal's regexp:

set line { 
Jul 24 21:06:40 2014: %AUTH-6-INFO: login[1765]: user 'admin' on 'pts/1' logged
Jul 24 21:05:15 2014: %DATAPLANE-5-: Unrecognized HTTP URL www.58.net. Flow: 0x2
Jul 24 21:04:39 2014: %DATAPLANE-5-: Unrecognized HTTP URL static.58.com. Flow:
Jul 24 21:04:38 2014: %DATAPLANE-5-: Unrecognized HTTP URL www.google-analytics.
com. Flow: 0x2265394048.
Jul 24 21:04:36 2014: %DATAPLANE-5-: Unrecognized HTTP URL track.58.co.in. Flow: 0
Jul 24 21:04:38 2014: %DATAPLANE-5-:Unrecognized HTTP URL www.google.co.in. Flow: 0x87078800
Jul 24 21:04:38 2014: %DATAPLANE-5-:CCB:44:Unrecognized Client Hello ServerName www.google.co.in. Flow: 0x87073880. len_analyzed: 183
Jul 24 21:04:38 2014: %DATAPLANE-5-:CCB:44:Unrecognized Server Hello ServerName test1. Flow: 0x87073880, len_analyzed 99
Jul 24 21:04:38 2014: %DATAPLANE-5-:CCB:44:Unrecognized Server Cert CommonName *.google.com. Flow: 0x87073880
Jul 24 21:04:38 2014: %DATAPLANE-5-:CCB:44:Searching rname(TYPE_A) cs50.wac.edgecastcdn.net in dns_hash_table
Jul 24 21:04:38 2014: %DATAPLANE-5-:Unrecognized HTTP URL www.facebook.com. Flow: 0x87078800
Jul 24 21:04:38 2014: %DATAPLANE-5-:CCB:44:Unrecognized Client Hello ServerName www.fb.com. Flow: 0x87073880. len_analyzed: 183
Jul 24 21:05:38 2014: %DATAPLANE-5-:CCB:44:Unrecognized Server Hello ServerName test. Flow: 0x87073880, len_analyzed 99
Jul 24 21:04:38 2014: %DATAPLANE-5-:CCB:44:Unrecognized Server Cert CommonName *.facebook.com. Flow: 0x87073880
Jul 24 21:05:39 2014: %DATAPLANE-5-:CCB:44:Searching rname(TYPE_A) cs50.wac.facebook.net in dns_hash_table
}

set URL [list]
set chs [list]
set shs [list]
set scs [list]
set rname [list]
set cURL 0
set cchs 0
set cshs 0
set cscs 0
set crname 0
foreach {whole type payload} [regexp -all -inline {(?x)
    \y ( URL
      | (?: Client | Server)[ ]Hello[ ]ServerName
      | Server[ ]Cert[ ]CommonName
      | rname\([^)]+\) )
    \s+ ((?:(?![ ]Flow:| in[ ]dns_hash_table).)+)
} $line] {
    switch -regexp $type {
        URL {lappend URL $payload; incr cURL}
        {Client Hello ServerName} {lappend chs $payload; incr cchs}
        {Server Hello ServerName} {lappend shs $payload; incr cshs}
        {Server Cert CommonName} {lappend scs $payload; incr cscs}
        {rname\([^)]+\)} {lappend rname $payload; incr crname}
    }
}

# Sort numerically: the default ASCII sort would rank 9 above 10.
set max [lindex [lsort -integer -decreasing [list $cURL $cchs $cshs $cscs $crname]] 0]
set i 0
set all_list [list]

while {$max != $i} {
    if {[catch {regsub -all {\s} [lindex $URL $i] "" one}]} {set one ""}
    if {[catch {regsub -all {\s} [lindex $chs $i] "" two}]} {set two ""}
    if {[catch {regsub -all {\s} [lindex $shs $i] "" three}]} {set three ""}
    if {[catch {regsub -all {\s} [lindex $scs $i] "" four}]} {set four ""}
    if {[catch {regsub -all {\s} [lindex $rname $i] "" five}]} {set five ""}
    lappend all_list [join [list $one $two $three $four $five] ,]
    incr i
}
puts [join $all_list \n]
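
The question also asks for the result to end up in a text file, whereas the code above prints to stdout. A minimal sketch of that last step, assuming a hypothetical output file name urls.csv in the current directory and stand-in rows in place of the real all_list:

```tcl
# Stand-in for the all_list built above; these rows are illustrative only.
set all_list [list "a.example,b.example" "c.example,"]

# Hypothetical output path; adjust as needed.
set outfile urls.csv

# Open for writing (truncates an existing file), write one row per line, close.
set fh [open $outfile w]
puts $fh [join $all_list \n]
close $fh
```

Opening with a instead of w would append to an existing file rather than overwrite it.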

