Question:

I am trying to extract the text between the first opening bracket and its matching closing bracket in a file.

Input

CREATE MULTISET TABLE ABCD.EFGH, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = Default ( ABCK_SK      INTEGER         NOT NULL, PRQ  VARCHAR(1024)           NOT NULL, RST   DECIMAL (12,4)          NOT NULL, LMN     CHAR(1)         NOT NULL, OPQ      DATE            NOT NULL, PQRS     DATE            NOT NULL, TUV       INTEGER         NOT NULL, WXY        INTEGER         NOT NULL )  UNIQUE PRIMARY INDEX ABCK_PI (ABCK_SK) ;

Expected Output

ABCK_SK      INTEGER         NOT NULL, PRQ  VARCHAR(1024)           NOT NULL, RST   DECIMAL (12,4)          NOT NULL, LMN     CHAR(1)         NOT NULL, OPQ      DATE            NOT NULL, PQRS     DATE            NOT NULL, TUV       INTEGER         NOT NULL, WXY        INTEGER         NOT NULL

I have written the following script to find the line and column numbers where the extracted text should start and end, but I am not able to actually print the output. Any suggestions would be greatly appreciated. Thanks.

    #!/bin/sh
    nawk 'BEGIN { startln=0; j=0; i=0; endln=0; startchr=0; endchr=0 }
    {
        i = 1
        while (i <= NF) {
            if ($i == "(" && startln == 0) { startchr = i; startln = NR }
            if ($i == ")") { j = j - 1 }
            if ($i == "(") { j = j + 1 }
            if (j == 0) { endchr = i; endln = NR; break }
            i = i + 1
        }
    }
    END { print "startln=" startln " startchr=" startchr " endln=" endln " endchr=" endchr }' $1

Answer 1:

A perl solution:

    perl -e '$/ = \1;
        while (<>) {
            if ( /\)/ ) { $c -= 1; exit unless $c }
            print if $c > 0;
            $c += /\(/
        }' input-file

Answer 2:

Usage:
awk -f foo.awk foo.txt

foo.awk

    BEGIN {
        ORS = ""
        RS = "[()]"
    }

    RT == "(" {
        s++
        if (s > 1) print $0 RT
    }

    RT == ")" {
        s--
        if (s == 0) {
            print $0 "\n"
            exit
        } else {
            print $0 RT
        }
    }
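
The script above depends on RT, which is a gawk extension, so it will not run under the nawk used in the question. A minimal sketch of the same depth-counting idea in POSIX awk, scanning one character at a time, might look like the following. The function name extract_parens and the variables depth and buf are hypothetical names chosen for illustration, and trimming the surrounding whitespace is an assumption based on the expected output shown in the question.

```shell
# Hypothetical helper: print the text inside the first balanced (...) group.
# Reads the files given as arguments, or standard input if there are none.
extract_parens() {
    awk '
    {
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "(") {
                # Keep nested "(" but drop the outermost one.
                if (depth > 0) buf = buf c
                depth++
            } else if (c == ")" && depth > 0) {
                depth--
                if (depth == 0) {
                    # Assumption: trim surrounding whitespace to match
                    # the expected output shown in the question.
                    sub(/^[ \t\n]+/, "", buf)
                    sub(/[ \t\n]+$/, "", buf)
                    print buf
                    exit
                }
                buf = buf c
            } else if (depth > 0) {
                buf = buf c
            }
        }
        # Preserve line breaks when the group spans several lines.
        if (depth > 0) buf = buf "\n"
    }' "$@"
}
```

With the function defined in the current shell, it could be called as extract_parens input-file. Unlike the asker's field-based loop, it does not depend on the brackets being separated from other text by whitespace, so VARCHAR(1024) and DECIMAL (12,4) are both counted.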

Answer 3:

Here's a nice way to extract the data contained in the first matching parentheses:

sed -n -e '1,/(/s/[^(]*/foo/' -e '/(/,$p' input-file | m4 -D 'foo=$* m4exit(0)'

The sed command replaces all text before the first open parenthesis with the text foo. The result is then piped to m4 with a macro named foo defined, which outputs its arguments ($* expands to all of them, separated by commas) and then exits, discarding the remaining input. m4 has pretty robust parsing of parentheses, so this should work for most cases. Note that it will fail if the enclosed text contains the string foo followed by a (, so choose some unique string other than foo.
