awk script: extract text between parentheses

Anonymous (unverified), submitted 2019-12-03 02:39:01

Question:

I am trying to extract the text between the first opening parenthesis and its matching closing parenthesis in a file.

Input

CREATE MULTISET TABLE ABCD.EFGH, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = Default ( ABCK_SK      INTEGER         NOT NULL, PRQ  VARCHAR(1024)           NOT NULL, RST   DECIMAL (12,4)          NOT NULL, LMN     CHAR(1)         NOT NULL, OPQ      DATE            NOT NULL, PQRS     DATE            NOT NULL, TUV       INTEGER         NOT NULL, WXY        INTEGER         NOT NULL )  UNIQUE PRIMARY INDEX ABCK_PI (ABCK_SK) ;

Expected Output

ABCK_SK      INTEGER         NOT NULL, PRQ  VARCHAR(1024)           NOT NULL, RST   DECIMAL (12,4)          NOT NULL, LMN     CHAR(1)         NOT NULL, OPQ      DATE            NOT NULL, PQRS     DATE            NOT NULL, TUV       INTEGER         NOT NULL, WXY        INTEGER         NOT NULL

I have written the following script to find the line and column numbers where the text to extract starts and ends, but I am not able to actually print the output. Any suggestions would be greatly appreciated. Thanks.

#!/bin/sh
nawk 'BEGIN { startln=0; j=0; i=0; endln=0; startchr=0; endchr=0 }
{
    i = 1
    while (i <= NF) {
        if ($i == "(" && startln == 0) { startchr = i; startln = NR }
        if ($i == ")") { j = j - 1 }
        if ($i == "(") { j = j + 1 }
        if (j == 0) { endchr = i; endln = NR; break }
        i = i + 1
    }
}
END { print "startln=" startln " startchr=" startchr " endln=" endln " endchr=" endchr }' $1

Answer 1:

A perl solution:

perl -e '$/ = \1;
    while (<>) {
        if ( /\)/ ) { $c -= 1; exit unless $c }
        print if $c > 0;
        $c += /\(/
    }' input-file


Answer 2:

Usage (the script relies on RT, which is a GNU awk extension, so awk here needs to be gawk):
awk -f foo.awk foo.txt

foo.awk

BEGIN {
    ORS = ""
    RS = "[()]"
}

RT == "(" {
    s++
    if (s > 1) print $0 RT
}

RT == ")" {
    s--
    if (s == 0) {
        print $0 "\n"
        exit
    } else {
        print $0 RT
    }
}
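Since the question uses nawk, where RT is not available, the same depth-counting idea can be sketched in portable awk by scanning each line one character at a time. This is only a sketch and not part of the original answer: the function name extract_parens is made up for illustration, and it assumes parentheses never appear inside quoted strings or comments.

```shell
# Hypothetical helper (not from the original answers): print the text inside
# the first balanced pair of parentheses, reading files or standard input.
extract_parens() {
    awk '
    {
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == ")" && depth > 0) {
                depth--
                # Matching close of the first "(": print and stop.
                if (depth == 0) { print out; exit }
            }
            if (depth > 0) out = out c    # collect only while inside the span
            if (c == "(") depth++
        }
        # Preserve line breaks when the span continues onto the next line.
        if (depth > 0) out = out "\n"
    }' "$@"
}
```

It would be called as extract_parens input-file. Whitespace just inside the outer parentheses is kept as-is, and nothing is printed if the first parenthesis is never closed.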


Answer 3:

Here's a nice way to extract the data contained in the first pair of matching parentheses:

sed -n -e '1,/(/s/[^(]*/foo/' -e '/(/,$p' input-file | m4 -D 'foo=$* m4exit(0)'

The sed command replaces all text before the first open parenthesis with the string foo. m4 is then run with a macro named foo defined that simply outputs its arguments ($*) and exits, discarding the remaining data. m4 parses nested parentheses quite robustly, so this should work for most cases. Note that this will fail if the enclosed text itself contains the string foo followed by a (; in that case, choose some unique string other than foo.


