What is different about these two pairs of strings that makes this sed script work with one and not the other?

一曲冷凌霜 submitted on 2019-12-11 10:58:54

Question


This question is related to this other question I asked earlier today: Find and replace text with all-inclusive wild card

I have a text file like this:

I want= to keep this
        This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1 WebServices and some more "text" that" should "</be> </deleted>
        <this is stuff in tags I want=to begone> and other text I want gone too. </this is stuff in tags I want to begone> 
       A novice programmer walked into a "BAR2" descript keepthis
        and this even more text, let's keep it
    <I actually want this>
    and this= too.

When I use sed -f script.sed file.txt to run this script:

# Check for "aff"
/\baff\b/    {   
# Define a label "a"
:a  
# If the line does not contain "desc"
/\bdesc\b/!{
# Get the next line of input and append
    # it to the pattern buffer
    N
    # Branch back to label "a"
    ba
}   
# Replace everything between aff and desc
s/\(\baff\)\b.*\b\(desc\b\)/\1TEST DATA\2/
}

I get this as my output:

       I want= to keep this
        This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1 WebServices and some more "text" that" should "</be> </deleted>
        <this is stuff in tags I want=to begone> and other text I want gone too. </this is stuff in tags I want to begone> 
       A novice programmer walked into a "BAR2" descript keepthis
        and this even more text, let's keep it
    <I actually want this>
    and this= too.

However, by simply changing the search strings from aff and desc to FOO1 and BAR2:

# Check for "FOO1"
/\bFOO1\b/    {   
# Define a label "a"
:a  
# If the line does not contain "BAR2"
/\bBAR2\b/!{
# Get the next line of input and append
    # it to the pattern buffer
    N
    # Branch back to label "a"
    ba
}   
# Replace everything between FOO1 and BAR2
s/\(\bFOO1\)\b.*\b\(BAR2\b\)/\1TEST DATA\2/
}

gives the expected output:

I want= to keep this
This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1TEST DATABAR2" descript keepthis
    and this even more text, let's keep it
<I actually want this>
and this= too.

I am completely stumped about what is going on here. Why should searching between FOO1 and BAR2 work differently from the exact same script with aff and desc?


Answer 1:


The end marker should be \bdesc instead of \bdesc\b.
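As a rough sketch of that change, the script from the question can be rerun with only the end marker relaxed. This assumes GNU sed (where \b is available); the temporary directory and file names are placeholders, and the sample's leading indentation is omitted for brevity.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed (\b is a GNU extension).
# File names are placeholders created in a temporary directory.
tmp=$(mktemp -d)

# Sample input from the question (leading indentation omitted).
cat > "$tmp/file.txt" <<'EOF'
I want= to keep this
This is some <text> I want to keep <and "something" in tags that I" want to keep> aff FOO1 WebServices and some more "text" that" should "</be> </deleted>
<this is stuff in tags I want=to begone> and other text I want gone too. </this is stuff in tags I want to begone>
A novice programmer walked into a "BAR2" descript keepthis
and this even more text, let's keep it
<I actually want this>
and this= too.
EOF

# Same script as in the question, with \bdesc\b relaxed to \bdesc.
cat > "$tmp/script.sed" <<'EOF'
/\baff\b/ {
:a
/\bdesc/!{
N
ba
}
s/\(\baff\)\b.*\b\(desc\)/\1TEST DATA\2/
}
EOF

result=$(sed -f "$tmp/script.sed" "$tmp/file.txt")
rm -rf "$tmp"
printf '%s\n' "$result"
```

With this input the second output line should end in `affTEST DATAdescript keepthis`, and the line containing "begone" should disappear.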

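Note the \b in the pattern: it matches a word boundary. Your text contains the word descript, but not desc as a word of its own, so \bdesc\b never matches and the loop keeps appending lines until the input runs out. BAR2, by contrast, is surrounded by quotes, so \bBAR2\b does match.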
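The boundary behaviour can be checked in isolation on the one line that holds both end markers. This is a small sketch assuming GNU sed; `matches` is a hypothetical helper defined only for this illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU sed, where \b matches a word boundary.

# Hypothetical helper: prints "yes" if the line ($2) matches the sed regex ($1).
matches() {
  if [ -n "$(printf '%s\n' "$2" | sed -n "/$1/p")" ]; then
    echo yes
  else
    echo no
  fi
}

line='A novice programmer walked into a "BAR2" descript keepthis'

desc_strict=$(matches '\bdesc\b' "$line")   # "desc" is followed by "r": no boundary there
desc_loose=$(matches '\bdesc' "$line")      # only the leading boundary is required
bar2_strict=$(matches '\bBAR2\b' "$line")   # the quotes on both sides are boundaries

printf '%s: %s\n' '\bdesc\b' "$desc_strict"
printf '%s: %s\n' '\bdesc' "$desc_loose"
printf '%s: %s\n' '\bBAR2\b' "$bar2_strict"
```

Under those assumptions the three results are no, yes, and yes, which matches the behaviour described in the question.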

Your previous question made me assume that you want whole-word matching. If you don't care about word boundaries, remove the \b escape sequences completely.
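One thing to weigh before dropping \b: without it, a marker can also match inside a longer word. The sketch below uses a made-up input string ("staff aff", not from the question) and assumes GNU sed for the \b variant.

```shell
#!/bin/sh
# Sketch only: "staff aff" is an invented input for illustration.

# Without \b, the first "aff" found is the one inside "staff".
without_b=$(printf 'staff aff\n' | sed 's/aff/X/')

# With \b (GNU sed), only the standalone word "aff" matches.
with_b=$(printf 'staff aff\n' | sed 's/\baff\b/X/')

printf '%s\n' "$without_b"
printf '%s\n' "$with_b"
```

Whether that matters depends on the data: in the sample file from the question, aff only occurs as a standalone word, so both forms would pick the same start marker there.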



Source: https://stackoverflow.com/questions/32174477/what-is-different-about-these-two-pairs-of-strings-that-makes-this-sed-script-wi
