How to compare and substitute strings on different lines in Unix

小蘑菇 2020-12-10 23:41

I want to compare and substitute strings that appear on different lines of a file in Unix.

For example, I have a file with two words on each line:

          


        
3 Answers
  • 2020-12-11 00:04

    This is VERY clearly a case for a recursive descent solution:

    $ cat tst.awk
    function descend(node) {return (map[node] in map ? descend(map[node]) : map[node])}
    { map[$1] = $2 }
    END { for (key in map) print key, descend(key) }
    
    $ awk -f tst.awk file
    <a> <e>
    <b> <e>
    <c> <e>
    <d> <e>
    

    If infinite recursion is a possibility in your input, here's an approach that prints, as the 2nd field, the last node reached before the recursion starts, with a "*" next to it so you know it's happening:

    $ cat tst.awk
    function descend(node,  child, descendant) {
        stack[node]
        child = map[node]
        if (child in map) {
            if (child in stack) {
                descendant = node "*"
            }
            else {
                descendant = descend(child)
            }
        }
        else {
            descendant = child
        }
        delete stack[node]
        return descendant
    }
    { map[$1] = $2 }
    END { for (key in map) print key, descend(key) }
    

    For example, given this input:

    $ cat file
    <w> <w>
    <x> <y>
    <y> <z>
    <z> <x>
    <a> <b>
    <d> <e>
    <b> <c>
    <c> <e>
    
    $ awk -f tst.awk file
    <w> <w>*
    <x> <z>*
    <y> <x>*
    <z> <y>*
    <a> <e>
    <b> <e>
    <c> <e>
    <d> <e>
    

    If you need the output order to match the input order and/or need duplicate lines to be printed twice, change the bottom 2 lines of the script to:

    { keys[++numKeys] = $1; map[$1] = $2 }
    END {
        for (keyNr=1; keyNr<=numKeys; keyNr++) {
            key = keys[keyNr]
            print key, descend(key)
        }
    }
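
    Putting the pieces above together, here is a minimal self-contained sketch that combines the cycle-aware `descend` with the input-order loop. It assumes every line has exactly two whitespace-separated fields; the wrapper function name `resolve_chains` and the extra `<x>`/`<y>` sample lines are only illustrative.

```shell
#!/bin/sh
# Sketch: cycle-aware chain resolution that keeps the input order.
# Assumption: each input line has exactly two whitespace-separated fields.
resolve_chains() {
    awk '
    # Follow node -> map[node] until the target is no longer a key.
    # "stack" holds the nodes on the current path so a cycle can be detected.
    function descend(node,    child, descendant) {
        stack[node]
        child = map[node]
        if (child in map) {
            if (child in stack) {
                descendant = node "*"    # cycle: mark the last node before it repeats
            }
            else {
                descendant = descend(child)
            }
        }
        else {
            descendant = child
        }
        delete stack[node]
        return descendant
    }
    # Remember each first field in input order, and where it points.
    { keys[++numKeys] = $1; map[$1] = $2 }
    END {
        for (keyNr = 1; keyNr <= numKeys; keyNr++) {
            key = keys[keyNr]
            print key, descend(key)
        }
    }'
}

result=$(printf '%s\n' '<a> <b>' '<d> <e>' '<b> <c>' '<c> <e>' '<x> <y>' '<y> <x>' | resolve_chains)
printf '%s\n' "$result"
```

    Because the keys are replayed from the `keys` array rather than with `for (key in map)`, the output order does not depend on how a particular awk implementation iterates over arrays.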
    
  • 2020-12-11 00:08

    Perl to the rescue:

    #!/usr/bin/perl
    use warnings;
    use strict;
    
    my (@buff);
    sub output {
        my $last = pop @buff;
        print map "$_ $last\n", @buff;
        @buff = ();
    }
    
    while (<>) {
        my @F = split;
        output() if @buff and $F[0] ne $buff[-1]; # End of a group.
        push @buff, $F[0] unless @buff;           # Start a new group.
        push @buff, $F[1];
    }
    
    output();                                     # Don't forget to print the last buffer.
    

    Explanation: Read the input line by line. Keep a buffer of the words that will all be printed with the same second word. If the first word of a line differs from the second word of the previous line, print the buffered output and start a new group.
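
    For comparison, the same buffering idea can be sketched in awk. This is a port written for illustration, not the original Perl; the names `group_consecutive`, `emit_group`, and `buf` are made up here, and the sketch assumes two whitespace-separated fields per line.

```shell
#!/bin/sh
# Sketch of the buffering approach in awk (a port for comparison, not the original Perl).
group_consecutive() {
    awk '
    # Print every buffered word except the last, each paired with the last one.
    function emit_group(    i) {
        for (i = 1; i < n; i++)
            print buf[i], buf[n]
        n = 0
    }
    # ($1 "") forces a string comparison even if the fields look numeric.
    n && ($1 "") != buf[n] { emit_group() }   # chain broken: end of a group
    !n                     { buf[++n] = $1 }  # start a new group
                           { buf[++n] = $2 }
    END { if (n) emit_group() }               # flush the last group
    '
}

# Chain links on adjacent lines, then the same links scattered across the file.
adjacent=$(printf '%s\n' '<a> <b>' '<b> <c>' '<c> <e>' '<d> <e>' | group_consecutive)
scattered=$(printf '%s\n' '<a> <b>' '<d> <e>' '<b> <c>' '<c> <e>' | group_consecutive)
printf '%s\n\n%s\n' "$adjacent" "$scattered"
```

    The second call shows the trade-off of this single-pass design: a chain is followed only while its links sit on consecutive lines, so with the scattered input `<a>` stays paired with `<b>` instead of `<e>`.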

  • 2020-12-11 00:20
    awk '{i++;a[i]=$1;b[i]=$2;next}
          END{
                for(i=1;i in a;i++)
                {
                  f=1;
                  while (f==1)
                  {
                    f=0;
                    for(j=i+1;j in a;j++)
                    {
                      if(b[i]==a[j])
                      {
                        b[i]=b[j];
                        f=1;
                      }
                    }
                  }
                }
                for(i=1;i in a;i++)
                {
                  print a[i],b[i];
                }
              }' input.txt
    

    Input:

    <a> <b>
    <d> <e>
    <b> <c>
    <c> <e>
    

    Output:

    <a> <e>
    <d> <e>
    <b> <e>
    <c> <e>
    

    Input:

    <a> <b>
    <e> <z>
    <b> <e>
    

    Output:

    <a> <z>
    <e> <z>
    <b> <e>
    


    EDIT

    If you instead need this output from the second input:

    <a> <z>
    <e> <z>
    <b> <z>
    

    you can change this line:

    if(b[i]==a[j])
    

    to:

    if(j!=i&&b[i]==a[j])
    

    and this:

    for(j=i+1;j in a;j++)
    

    to:

    for(j=1;j in a;j++)
    

    Also note that this code assumes no line's second word matches the first word of a line whose first and second words are identical, e.g.:

    <a> <b>
    <e> <z>
    <b> <b>
    

    In that case the code never terminates, because every pass replaces <b> with <b> and sets the flag again.
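
    One possible guard, sketched below under stated assumptions, is to cap the number of substitutions per line: a chain without a loop can never need more substitutions than there are lines, so exceeding that count means a loop was hit. The function name `resolve_bounded`, the `steps` counter, and the "*" marker on affected lines are choices made for this sketch, not part of the code above. It uses the `j != i` and `j = 1` variant of the loops.

```shell
#!/bin/sh
# Sketch: the iterative approach with a substitution cap so it always terminates.
# Assumption: each input line has exactly two whitespace-separated fields.
resolve_bounded() {
    awk '
    { n++; a[n] = $1; b[n] = $2 }
    END {
        for (i = 1; i <= n; i++) {
            steps = 0                       # substitutions made for line i
            do {
                changed = 0
                for (j = 1; j <= n; j++) {
                    if (j != i && b[i] == a[j]) {
                        b[i] = b[j]
                        changed = 1
                        if (++steps > n)    # more hops than lines: must be a loop
                            break
                    }
                }
            } while (changed && steps <= n)
            # Mark lines whose resolution was cut off by the cap.
            print a[i], b[i] (steps > n ? "*" : "")
        }
    }'
}

looping=$(printf '%s\n' '<a> <b>' '<e> <z>' '<b> <b>' | resolve_bounded)
plain=$(printf '%s\n' '<a> <b>' '<e> <z>' '<b> <e>' | resolve_bounded)
printf '%s\n\n%s\n' "$looping" "$plain"
```

    With the problematic input the script now stops and flags `<a>`, whose chain runs into the self-referencing `<b> <b>` line, while ordinary input produces the same result as the edited version above.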
