Using grep to get the next WORD after a match in each line

前端未结

关注

 6  1333

I want to get the \"GET\" queries from my server logs.

For example, this is the server log

1.0.0.127.in-addr.arpa - - [10/Jun/2012


                      
              相关标签:


      
      
        
          6条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  野趣味        
                
              
                            
                2020-11-30 07:21
              
            
            
                                                                       
It's often easier to use a pipeline rather than a single complex regular expression. This works on the data you provided:

fgrep GET /tmp/foo | 
    egrep -o 'GET (.*) HTTP' |
    sed -r 's/^GET \/(.+) HTTP/\1/'


This pipeline returns the following results:

hello
ss


There are certainly other ways to get the job done, but this patently works on the provided corpus.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  攒了一身酷        
                
              
                            
                2020-11-30 07:23
              
            
            
                                                                       
In this case since the log file has a known structure, one option is to use cut to pull out the 7th column (fields are denoted by tabs by default).

grep GET log.txt | cut -f 7 

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  长情又很酷        
                
              
                            
                2020-11-30 07:29
              
            
            
                                                                       
I was trying to do this and came across this link: https://www.unix.com/shell-programming-and-scripting/153101-print-next-word-after-found-pattern.html

Summary:
use grep to find matching lines, then use awk to find the pattern and print the next field:

grep pattern logfile | \
  awk '{for(i=1; i<=NF; i++) if($i~/pattern/) print $(i+1)}'


If you want to know the unique occurrences:

grep pattern logfile | \
  awk '{for(i=1; i<=NF; i++) if($i~/pattern/) print $(i+1)}' | \
  sort | \
  uniq -c

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  忘掉有多难        
                
              
                            
                2020-11-30 07:31
              
            
            
                                                                       
Assuming you have gnu grep, you can use perl-style regex to do a positive lookbehind:

grep -oP '(?<=GET\s/)\w+' file


If you don't have gnu grep, then I'd advise just using sed:

sed -n '/^.*GET[[:space:]]\{1,\}\/\([-_[:alnum:]]\{1,\}\).*$/s//\1/p' file


If you happen to have gnu sed, that can be greatly simplified:

sed -n '/^.*GET\s\+\/\(\w\+\).*$/s//\1/p' file


The bottom line here is, you certainly don't need pipes to accomplish this.  grep or sed alone will suffice.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  终归单人心        
                
              
                            
                2020-11-30 07:38
              
            
            
                                                                       
gawk '{match($7,/\/(\w+)/,a);} length(a[1]){print a[1]}' log.txt
hello
ss


If you have gawk then above command will use match function to select the desired value using regex and storing it to an array a. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一整个雨季        
                
              
                            
                2020-11-30 07:40
              
            
            
                                                                       
use a pipe if you use grep:

grep -o /he.* log.txt | grep -o [^/].*
grep -o /ss log.txt | grep -o [^/].*


[^/] means extract the letters after ^ symbol from the grep output
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复