How does \b work when using regular expressions?

前端未结

关注

 7  954

If I have a sentence and I wish to display a word or all words after a particular word has been matched ahead of it, for example I would like to display the word fox


                      
              相关标签:


      
      
        
          7条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  轮回少年        
                
              
                            
                2020-12-03 14:33
              
            
            
                                                                       
A word boundary is a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one.  It's equivalent to this:

(?<=\w)(?!\w)|(?=\w)(?<!\w)


...or it's supposed to be.  See this question for everything you ever wanted to know about word boundaries. ;)
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  误落风尘        
                
              
                            
                2020-12-03 14:36
              
            
            
                                                                       
\b is a zero width match of a word boundary.

(Either start of end of a word, where "word" is defined as \w+)

Note: "zero width" means if the \b is within a regex that matches, it does not add any characters to the text captured by that match. ie the regex \bfoo\b when matched will capture just "foo" - although the \b contributed to the way that foo was matched (ie as a whole word), it didn't contribute any characters. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  闹比i        
                
              
                            
                2020-12-03 14:38
              
            
            
                                                                       
\b is a zero width assertion. That means it does not match a character, it matches a position with one thing on the left side and another thing on the right side.
The word boundary \b matches on a change from a \w (a word character) to a \W a non word character, or from \W to \w
Which characters are included in \w depends on your language. At least there are all ASCII letters, all ASCII numbers and the underscore. If your regex engine supports unicode, it could be that there are all letters and numbers in \w that have the unicode property letter or number.
\W are all characters, that are  NOT in \w.
\bbrown\s

will match here
The quick brown fox
         ^^

but not here
The quick bbbbrown fox

because between b and brown is no word boundary, i.e. no change from a non word character to a word character, both characters are included in \w.
If your regex comes to a \b it goes on to the next char, thats the b from brown. Now the \b know's whats on the right side, a word char ==> the b. But now it needs to look back, to let the \b become TRUE, there needs to be a non word character before the b. If there is a space (thats not in \w) then the \b before the b is true. BUT if there is another b then its false and then \bbrown does not match "bbrown"
The regex brown would match both strings "quick brown" and "bbrown", where the regex \bbrown matches only "quick brown" AND NOT "bbrown"
For more details see here on www.regular-expressions.info
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  长发绾君心        
                
              
                            
                2020-12-03 14:40
              
            
            
                                                                       
\b guarantees that brown is on a word boundary effectively excluding patterns like

blackandbrown
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  臣服心动        
                
              
                            
                2020-12-03 14:46
              
            
            
                                                                       
\b is a "word boundary" and is the position between the start or end of a word and then "non-word" characters.

Its main use is to simplify the selection of a whole word to \bbrown\s will match:

^brown 
 brown 
99brown 
_brown 

Its more or less equivalent to "\W*" except when "capturing" strings as "\b" matches the start of the word rather than the non-word character preceding or following the word.  
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南笙        
                
              
                            
                2020-12-03 14:52
              
            
            
                                                                       
You don't need a look behind, you can simply use:

(\bbrown\s)(\w+)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复