Split string into sentences using regex

前端未结

关注

 6  1942

挽巷 2020-11-28 10:54

I have random text stored in $sentences. Using regex, I want to split the text into sentences, see:

function splitSentences($text) {
    $re = \


      
      
        
          6条回答        

        
                    
            
            
                         
                
              
              
                
                   离开以前
                                             
                
                
                (楼主)
            
              
              
                2020-11-28 11:37
              

            
            
                        
If spaces are unreliable, than you could use match on a . followed by any number of spaces, followed by a capital letter.

You can match any capital UTF-8 letter using the Unicode character property \p{Lu}.

You only need to exclude abbreviations which tend to follow own names (person names, company names, etc), since they start with a capital letter.

function splitSentences($text) {
    $re = '/                # Split sentences ending with a dot
        .+?                 # Match everything before, until we find
        (
          $ |               # the end of the string, or
          \.                # a dot
          (?


Note: This answer might not be accurate enough for your situation. I'm unable to judge that. It does address the problem as described above and is easily understandable.
    
             
                                                        
            

            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它6个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          

                              			
        

        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复