Split string by space and character as delimiter in Oracle with regexp_substr


I'm trying to split a string with regexp_substr, but I can't make it work.

So, first, I have this query:

select regexp_substr('Helloworld - test!'


                      
5 answers

被撕碎了的回忆  2020-12-04 00:44
              
            
            
                                                                       
A slight improvement on MT0's answer: the row count is computed dynamically with regexp_count, and the example shows that this pattern handles NULL elements, whereas a pattern of the form [^delimiter]+ does NOT handle NULL list elements. More info on that here: Split comma separated values to columns

with tbl(str) as (
  select ' - Hello world - test-test! -  - test - ' from dual
)
SELECT LEVEL AS Occurrence,
       REGEXP_SUBSTR( str ,'(.*?)([[:space:]]-[[:space:]]|$)', 1, LEVEL, NULL, 1 ) AS split_value
FROM   tbl
CONNECT BY LEVEL <= regexp_count(str, '[[:space:]]-[[:space:]]')+1;

OCCURRENCE SPLIT_VALUE
---------- ----------------------------------------
         1
         2 Hello world
         3 test-test!
         4
         5 test
         6

6 rows selected.
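To see the NULL-handling difference in isolation, here is a minimal sketch that asks both patterns for the second element of a comma-separated list whose second element is empty. The sample string 'a,,c' and the column aliases are assumptions chosen for illustration; they are not from the original question.

```sql
-- Sketch: request the 2nd element of 'a,,c', where the 2nd element is empty.
select regexp_substr('a,,c', '[^,]+', 1, 2)               as negated_class,
       regexp_substr('a,,c', '(.*?)(,|$)', 1, 2, null, 1) as subexpression
from   dual;
```

The [^,]+ form skips the empty element and returns c as if it were the second element, while the sub-expression form returns NULL for it, which keeps the element positions intact.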

                                                                        
                                                        
            
            
              
                
情话喂你  2020-12-04 00:50
              
            
            
                                                                       
SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE TEST( str ) AS
          SELECT 'Hello world - test-test! - test' FROM DUAL
UNION ALL SELECT 'Hello world2 - test2 - test-test2' FROM DUAL;


Query 1:

SELECT Str,
       COLUMN_VALUE AS Occurrence,
       REGEXP_SUBSTR( str ,'(.*?)([[:space:]]-[[:space:]]|$)', 1, COLUMN_VALUE, NULL, 1 ) AS split_value
FROM   TEST,
       TABLE(
         CAST(
           MULTISET(
             SELECT LEVEL
             FROM   DUAL
             CONNECT BY LEVEL < REGEXP_COUNT( str ,'(.*?)([[:space:]]-[[:space:]]|$)' )
           )
           AS SYS.ODCINUMBERLIST
         )
       )


Results:

|                               STR | OCCURRENCE |  SPLIT_VALUE |
|-----------------------------------|------------|--------------|
|   Hello world - test-test! - test |          1 |  Hello world |
|   Hello world - test-test! - test |          2 |   test-test! |
|   Hello world - test-test! - test |          3 |         test |
| Hello world2 - test2 - test-test2 |          1 | Hello world2 |
| Hello world2 - test2 - test-test2 |          2 |        test2 |
| Hello world2 - test2 - test-test2 |          3 |   test-test2 |

                                                                        
                                                        
            
            
              
                
鱼传尺愫  2020-12-04 01:00
              
            
            
                                                                       
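If I understood correctly, this will help you. Currently you are getting Helloworld with a space at the end, and I assume you don't want that trailing space. If so, you can simply include the space in the delimiter as well:

select regexp_substr('Helloworld - test!' ,'[^ - ]+',1,1) from dual;

OUTPUT
Helloworld (no space at the end)


As you mentioned in your comment, if you want two output columns with Helloworld and test!, you can do the following. Note that the second call asks for occurrence 3, not 2, most likely because ' - ' inside the brackets is read as a character range from space to space, so only spaces are excluded and the lone '-' counts as the second match.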

select regexp_substr('Helloworld - test!' ,'[^ - ]+',1,1),
       regexp_substr('Helloworld - test!' ,'[^ - ]+',1,3) from dual;

OUTPUT
col1         col2
Helloworld   test!
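Because a bracket expression only ever excludes single characters, this approach stops working as soon as a value itself contains a space. A minimal sketch of an alternative, assuming the delimiter is always the literal three-character sequence ' - ': match everything up to the delimiter or the end of the string, and return only the first sub-expression. The sample string 'Hello world - test!' is an assumption for illustration.

```sql
-- Sketch: split on the literal ' - ' rather than on single characters.
-- The 6th argument (1) returns only the first parenthesised group.
select regexp_substr('Hello world - test!', '(.*?)( - |$)', 1, 1, null, 1) as col1,
       regexp_substr('Hello world - test!', '(.*?)( - |$)', 1, 2, null, 1) as col2
from   dual;
```

Here col1 is Hello world and col2 is test!, so the occurrence numbers line up with the logical elements.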

                                                                        
                                                        
            
            
              
                
自闭症患者  2020-12-04 01:01
              
            
            
                                                                       
CREATE OR REPLACE FUNCTION field(i_string            VARCHAR2
                                ,i_delimiter         VARCHAR2
                                ,i_occurance         NUMBER
                                ,i_return_number     NUMBER DEFAULT 0
                                ,i_replace_delimiter VARCHAR2) RETURN VARCHAR2 IS
  -----------------------------------------------------------------------
  -- Function Name.......: FIELD
  -- Author..............: Dan Simson
  -- Date................: 05/06/2016 
  -- Description.........: This function is similar to the one I used from 
  --                       long ago by Prime Computer.  You can easily
  --                       parse a delimited string.
  -- Example.............: 
  --  String.............: This is a cool function
  --  Delimiter..........: ' '
  --  Occurance..........: 2
  --  Return Number......: 3
  --  Replace Delimiter..: '/'
  --  Return Value.......: is/a/cool
  -----------------------------------------------------------------------
  v_return_string  VARCHAR2(32767);
  n_start          NUMBER := i_occurance;
  v_delimiter      VARCHAR2(1);
  n_return_number  NUMBER := i_return_number;
  n_max_delimiters NUMBER := regexp_count(i_string, i_delimiter);
BEGIN
  IF i_return_number > n_max_delimiters THEN
    n_return_number := n_max_delimiters + 1;
  END IF;
  FOR a IN 1 .. n_return_number LOOP
    v_return_string := v_return_string || v_delimiter || regexp_substr(i_string, '[^' || i_delimiter || ']+', 1, n_start);
    n_start         := n_start + 1;
    v_delimiter     := nvl(i_replace_delimiter, i_delimiter);
  END LOOP;
  RETURN(v_return_string);
END field;


SELECT field('This is a cool function',' ',2,3,'/') FROM dual;

SELECT regexp_substr('This is a cool function', '[^ ]+', 1, 1) Word1
      ,regexp_substr('This is a cool function', '[^ ]+', 1, 2) Word2
      ,regexp_substr('This is a cool function', '[^ ]+', 1, 3) Word3
      ,regexp_substr('This is a cool function', '[^ ]+', 1, 4) Word4
      ,regexp_substr('This is a cool function', '[^ ]+', 1, 5) Word5
  FROM dual;
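A short usage sketch from PL/SQL, assuming the field function above has been compiled in the current schema and that the client understands SET SERVEROUTPUT ON (as SQL*Plus does):

```sql
-- Sketch: call FIELD from an anonymous block and print the result.
-- Assumes the FIELD function above exists in the current schema.
SET SERVEROUTPUT ON

BEGIN
  -- Start at the 2nd space-delimited word, return 3 words, join them with '/'.
  dbms_output.put_line(field('This is a cool function', ' ', 2, 3, '/'));
END;
/
```

Per the function's own header comment this prints is/a/cool. Keep in mind that the function places i_delimiter inside a bracket expression, so it is only suited to single-character delimiters; a multi-character separator such as ' - ' would be treated as a set of individual characters.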

                                                                        
                                                        
            
            
              
                
忘了有多久  2020-12-04 01:09
              
            
            
                                                                       
Trying to negate the match string '[[:space:]]-[[:space:]]' by putting it inside a character class with a circumflex (^) will not work. Everything between a pair of square brackets is treated as a list of alternative single characters; named character classes are the exception, but they too simply expand to a list of alternative characters. Moreover, because of the way character classes nest, it's very likely that your outer brackets are being interpreted as follows:


[^[[:space:]] A single non space non left square bracket character
- followed by a single hyphen
[[:space:]] followed by a single space character
]+ followed by 1 or more closing square brackets.


It may be easier to convert your multi-character separator to a single character with regexp_replace, then use regexp_substr to find your individual pieces:

select regexp_substr(regexp_replace('Helloworld - test!'
                                   ,'[[:space:]]-[[:space:]]'
                                   ,chr(11))
                    ,'([^'||chr(11)||']*)('||chr(11)||'|$)'
                    ,1 -- Start here
                    ,2 -- return 1st, 2nd, 3rd, etc. match
                    ,null
                    ,1 -- return 1st sub exp
                    )
  from dual;


In this code I first changed ' - ' to chr(11). That's the ASCII vertical tab (VT) character, which is unlikely to appear in most text strings. The match expression of regexp_substr then matches all non-VT characters followed by either a VT character or the end of the line. Only the non-VT characters are returned (the first subexpression).
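The same replacement trick can return every piece as its own row. This is a minimal sketch that combines it with the CONNECT BY LEVEL technique from the other answers; the three-element sample string and the column names are assumptions for illustration.

```sql
-- Sketch: replace ' - ' with chr(11) once, then walk the pieces by LEVEL.
with tbl(str) as (
  select regexp_replace('Helloworld - test! - again'
                       ,'[[:space:]]-[[:space:]]'
                       ,chr(11))
  from   dual
)
select level as occurrence
      ,regexp_substr(str
                    ,'([^'||chr(11)||']*)('||chr(11)||'|$)'
                    ,1      -- start at the first character
                    ,level  -- return the 1st, 2nd, 3rd, ... match
                    ,null
                    ,1      -- return only the first sub-expression
                    ) as split_value
from   tbl
connect by level <= regexp_count(str, chr(11)) + 1;
```

The number of rows is the number of VT separators plus one, so the sample string yields three rows: Helloworld, test! and again.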
                                                                        
                                                        
            
            
              
                