How to remove duplicates from comma separated list by regexp_replace in Oracle?

前端未结

关注

 2  1745

长发绾君心 2020-12-11 14:07

I have

 POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE

I want

POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
<


      
      
        
          2条回答        

        
                    
            
            
                         
                
              
              
                
                   悲哀的现实
                                             
                
                
                (楼主)
            
              
              
                2020-12-11 15:11
              

            
            
                        
The solution offered below uses straight SQL (no PL/SQL). It works with any possible input string, and it removes duplicates in place - it keeps the order of input tokens, whatever that order is. It also removes consecutive commas (it "deletes nulls" from the input string) while treating null inputs correctly. Notice the output for an input string consisting of commas only, and the correct treatment of "tokens" consisting of two spaces and one space respectively.

The query runs relatively slowly; if performance is an issue, it can be re-written as a recursive query, using "traditional" substr and instr which are quite a bit faster than regular expressions.

with inputs (input_string) as (
       select 'POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE' from dual
       union all
       select null from dual
       union all
       select 'ab,ab,st,ab,st,  , ,  ,x,,,r' from dual
       union all
       select ',,,' from dual
     ),
     tokens (input_string, rk, token) as (
       select     input_string, level, 
                  regexp_substr(input_string, '([^,]+)', 1, level, null, 1)
       from       inputs 
       connect by level <= 1 + regexp_count(input_string, ',')
     ),
     distinct_tokens (input_string, rk, token) as (
       select     input_string, min(rk) as rk, token
       from       tokens
       group by   input_string, token
     )
select   input_string, listagg(token, ',') within group (order by rk) output_string
from     distinct_tokens
group by input_string
;


Results for the inputs I created:

INPUT_STRING                                                       OUTPUT_STRING
------------------------------------------------------------------ ----------------------------------------
,,,                                                                (null)
POW,POW,POWPRO,PRO,PRO,PROUTL,TNEUTL,TNEUTL,UTL,UTLTNE,UTL,UTLTNE  POW,POWPRO,PRO,PROUTL,TNEUTL,UTL,UTLTNE
ab,ab,st,ab,st,  , ,  ,x,,,r                                       ab,st,  , ,x,r
(null)                                                             (null)


4 rows selected.

    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它2个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复