Query performance in PostgreSQL using 'similar to'

伪装坚强ぢ 2021-01-20 14:00

I need to retrieve certain rows from a table depending on certain values in a specific column, named columnX in the example:

select *
from tableName
where columnX similar to '%(A|B|C)%';
4 answers
  •  日久生厌
    2021-01-20 14:17

    This strikes me as a data modelling issue. You appear to be using a text field as a set, storing single-character codes to mark which values are present in the set.

    If so, I'd want to remodel this table to use one of the following approaches:

    • Standard relational normalization. Drop columnX and replace it with a side table that has a foreign key reference to tablename(id) and a charcode column containing one character from the old columnX per row:

          CREATE TABLE tablename_columnx_set (
              tablename_id integer NOT NULL REFERENCES tablename(id),
              charcode "char",
              PRIMARY KEY (tablename_id, charcode)
          );

      You can then fairly efficiently search for keys in columnX using normal SQL subqueries, joins, etc. If your application can't cope with that change, you could keep columnX and maintain the side table using triggers.

    • Convert columnX to a hstore of keys with a dummy value. You can then use hstore operators like columnX ?| ARRAY['A','B','C']. A GiST index on the hstore of columnX should provide fairly solid performance for those operations.

    • Split columnX into an array, as recommended by Quassnoi, if your table's change rate is low and you can pay the cost of the GIN index.

    • Convert columnX to an array of integers, use intarray and the intarray GiST index. Have a mapping table of codes to integers or convert in the application.

    Time permitting I'll follow up with demos of each. Making up the dummy data is a pain, so it'll depend on what else is going on.
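
    As a minimal sketch of the first two options, using the table and column names from the answer (the sample codes 'A', 'B', 'C' and the columnx_h column name are illustrative assumptions, not from the original question):

        -- Option 1: normalized side table
        CREATE TABLE tablename (
            id integer PRIMARY KEY,
            columnX text   -- legacy set-of-codes column, e.g. 'ABC'
        );

        CREATE TABLE tablename_columnx_set (
            tablename_id integer NOT NULL REFERENCES tablename(id),
            charcode "char",
            PRIMARY KEY (tablename_id, charcode)
        );

        -- Rows whose set contains any of the codes A, B, or C:
        SELECT t.*
        FROM tablename t
        WHERE EXISTS (
            SELECT 1
            FROM tablename_columnx_set s
            WHERE s.tablename_id = t.id
              AND s.charcode IN ('A', 'B', 'C')
        );

        -- Option 2: hstore of keys with dummy values, indexed with GiST
        CREATE EXTENSION IF NOT EXISTS hstore;
        ALTER TABLE tablename ADD COLUMN columnx_h hstore;
        CREATE INDEX ON tablename USING gist (columnx_h);

        -- Rows whose hstore contains any of the keys A, B, or C:
        SELECT * FROM tablename WHERE columnx_h ?| ARRAY['A','B','C'];

    Either shape lets the planner use an index for the membership test, which a leading-wildcard 'similar to' pattern cannot.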
