PySpark Split Columns

from pyspark.sql import Row, functions as F
row = Row("UK_1", "UK_2", "Date", "Cat", "Combined")
agg = ''
agg = 'Cat'
tdf = (sc.parallelize
    ([


                      
1 Answer

遇见更好的自我
2020-12-19 13:38

The pattern argument of split is a regular expression, and in a regex ^ is an anchor that matches the beginning of the string rather than a caret character. To split on a literal ^, escape it:

cols = F.split(tdf['Combined'], r'\^')
tdf = tdf.withColumn('column1', cols.getItem(0))
tdf = tdf.withColumn('column2', cols.getItem(1))
tdf.show(truncate = False)

+----+----+------------+---+-------------+-------+-------+
|UK_1|UK_2|Date        |Cat|Combined     |column1|column2|
+----+----+------------+---+-------------+-------+-------+
|1   |1   |12/10/2016  |A  |Water^World  |Water  |World  |
|1   |2   |null        |A  |Sea^Born     |Sea    |Born   |
|2   |1   |14/10/2016  |B  |Germ^Any     |Germ   |Any    |
|3   |3   |!~2016/2/276|B  |Fin^Land     |Fin    |Land   |
|null|1   |26/09/2016  |A  |South^Korea  |South  |Korea  |
|1   |1   |12/10/2016  |A  |North^America|North  |America|
|1   |2   |null        |A  |South^America|South  |America|
|2   |1   |14/10/2016  |B  |New^Zealand  |New    |Zealand|
|null|null|!~2016/2/276|B  |South^Africa |South  |Africa |
|null|1   |26/09/2016  |A  |Saudi^Arabia |Saudi  |Arabia |
+----+----+------------+---+-------------+-------+-------+
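
The difference between the anchor and the escaped caret can be checked without a Spark session. The sketch below uses Python's standard `re` module as a stand-in for the Java regex engine that Spark uses; that substitution is an assumption, though both engines treat `^` as a start-of-string anchor and `\^` as a literal caret. The variable names are illustrative only.

```python
import re

# Sample value taken from the Combined column above.
value = "Water^World"

# Unescaped, ^ is an anchor: it matches the empty position at the start
# of the string, not the caret character itself.
anchor_span = re.search(r"^", value).span()

# Escaped, \^ matches the literal caret between the two words.
literal_span = re.search(r"\^", value).span()
parts = re.split(r"\^", value)

# re.escape builds the escaped pattern when the delimiter is not known in advance.
escaped = re.escape("^")

print(anchor_span)   # (0, 0)
print(literal_span)  # (5, 6)
print(parts)         # ['Water', 'World']
print(escaped)       # \^
```

The same escaped pattern is what is passed to F.split in the answer above; a character class such as `[\^]` would be another way to write a literal caret.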

                                                                        
                                                        
            
            
              
                