Spark - How to write a single csv file WITHOUT folder?

北恋 2020-12-28 13:44
Suppose that df is a dataframe in Spark. The way to write df into a single CSV file is
df.coalesce(1).write.option("header", "tru

      
      
        
9 answers

萌比男神i (original poster) 2020-12-28 14:06
              

            
            
                        
There is no DataFrame API in Spark that writes a single file, rather than a directory, as the result of a write operation.

Both options below create a single data file inside a directory, alongside the standard marker files (_SUCCESS, _committed, _started).

 1. df.coalesce(1).write.mode("overwrite").option("header", "true").csv("PATH/FOLDER_NAME/x.csv")

 2. df.repartition(1).write.mode("overwrite").option("header", "true").csv("PATH/FOLDER_NAME/x.csv")

In both cases x.csv is the name of the output directory, not of the data file. The explicit format("com.databricks.spark.csv") call is dropped because csv(...) already selects the CSV format.


If you don't use coalesce(1) or repartition(1), and instead take advantage of Spark's parallelism when writing, Spark creates multiple data files inside the directory.

In that case you need a function in the driver that combines all the part files into a single file (for example, cat part-00000* > singlefilename) once the write operation is done.
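A minimal sketch of such a driver-side merge step, in plain Python. The helper name `merge_part_files` and its parameters are hypothetical, not part of any Spark API. It assumes the output directory is on a local or mounted filesystem the driver can read, and that every part file begins with the same one-line header when the header option was set.

```python
import glob
import os
import shutil


def merge_part_files(spark_output_dir, single_file_path, has_header=True):
    """Concatenate Spark part-* files into one CSV file (hypothetical helper).

    Assumption: spark_output_dir is readable from the driver as a normal
    directory, and each part file starts with an identical one-line header
    when has_header is True.
    """
    # Marker files such as _SUCCESS and hidden .crc files do not match part-*.
    part_paths = sorted(glob.glob(os.path.join(spark_output_dir, "part-*")))
    if not part_paths:
        raise FileNotFoundError(f"no part files found in {spark_output_dir}")

    with open(single_file_path, "wb") as out:
        for index, part_path in enumerate(part_paths):
            with open(part_path, "rb") as part:
                if has_header and index > 0:
                    part.readline()  # skip the repeated header row
                shutil.copyfileobj(part, out)
    return single_file_path
```

It would be called after the write has finished, for example `merge_part_files("PATH/FOLDER_NAME/x.csv", "PATH/single.csv")`, where the second path is an illustrative name. Unlike a plain cat, it keeps only the first header row. When coalesce(1) or repartition(1) was used there is only one part file, so the call amounts to a copy under the desired file name.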
    
             
                                                        
            
            
              
                