apache-spark-sql

Round double values and cast as integers

Submitted by 老子叫甜甜 on 2020-11-28 01:56:39
Question: I have a data frame in PySpark like below.

import pyspark.sql.functions as func

df = sqlContext.createDataFrame(
    [(0.0, 0.2, 3.45631),
     (0.4, 1.4, 2.82945),
     (0.5, 1.9, 7.76261),
     (0.6, 0.9, 2.76790),
     (1.2, 1.0, 9.87984)],
    ["col1", "col2", "col3"])

df.show()

+----+----+-------+
|col1|col2|   col3|
+----+----+-------+
| 0.0| 0.2|3.45631|
| 0.4| 1.4|2.82945|
| 0.5| 1.9|7.76261|
| 0.6| 0.9| 2.7679|
| 1.2| 1.0|9.87984|
+----+----+-------+

# round 'col3' in a new column:
df2 = df.withColumn("col4",
