Explode (transpose?) multiple columns in Spark SQL table

深忆病人 2020-11-27 04:56

I am using Spark SQL (I mention Spark in case it affects the SQL syntax; I'm not familiar enough to be sure yet), and I have a table that I am trying to re-

2 Answers
  •  感情败类
    2020-11-27 05:10

    You could also try a typed Dataset with flatMap:

    // In spark-shell, `sc` and `spark` are predefined and this import is automatic;
    // a standalone application needs it for toDS and the flatMap encoder.
    import spark.implicits._
    
    // One input row: two parallel arrays of equal length.
    case class Input(
     userId: Integer,
     someString: String,
     varA: Array[Integer],
     varB: Array[Integer])
    
    // One output row per array position.
    case class Result(
     userId: Integer,
     someString: String,
     varA: Integer,
     varB: Integer)
    
    // Pair varA(i) with varB(i) and emit one Result per pair.
    // Note: zip stops at the shorter array if the lengths differ.
    def getResult(row: Input): Seq[Result] =
     row.varA.zip(row.varB).toSeq.map { case (a, b) =>
      Result(row.userId, row.someString, a, b)
     }
    
    val obj1 = Input(1, "string1", Array(0, 2, 5), Array(1, 2, 9))
    val obj2 = Input(2, "string2", Array(1, 3, 6), Array(2, 3, 10))
    val input_df = sc.parallelize(Seq(obj1, obj2)).toDS
    
    val res = input_df.flatMap { row => getResult(row) }
    res.show
    // +------+----------+----+----+
    // |userId|someString|varA|varB|
    // +------+----------+----+----+
    // |     1|   string1|   0|   1|
    // |     1|   string1|   2|   2|
    // |     1|   string1|   5|   9|
    // |     2|   string2|   1|   2|
    // |     2|   string2|   3|   3|
    // |     2|   string2|   6|  10|
    // +------+----------+----+----+
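
    If you would rather stay in the untyped DataFrame API, the same one-row-per-position result can probably be had with posexplode from org.apache.spark.sql.functions. The following is only a sketch: the object name ExplodeTogether, the helper explodeTogether and the local[1] session are made up for illustration, and it assumes varA and varB have the same length in every row. If varB is shorter, the index lookup yields null or raises an error, depending on the spark.sql.ansi.enabled setting.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, posexplode}

    object ExplodeTogether {
      // Hypothetical helper: expects columns userId, someString, varA, varB,
      // where varA and varB are arrays of equal length.
      def explodeTogether(df: DataFrame): DataFrame =
        df
          // posexplode emits one row per element of varA, with the element's
          // index in a column named `pos` and its value in a column named `col`.
          .select(col("userId"), col("someString"), col("varB"), posexplode(col("varA")))
          // Use the same index to pick the matching element of varB.
          .select(
            col("userId"),
            col("someString"),
            col("col").as("varA"),
            col("varB")(col("pos")).as("varB"))

      def main(args: Array[String]): Unit = {
        // Local session for illustration only; spark-shell already provides `spark`.
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("explode-together")
          .getOrCreate()
        import spark.implicits._

        val df = Seq(
          (1, "string1", Seq(0, 2, 5), Seq(1, 2, 9)),
          (2, "string2", Seq(1, 3, 6), Seq(2, 3, 10))
        ).toDF("userId", "someString", "varA", "varB")

        explodeTogether(df).show()
        spark.stop()
      }
    }
    ```

    Only one generator such as posexplode is allowed per select, which is why varB is looked up by index in a second select instead of being exploded alongside varA.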
    
