Matrix Transpose on RowMatrix in Spark

梦谈多话 2020-12-16 15:20

Suppose I have a RowMatrix.

  1. How can I transpose it? The API documentation does not seem to have a transpose method.
  2. The Matrix has the transpose() me
6 Answers
  •  盖世英雄少女心
    2020-12-16 16:05

    If anybody is interested, I've implemented the distributed version that @javadba proposed.

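      import org.apache.spark.mllib.linalg.{Vector, Vectors}
      import org.apache.spark.mllib.linalg.distributed.RowMatrix

      // Transposes a RowMatrix by emitting one (column, (row, value)) entry per element
      // and regrouping the entries by their original column index.
      // Row indexes come from zipWithIndex, so the result follows the current order of m.rows.
      def transposeRowMatrix(m: RowMatrix): RowMatrix = {
        val transposedRowsRDD = m.rows.zipWithIndex
          .flatMap { case (row, rowIndex) => rowToTransposedTriplet(row, rowIndex) } // triplets (newRowIndex, (newColIndex, value))
          .groupByKey
          .sortByKey().map(_._2) // sort rows and remove row indexes
          .map(buildRow) // restore order of elements in each row and remove column indexes
        new RowMatrix(transposedRowsRDD)
      }

      // Turns one row into entries keyed by column index (the row index in the transpose).
      def rowToTransposedTriplet(row: Vector, rowIndex: Long): Array[(Long, (Long, Double))] = {
        val indexedRow = row.toArray.zipWithIndex
        indexedRow.map { case (value, colIndex) => (colIndex.toLong, (rowIndex, value)) }
      }

      // Rebuilds a dense row, placing each value at its new column index.
      def buildRow(rowWithIndexes: Iterable[(Long, Double)]): Vector = {
        val resArr = new Array[Double](rowWithIndexes.size)
        rowWithIndexes.foreach { case (index, value) =>
          resArr(index.toInt) = value
        }
        Vectors.dense(resArr)
      }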
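
    The same pipeline (emit triplets, group by original column, sort, rebuild each row) can be traced on plain Scala collections without a Spark context. This is only a minimal sketch for checking the logic on a small matrix: `TransposeSketch`, `transposeLocal` and the sample data are hypothetical names introduced for illustration and are not part of the Spark API.

    ```scala
    object TransposeSketch {
      // Hypothetical local helper that mirrors the distributed steps on in-memory rows.
      def transposeLocal(rows: Seq[Array[Double]]): Seq[Array[Double]] = {
        rows.zipWithIndex
          // emit (colIndex, (rowIndex, value)) for every element
          .flatMap { case (row, rowIndex) =>
            row.toSeq.zipWithIndex.map { case (value, colIndex) => (colIndex, (rowIndex, value)) }
          }
          .groupBy(_._1) // gather all entries that share an original column
          .toSeq
          .sortBy(_._1) // the original column index becomes the new row order
          .map { case (_, entries) =>
            // place each value at its original row index, which is the new column index
            val out = new Array[Double](entries.size)
            entries.foreach { case (_, (rowIndex, value)) => out(rowIndex) = value }
            out
          }
      }

      def main(args: Array[String]): Unit = {
        val m = Seq(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 6.0))
        transposeLocal(m).foreach(r => println(r.mkString(" ")))
      }
    }
    ```

    For the 2 x 3 input above, the result has three rows: (1, 4), (2, 5) and (3, 6). Like `buildRow`, the sketch assumes every row has the same number of elements.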
    
