How Can I Obtain an Element Position in Spark's RDD?

时光说笑 2020-12-30 07:12

I am new to Apache Spark, and I know that its core data structure is the RDD. I am now writing some apps that require element positional information. For example, after convert

2 Answers

清酒与你 (OP)

2020-12-30 07:47
I believe that in most cases zipWithIndex() will do the trick, and it preserves the order. Read the comments again; my understanding is that they mean exactly that the order within the RDD is kept.
```
scala> val r1 = sc.parallelize(List("a", "b", "c", "d", "e", "f", "g"), 3)
scala> val r2 = r1.zipWithIndex
scala> r2.foreach(println)
(c,2)
(d,3)
(e,4)
(f,5)
(g,6)
(a,0)
(b,1)
```
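The example above confirms it. The RDD has 3 partitions, and a gets index 0, b gets index 1, and so on. The lines are printed out of order only because foreach processes the partitions independently; the indices themselves follow the element order.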
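To make the reasoning concrete, here is a minimal plain-Scala sketch of how the indices come out, with no Spark required. It relies on the documented behaviour of zipWithIndex: ordering goes by partition index first, then by position within each partition. The names `ZipWithIndexModel` and `zipWithIndexModel` are hypothetical, and the way the seven elements are split into partitions is an assumption made for illustration, not something taken from Spark itself.

```scala
object ZipWithIndexModel {
  // Model an RDD as an ordered sequence of partitions,
  // each partition being an ordered sequence of elements.
  def zipWithIndexModel[T](partitions: Seq[Seq[T]]): Seq[Seq[(T, Long)]] = {
    // Start offset of a partition = total size of all partitions before it.
    val starts = partitions.map(_.size.toLong).scanLeft(0L)(_ + _)
    partitions.zip(starts).map { case (part, start) =>
      // Within a partition, the index is the start offset plus the local position.
      part.zipWithIndex.map { case (elem, i) => (elem, start + i) }
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumed split of the seven elements into 3 partitions (illustrative only).
    val parts = Seq(Seq("a", "b"), Seq("c", "d"), Seq("e", "f", "g"))
    val indexed = zipWithIndexModel(parts)
    // Print the partitions in reverse to mimic out-of-order output:
    // the print order changes, but each element keeps its index.
    indexed.reverse.foreach(_.foreach(println))
  }
}
```

Because the index depends only on the partition order and the position inside the partition, the order in which partitions happen to be printed has no effect on the index each element carries.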