Apache Spark: dealing with Option/Some/None in RDDs
Question: I'm mapping over an HBase table, generating one RDD element per HBase row. However, sometimes the row has bad data (throwing a NullPointerException in the parsing code), in which case I just want to skip it. I have my initial mapper return an Option to indicate that it returns 0 or 1 elements, then filter for Some, then get the contained value:

    // myRDD is RDD[(ImmutableBytesWritable, Result)]
    val output = myRDD.
      map( tuple => getData(tuple._2) ).
      filter( {case Some(y) => true; case None => false} ).
      map( _.get )
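For reference, a minimal, self-contained sketch of the pattern being described. The real getData takes an HBase Result; the String-based version below is a hypothetical stand-in, used only to show how wrapping the parse in Try turns a failure (such as a NullPointerException on a null field) into None, and how the filter-for-Some / get steps then drop the bad rows:

    import scala.util.Try

    // Hypothetical stand-in for the question's getData: the real one parses
    // an HBase Result. Try converts any parse exception into None so the
    // bad row can simply be skipped downstream.
    def getData(raw: String): Option[Int] =
      Try(raw.trim.toInt).toOption

    // Same filter-for-Some then get pipeline, run on a plain Seq for illustration:
    val parsed = Seq("1", "oops", "3")
      .map(getData)          // Seq[Option[Int]]
      .filter(_.isDefined)   // keep only Some(...)
      .map(_.get)            // unwrap -> Seq(1, 3)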