Question
Given an input RDD of the form
1: 6 7
2: 5
how can I get another RDD of the form
1 6
1 7
2 5
and so on?
The following code fails with the message "unicode item does not have attribute flatMap":
def get_str(x, y):
    ..code to flatmap
    return op

text = sc.textFile(inputs)
res = text.map(lambda l: l.split(":")).map(lambda (x, y): get_str(x, y))
Answer 1:
I'm not really into Python, but it looks like you're trying to use flatMap inside your map, when what you need is to replace your map with flatMap. In Scala, I would do:
val text = sc.textFile(inputs)
val res = text.map(l => l.split("[\\s:]+"))
  .flatMap(list => list.drop(1).map(i => (list(0), i)))
Note that I split on both " " and ":" to get a list of values.
The same thing in Python:
def to_seq(s):
    k, vs = s.split(":")
    for v in vs.split():
        yield k, v

text = sc.parallelize(["1: 6 7", "2: 5"])
res = text.flatMap(to_seq)
res.take(3)
## [('1', '6'), ('1', '7'), ('2', '5')]
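If it helps to see why flatMap rather than map is the right call, here is a rough sketch in plain Python with no Spark involved. It only imitates the two operations on an ordinary list, so it is an illustration and not how Spark implements them. The helper name to_pairs is made up for this example, and the regex is borrowed from the Scala snippet above.

```python
import re
from itertools import chain

def to_pairs(line):
    # Hypothetical helper: split on runs of whitespace and/or ':'
    # (the same pattern as the Scala snippet), then pair the key with each value.
    key, *values = re.split(r"[\s:]+", line.strip())
    return [(key, v) for v in values]

lines = ["1: 6 7", "2: 5"]

# map-like: exactly one output element per input line,
# so each element is itself a list of pairs.
mapped = [to_pairs(l) for l in lines]

# flatMap-like: the per-line lists are concatenated into one flat sequence.
flat_mapped = list(chain.from_iterable(to_pairs(l) for l in lines))

print(mapped)
print(flat_mapped)
```

With map you get one nested list per input line, while the flatMap-style version yields one (key, value) pair per element, which is the shape asked for in the question.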
Source: https://stackoverflow.com/questions/33540559/flatmap-throws-error-unicode-item-does-not-have-attribute-flatmap