Spark cartesian product
Question

I have to compare coordinates in order to get the distance between them. To do that, I load the data with sc.textFile() and build a cartesian product. The text file has about 2,000,000 lines, so 2,000,000 x 2,000,000 pairs of coordinates have to be compared. I tested the code with about 2,000 coordinates and it finished within seconds. With the big file, however, it seems to stop at a certain point and I don't know why. The code looks as follows:

def concat(x, y):
    if isinstance(y, list) and isinstance(x, list):
        return