I'm new to Spark, and the documentation says Spark will load data into memory to make iterative algorithms faster.
But what if I have a 10 GB log file and less memory than that?
Spark will not try to load the full 10 GB at once, so you won't fail simply because the file is larger than your available memory. From my experience, one of two things will happen, depending on how you use your data:
If you are trying to cache the 10 GB: with the default `MEMORY_ONLY` storage level, the partitions that don't fit in memory simply aren't cached and are recomputed from the source each time they are needed. With `MEMORY_AND_DISK`, the partitions that don't fit are spilled to local disk instead of being dropped.

If you are just processing the data: Spark splits the file into partitions and streams through them, so at any moment it only needs enough memory for the partitions currently being processed by the running tasks, not for the whole file.
Of course, the details depend heavily on your code and the transformations you apply.
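To make the two cases concrete, here is a minimal Scala sketch. The file path `logs.txt` is a placeholder, and the `local[*]` master is just for illustration:

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.{SparkConf, SparkContext}

object LogDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("log-demo").setMaster("local[*]"))

    val lines = sc.textFile("logs.txt") // lazy: nothing is read yet

    // Case 1: caching. MEMORY_AND_DISK spills partitions that don't
    // fit in memory to local disk instead of dropping them.
    val cached = lines.persist(StorageLevel.MEMORY_AND_DISK)
    println(cached.count())

    // Case 2: plain processing. Partitions are streamed through the
    // running tasks, so the whole file never needs to fit in memory.
    val errors = lines.filter(_.contains("ERROR")).count()
    println(errors)

    sc.stop()
  }
}
```

Note that `persist` only pays off when you reuse the RDD across several actions; for a single pass, the plain `filter`/`count` pipeline is usually the better choice.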