Question
I'm trying to measure the maximum size of a variable I can broadcast using Spark broadcast.
I couldn't find any explanation of this.
Has anyone measured it? Does Spark have a configuration setting for broadcast size?
Answer 1:
The limit for broadcasting has now been increased to 8 GB. You can find the details here.
Answer 2:
It's currently ~2 GB. Anything you broadcast is converted into a Java byte array during serialization, and because Java arrays have a maximum size of Integer.MAX_VALUE elements, you hit this limit. There may currently be some effort to increase this limit: SPARK-6235
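
As a rough illustration of where the ~2 GB figure comes from, here is a minimal Python sketch of the arithmetic. It assumes one byte per array element; the helper names `byte_array_limit_gib` and `fits_in_one_byte_array` are hypothetical, and `pickle` is only a stand-in for whatever serializer Spark actually uses, so the measured size is an estimate rather than the exact broadcast size.

```python
import pickle

# Largest number of elements a Java array can hold: Integer.MAX_VALUE = 2^31 - 1.
JAVA_INT_MAX = 2**31 - 1


def byte_array_limit_gib() -> float:
    """Upper bound, in GiB, on the size of a single Java byte[]."""
    return JAVA_INT_MAX / 2**30


def fits_in_one_byte_array(obj) -> bool:
    """Hypothetical check: does the pickled object fit within that bound?

    Assumption: pickle output size approximates the serialized size;
    Spark's own serializer may produce a different number of bytes.
    """
    return len(pickle.dumps(obj)) <= JAVA_INT_MAX


if __name__ == "__main__":
    print(f"single byte[] limit: {byte_array_limit_gib():.2f} GiB")
    print("small list fits:", fits_in_one_byte_array(list(range(1000))))
```

Measuring the serialized size of a candidate value this way gives a quick sanity check before broadcasting it, though the real limit depends on the Spark version in use.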
Source: https://stackoverflow.com/questions/39226163/evaluate-the-max-size-for-a-spark-broadcast-variable