Spark ML Pipeline Causes java.lang.Exception: failed to compile … Code … grows beyond 64 KB

不打扰是莪最后的温柔 posted on 2019-12-01 17:53:27

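It seems that Spark's lazy evaluation can run into a limit of 64 KB. As far as I can tell, this is the JVM's cap on the bytecode size of a single method: Spark defers work until an action runs and then generates Java code for the accumulated plan. In other words, it is being a little too lazy in this case, so the generated code grows past that limit.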

I was able to work around the same exception, which I was triggering with a join rather than with a VectorAssembler, by calling cache on one of my Datasets about halfway through my pipeline. I don't (yet) know exactly why this solved the issue, however.
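As a rough illustration of that workaround, here is a minimal PySpark-style sketch. The function name `join_with_midpoint_cache`, its parameters, and the `id` join key are all hypothetical and not taken from my actual pipeline; the only Spark calls used are `DataFrame.join`, `DataFrame.cache`, and `DataFrame.count`. The `count()` call is my own addition to force the cached data to be materialized. My guess, and it is only a guess, is that the cached Dataset gives Spark a point at which to cut the plan, so less code ends up in one generated method.

```python
def join_with_midpoint_cache(left, right, extra, key="id"):
    """Join three DataFrames, caching the intermediate result halfway.

    All names here are illustrative assumptions. The arguments are expected
    to be Spark DataFrames (or anything exposing join/cache/count).
    """
    # First half of the pipeline: still lazy at this point.
    halfway = left.join(right, on=key)

    # Mark the intermediate result for caching. cache() is itself lazy,
    # so nothing is computed yet.
    halfway = halfway.cache()

    # Run an action so the cached data is actually materialized before
    # the rest of the pipeline is built on top of it.
    halfway.count()

    # Second half of the pipeline, now starting from the cached Dataset.
    return halfway.join(extra, on=key)
```

With a real SparkSession you would pass in three DataFrames that share the key column and then run an action such as `show()` on the result. Whether this avoids the exception in your case will depend on where the oversized generated code comes from.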
