PySpark: Loading multiple partitioned files in a single load

Posted by 强颜欢笑 on 2019-12-02 10:57:22

As explained in the official documentation, to read multiple files, you should pass a list:

path – optional string or a list of string for file-system backed data sources.

So in your case:

# paths is a list of path strings; basePath is the root of the partitioned data
(sqlContext.read
    .format('orc')
    .options(basePath=basePath)
    .load(path=paths))
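
For illustration, here is a minimal sketch of how basePath and paths might be prepared before the call above. The directory layout (/data/events with a date=... partition directory per day) is a hypothetical example and is not taken from the question; only the shape matters: basePath is a single string, and paths is an ordinary Python list of strings.

```python
# Hypothetical layout, assumed for illustration only:
#   /data/events/date=2019-01-01/...orc files...
#   /data/events/date=2019-01-02/...orc files...
basePath = "/data/events"
dates = ["2019-01-01", "2019-01-02", "2019-01-03"]

# One path string per partition directory, collected into a plain list.
paths = ["{}/date={}".format(basePath, d) for d in dates]

print(paths)
```

Setting the basePath option tells Spark where partition discovery should start, so the partition column (date in this assumed layout) should still appear in the loaded DataFrame even though only some partition directories are listed.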

Argument unpacking (*) would make sense only if load were defined with variadic arguments, for example:

def load(self, *paths):
    ...
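
The difference can be seen without Spark at all. The sketch below uses two hypothetical stand-in functions (load_single_argument and load_variadic are made-up names, not part of PySpark). The first mimics a loader whose first parameter takes a string or a list and whose second parameter is something else, here called format. The second is defined with variadic arguments.

```python
def load_single_argument(path=None, format=None):
    # Hypothetical stand-in: the first parameter accepts a string or a list of strings.
    if isinstance(path, str):
        path = [path]
    return {"paths": list(path or []), "format": format}


def load_variadic(*paths):
    # Hypothetical stand-in: every positional argument is treated as a path.
    return {"paths": list(paths), "format": None}


paths = ["/data/a", "/data/b"]

# Passing the list itself: both paths reach the path parameter.
correct = load_single_argument(paths)

# Unpacking into a non-variadic function: the second path is bound to format.
misbound = load_single_argument(*paths)

# Unpacking only works as intended when the function is variadic.
variadic = load_variadic(*paths)

print(correct)
print(misbound)
print(variadic)
```

With a non-variadic signature, unpacking either binds the extra paths to unrelated parameters or raises a TypeError once there are more paths than parameters, which is why the list should be passed as a single argument.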