R+Hadoop: How to read CSV file from HDFS and execute mapreduce?

Posted by 孤街浪徒 on 2019-12-05 07:35:58

mapreduce(input = path, input.format = make.input.format(...), map = ...)

from.dfs is meant for small data that fits in memory. In most cases you won't call from.dfs inside the map function, because the arguments passed to map already hold a portion of the input data.
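As a rough sketch of the pattern above, the following passes an HDFS path straight to mapreduce and lets the input format parse the CSV. The paths, the comma separator, and the column layout (first column a grouping key, second column numeric) are assumptions made for illustration, not details from the question.

```r
library(rmr2)

# Hypothetical HDFS paths, for illustration only.
csv.path <- "/user/example/input.csv"
out.path <- "/user/example/output"

# Describe the input as comma-separated text so that rmr2 parses each
# chunk of the file into a data frame before calling the map function.
csv.format <- make.input.format("csv", sep = ",")

result <- mapreduce(
  input = csv.path,
  input.format = csv.format,
  output = out.path,
  # v is a data frame holding one chunk of the CSV rows.
  # Assumed layout: column 1 is a grouping key, column 2 is numeric.
  map = function(k, v) keyval(v[[1]], v[[2]]),
  # Sum the values collected for each key.
  reduce = function(k, vv) keyval(k, sum(vv))
)

# Only when the result is small enough to fit in memory:
# small <- from.dfs(result)
```

Here the map function never calls from.dfs; it only works on the chunk it is given, and from.dfs is left for pulling a small final result back into the R session.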

somnathchakrabarti

You can do something like the following:

r.file <- hdfs.file(hdfsFilePath,"r")
from.dfs(
    mapreduce(
         input = as.matrix(hdfs.read.text.file(r.file)),
         input.format = "csv",
         map = ...
))

Please upvote if this helps; I hope someone finds it useful.

Note: For details, refer to the Stack Overflow post:

How to input HDFS file into R mapreduce for processing and get the result into HDFS file
