rhadoop

How to install the RHadoop packages (rmr, rhdfs, rhbase)?

Submitted by 时光怂恿深爱的人放手 on 2020-01-10 02:02:19
Question: I am trying my best to integrate Hadoop with R, but I get this error:

packages ‘rmr’, ‘rJava’, ‘RJSONIO’, ‘rhdfs’, ‘rhbase’, ‘plyrmr’ are not available (for R version 3.1.3)

Steps taken to integrate Hadoop with R: installed R and Hadoop on Ubuntu, then added these three lines to the ~/.bashrc file:

export HADOOP_PREFIX=/Users/hadoop/hadoop-1.1.2
export HADOOP_CMD=/Users/hadoop/hadoop-1.1.2/bin/hadoop
export HADOOP_STREAMING=/Users/hadoop/hadoop-1.1.2/contrib/streaming/hadoop-streaming-1.1.2.jar
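The "not available" message is expected for the RHadoop packages: rmr2, rhdfs, and rhbase were never published on CRAN, so install.packages() cannot find them there. They are distributed as source tarballs from the RevolutionAnalytics GitHub repositories. A minimal sketch of the installation, assuming illustrative file names, versions, and Hadoop paths (check the repositories for the builds that match your setup):

```r
# CRAN dependencies first (these ARE on CRAN)
install.packages(c("rJava", "RJSONIO", "itertools", "digest",
                   "Rcpp", "functional", "stringr", "plyr", "reshape2"))

# Hadoop must be visible to R before the RHadoop packages are loaded;
# these paths are assumptions -- adjust to your installation.
Sys.setenv(HADOOP_CMD = "/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING =
  "/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar")

# Install the tarballs downloaded from the RevolutionAnalytics repos;
# the version numbers here are illustrative.
install.packages("rmr2_3.3.1.tar.gz",  repos = NULL, type = "source")
install.packages("rhdfs_1.0.8.tar.gz", repos = NULL, type = "source")
```

Note the package is named rmr2, not rmr, in current releases; asking for "rmr" will always fail.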

Error in as(x, class(k)) : no method or default for coercing "NULL" to "data.frame"

Submitted by 半世苍凉 on 2020-01-03 04:33:10
Question: I am facing the error above, in which NULL values are being coerced to a data frame. The data set does contain nulls; I have tried both is.na() and is.null() to replace the null values with something else. The data is stored on HDFS in the pig.hive format. The code works fine if I remove v[,25] from the key. Code (cut off in the source):

AM = c("AN")
UK = c("PP")
sample.map <- function(k, v) {
  key <- data.frame(acc = v[!which …
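This error typically appears when a map or reduce function emits an empty or NULL key that rmr2 then tries to coerce to the key's declared class. A hypothetical sketch of a defensive map function: filter out rows whose key fields are missing before building the key, and return NULL when nothing survives (rmr2 treats a NULL return as "emit nothing"). The column indices and names are placeholders, not the asker's actual schema:

```r
library(rmr2)

sample.map <- function(k, v) {
  # keep only rows where the field used in the key is actually present
  ok <- !is.na(v[, 25]) & v[, 25] != ""
  v  <- v[ok, , drop = FALSE]
  if (nrow(v) == 0) return(NULL)   # NULL skips the record instead of
                                   # forcing a NULL -> data.frame coercion
  key <- data.frame(acc = v[, 1], flag = v[, 25])
  keyval(key, v)
}
```

The key point is that the guard runs before data.frame() is called, so the key constructor never sees an all-NULL selection.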

hadoop streaming failed with error code 5

Submitted by 安稳与你 on 2019-12-25 04:42:20
Question: An RHadoop program for word count (cut off in the source):

Sys.setenv(HADOOP_CMD="/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
library(rmr2)

## map function
map <- function(k, lines) {
  words.list <- strsplit(lines, '\\s')
  words <- unlist(words.list)
  return(keyval(words, 1))
}

## reduce function
reduce <- function(word, counts) {
  keyval(word, sum(counts))
}

wordcount <- function(input, output …
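"Error code 5" is the generic exit status Hadoop streaming returns when the job itself failed; the real cause is in the task stderr logs (commonly: rmr2 not installed for the R that the task nodes invoke, or wrong streaming-jar path). For comparison, a complete word-count sketch in the same style, with all paths being assumptions about a single-node setup:

```r
Sys.setenv(HADOOP_CMD = "/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING =
  "/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar")
library(rmr2)

map <- function(k, lines) {
  words <- unlist(strsplit(lines, "\\s+"))
  words <- words[words != ""]          # drop empty tokens from repeated spaces
  keyval(words, 1)
}

reduce <- function(word, counts) keyval(word, sum(counts))

wordcount <- function(input, output = NULL) {
  mapreduce(input = input, output = output,
            input.format = "text",
            map = map, reduce = reduce,
            combine = TRUE)            # reuse the reducer as a combiner
}

out <- wordcount("/user/hadoop/input")  # HDFS input path is an assumption
results <- from.dfs(out)                # pull the counts back into R
```

If this still fails, run the same script with rmr.options(backend = "local") first; a script that works locally but fails on Hadoop points at the cluster environment rather than the R code.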

R+Hadoop: How to read CSV file from HDFS and execute mapreduce?

Submitted by ↘锁芯ラ on 2019-12-22 05:03:12
Question: In the following example:

small.ints = to.dfs(1:1000)
mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2))

the input to the mapreduce function is an object named small.ints, which refers to blocks in HDFS. Now I have a CSV file already stored in HDFS as "hdfs://172.16.1.58:8020/tmp/test_short.csv". How do I get an object for it? And as far as I know (which may be wrong), if I want data from a CSV file as input for mapreduce, I first have to generate a table in R which contains all …
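You do not need to load the CSV into an R table first: mapreduce() accepts a plain HDFS path as its input, and the file's layout is described with make.input.format(). A sketch under the assumption that test_short.csv is comma-separated with numeric columns (column types are a guess, not something the question states):

```r
library(rmr2)

# Describe how the HDFS file should be parsed; "csv" is a built-in
# rmr2 format and sep is passed through to the underlying reader.
csv.format <- make.input.format("csv", sep = ",")

out <- mapreduce(
  input        = "/tmp/test_short.csv",  # the HDFS path itself is the "object"
  input.format = csv.format,
  map          = function(k, v) {
    # v arrives as a data frame of parsed CSV rows
    keyval(k, cbind(v, v[, 1]^2))        # assumes column 1 is numeric
  }
)
```

Note that from.dfs() on a raw CSV path will not parse it correctly either; from.dfs() expects rmr2's native serialization, which is another reason to let input.format handle the CSV.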

Streaming Command Failed! in RHADOOP

Submitted by 安稳与你 on 2019-12-19 18:48:21
Question: I have installed RHadoop in a Hortonworks VM. When I run mapreduce code to verify the installation, it throws the error Streaming Command Failed! I am running as user rstudio (not root, but it has sudo access). Can anybody help me understand the issue? I have little idea how to solve it.

Sys.setenv(HADOOP_HOME="/usr/hdp/2.2.0.0-2041/hadoop")
Sys.setenv(HADOOP_CMD="/usr/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-streaming.jar")
library …

Rhadoop - wordcount using rmr

Submitted by 元气小坏坏 on 2019-12-19 12:04:23
Question: I am trying to run a simple rmr job using the RHadoop packages, but it is not working. Here is my R script (cut off in the source):

print("Initializing variables.....")
Sys.setenv(HADOOP_HOME="/usr/hdp/2.2.4.2-2/hadoop")
Sys.setenv(HADOOP_CMD="/usr/hdp/2.2.4.2-2/hadoop/bin/hadoop")
print("Invoking functions.......")

# Reference taken from Revolution Analytics
wordcount = function(input, output = NULL, pattern = " ") {
  mapreduce(input = input, output = output,
            input.format = "text",
            map = wc.map, reduce = wc.reduce, combine …
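One thing that stands out in the script as quoted: it sets HADOOP_HOME and HADOOP_CMD but never HADOOP_STREAMING, and rmr2 needs the streaming jar's location to submit jobs. The variables must also be set before library(rmr2) runs. A minimal sketch of the corrected preamble, with the jar path being an assumption about this HDP 2.2 layout:

```r
Sys.setenv(HADOOP_CMD = "/usr/hdp/2.2.4.2-2/hadoop/bin/hadoop")

# rmr2 locates the streaming jar through this variable; if it is unset,
# job submission fails even though the rest of the script is fine.
Sys.setenv(HADOOP_STREAMING =
  "/usr/hdp/2.2.4.2-2/hadoop-mapreduce/hadoop-streaming.jar")

library(rmr2)   # load only after the environment variables are in place
```

Loading rmr2 first and exporting the variables afterwards is a common ordering bug, since the package inspects the environment at job time using whatever was set when the streaming command is built.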

Having issues with RHADOOP?

Submitted by 一世执手 on 2019-12-12 03:58:13
Question: I have checked the question "Rhadoop - wordcount using rmr" and tried the answer on my side, but it is giving a lot of issues. Here is the code (cut off in the source):

Sys.setenv("HADOOP_CMD"="/usr/local/hadoop/bin/hadoop")
Sys.setenv("HADOOP_STREAMING"="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.0.jar")

# load libraries
library(rmr2)
library(rhdfs)

# initiate the rhdfs package
hdfs.init()

map <- function(k, lines) {
  words.list <- strsplit(lines, '\\s')
  words <- unlist(words.list)
  return(keyval …

RHive not working with CDH4

Submitted by 落爺英雄遲暮 on 2019-12-11 06:46:05
Question: Has anyone made RHive work with CDH4? Is it compatible with CDH4? I have tried asking this on their Google group but no answers yet. I have installed R, RHadoop, and all related packages on CDH4, but I am stuck at RHive. With all environment variables pointing at CDH4, rhive.connect() gives me the following error:

WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
Error in .jfindClass(as.character(class)) : class not found

Any ideas/suggestions? Thanks
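The ".jfindClass … class not found" error comes from rJava, and usually means the Hadoop/Hive jars never made it onto the JVM classpath that RHive builds from its environment variables. A sketch of the connection preamble, with all paths being assumptions about a typical CDH4 package layout:

```r
# RHive reads these variables when it initializes; set them before loading.
Sys.setenv(HADOOP_HOME     = "/usr/lib/hadoop")   # CDH4 package locations
Sys.setenv(HIVE_HOME       = "/usr/lib/hive")     # are assumptions here
Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf")

library(RHive)
rhive.init()                       # builds the rJava classpath from the vars above
rhive.connect(host = "localhost")  # point at the machine running HiveServer
```

If the error persists, checking that the jar directories under HADOOP_HOME and HIVE_HOME actually contain the client jars (CDH4 split several of them into separate packages) is a reasonable next step.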

Failed to remotely execute R script which loads library “rhdfs”

Submitted by 南笙酒味 on 2019-12-04 05:11:01
Question: I'm working on a project using R and Hadoop, and ran into this problem. I'm using JSch in Java to ssh to a remote Hadoop pseudo-cluster; here is part of the Java code that creates the connection (cut off in the source):

/* Create a connection instance */
Connection conn = new Connection(hostname);

/* Now connect */
conn.connect();

/* Authenticate */
boolean isAuthenticated = conn.authenticateWithPassword(username, password);
if (isAuthenticated == false)
    throw new IOException("Authentication failed.");

/* Create a session */
Session …
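A likely culprit when a script loading rhdfs works interactively but fails over ssh: commands run through an ssh exec channel get a non-interactive shell that typically does not source ~/.bashrc, so HADOOP_CMD is unset when the remote R process starts. A defensive fix is to set the environment inside the R script itself rather than relying on shell startup files; the paths below are assumptions:

```r
# Set the Hadoop variables in the script so it works even when ~/.bashrc
# is not sourced (e.g. when launched over a JSch/ssh exec channel).
Sys.setenv(HADOOP_CMD = "/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING =
  "/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar")

library(rhdfs)   # load only after HADOOP_CMD is visible
hdfs.init()
print(hdfs.ls("/"))   # quick sanity check that the HDFS connection works
```

Alternatively, the remote command can be invoked as a login shell (e.g. `bash -lc 'Rscript script.R'`) so the usual profile files are sourced before R starts.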