Question
These are the steps I took:
1) Read in CSV file
rawdata <- read.csv('name of my file', stringsAsFactors=FALSE)
2) Cleaned my data by removing certain records based on x-criteria
data <- rawdata[!(rawdata$YOURID==""), all()]
data <- data[(data$thiscolumn=="right"), all()]
data <- data[(data$thatcolumn=="right"), all()]
3) Now I want to replace certain values throughout the whole matrix with a number (replace a string with a number value). I have tried the following commands and nothing works (I've tried gsub and replace):
gsub("Not the right string", 2, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
data <- replace(data, data$thiscolumn == "Not the right string" , 2)
gsub("\\Not the right string", "2", data$thiscolumn, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
I am new to R. I normally code in C++. The only other thing left for me to try is a for loop. I might only want R to look at certain columns when replacing values, but I'd prefer a search through the whole matrix. Either is fine.
These are the guidelines per R Help:
sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE, fixed = FALSE, useBytes = FALSE)
replace(x, list, values)
Arguments
x vector
list an index vector
values replacement values
Example: I want to replace the text "Extremely Relevant 5" or whatever x-text, with a corresponding number value.
Answer 1:
You can replace the for loop with logical indexing. First identify the indices of the values you want to replace, then assign the new value at those indices.
Here's a small example. Let's say we have this vector:
x <- c(1, 2, 99, 4, 2, 99)
# x
# [1] 1 2 99 4 2 99
Suppose we want to find every position where the value is 99 and replace it with 0. Applying x == 99 gives a vector of TRUE and FALSE values:
x == 99
# [1] FALSE FALSE TRUE FALSE FALSE TRUE
You can use this vector as an index to assign the new value where the condition is met.
x[x == 99] <- 0
# x
# [1] 1 2 0 4 2 0
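The same replacement can also be written with replace(), whose signature replace(x, list, values) is quoted in the question. This is a minimal sketch on the same vector; the name y is only illustrative. Unlike the indexed assignment, replace() returns a modified copy and leaves x untouched, so the result has to be assigned to something:

```r
x <- c(1, 2, 99, 4, 2, 99)

# replace() takes the vector, an index (here a logical vector), and the new value.
# It does not modify x in place; it returns a new vector.
y <- replace(x, x == 99, 0)

print(y)
print(x)  # x still contains the 99s
```

Functions such as gsub() behave the same way: calling them without assigning the result changes nothing in the original object.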
The same logical indexing works across a data frame or a matrix in one shot:
df <- data.frame(col1 = c(2, 99, 3), col2 = c(99, 4, 99))
# df:
# col1 col2
# 1 2 99
# 2 99 4
# 3 3 99
df[df==99] <- 0
# df
# col1 col2
# 1 2 0
# 2 0 4
# 3 3 0
For a data frame with strings, it can be trickier: a column may be a factor (the default for strings in data.frame() before R 4.0.0), and the value you're trying to assign may not be one of its levels. You can get around that by converting the columns to character and then applying the replacement.
> df <- data.frame(col1 = c(2, "this string", 3), col2 = c("this string", 4, "this string"))
> df
col1 col2
1 2 this string
2 this string 4
3 3 this string
> sapply(df, class)
col1 col2
"factor" "factor"
> df <- sapply(df, as.character)
> df
col1 col2
[1,] "2" "this string"
[2,] "this string" "4"
[3,] "3" "this string"
> df[df == "this string"] <- 0
> df <- as.data.frame(df)
> df
col1 col2
1 2 0
2 0 4
3 3 0
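If the columns are already character, as they would be after read.csv(..., stringsAsFactors=FALSE) in the question, the factor conversion is not needed. The sketch below is only an illustration: the data frame survey, its column names q1 and q2, and its values are made up, with "Extremely Relevant 5" borrowed from the question's example. It replaces the label everywhere and then converts the columns to numeric, which assumes every remaining value is a number stored as text.

```r
# Hypothetical survey data; names and values are assumptions for illustration.
survey <- data.frame(
  q1 = c("Extremely Relevant 5", "3", "Extremely Relevant 5"),
  q2 = c("1", "Extremely Relevant 5", "2"),
  stringsAsFactors = FALSE
)

# survey == "..." yields a logical matrix; use it to replace every match at once.
survey[survey == "Extremely Relevant 5"] <- "5"

# The columns are still character, so convert each one to numeric.
# survey[] <- keeps the data frame structure instead of returning a list.
survey[] <- lapply(survey, as.numeric)

str(survey)
```

Any cell that still holds non-numeric text would become NA (with a warning) in the as.numeric() step, so it is worth replacing all labels before converting.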
Answer 2:
After working on this a little more, I found a few solutions to my own question that I thought I'd share.
1) I loaded the stringr package with library(stringr) at the top of the script for string matching, although the == comparisons in the loops below are base R and do not depend on it.
2) I used a for loop to go down the entries of a specific column in my matrix and change them to the value indicated. See as follows:
# possible solution 5 - This totally works!
for (i in 1:nrow(data)){
  if (data$columnofinterest[i] == "String of Interest")
    data$columnofinterest[i] <- "Becca is da bomb dot com"
}
# possible solution 6 - This totally works!
for (i in 1:nrow(data)){
  if (data$columnofinterest[i] == "Becca is da bomb dot com")
    data$columnofinterest[i] <- 7
}
As you can see, replacing specific records works in both directions (text to a numerical value and vice versa). As the comments indicate, it took me until my fifth and sixth attempts to figure this much out. It still doesn't cover the whole matrix, but at least I can go through one column of interest at a time, which is still a lot faster.
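For comparison, the single-column loop can also be written as one vectorized assignment using logical indexing. This is a sketch under assumptions: the small data frame is invented, and only the column name columnofinterest and the string "String of Interest" come from the loops above; hits is an illustrative name.

```r
# Hypothetical stand-in for the cleaned data; contents are assumptions.
data <- data.frame(
  id = 1:3,
  columnofinterest = c("String of Interest", "something else", "String of Interest"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose entry matches the string.
hits <- data$columnofinterest == "String of Interest"

# Assign to all matching rows at once, with no explicit loop.
data$columnofinterest[hits] <- 7

print(data)
```

Because the column is character, the assigned 7 is stored as the string "7"; the column only becomes numeric after an explicit conversion such as as.numeric().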
Answer 3:
Here's a dplyr/tidyverse solution adapted from "changing multiple column values given a condition in dplyr". You can use mutate_all:
library(tidyverse)
data <- tibble(a = c("don't change", "change", "don't change"),
               b = c("change", "Change", "don't change"))
data %>%
  mutate_all(funs(if_else(. == "change", "xxx", .)))
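In more recent dplyr releases, funs() is deprecated and mutate_all() has been superseded by across(). A sketch of the same replacement in that style, assuming dplyr 1.0.0 or later is installed; the name result is only illustrative:

```r
library(dplyr)

data <- tibble(
  a = c("don't change", "change", "don't change"),
  b = c("change", "Change", "don't change")
)

# across(everything(), ...) applies the lambda to every column;
# .x stands for the current column.
result <- data %>%
  mutate(across(everything(), ~ if_else(.x == "change", "xxx", .x)))

print(result)
```

The comparison is case-sensitive, so "Change" in column b is left as it is. if_else() also requires both branches to have the same type, which is why the replacement here is a string and not a number.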
Source: https://stackoverflow.com/questions/51314482/how-do-i-replace-values-in-a-matrix-from-an-uploaded-csv-file-in-r