Question
I need to create a qqplot using facets by row and column. I understand how to facet a plot by columns and rows, but I am not sure how to set up my data. Ultimately, I want to group my dataset by column and row, then sort the 'Modeled' results and the 'Observed' results in ascending order within each group, while adding a column with the 'row' group and a column with the 'column' group.
I have been trying to modify the solution to the question "Faceted qqplots with ggplot2", but I am not very familiar with lapply, so maybe I just missed something.
Here is the code I have been working with:
#Dummy Data:
df <- mtcars
# Name columns as I have in my real data
df$rows <- df$cyl
df$columns <- df$gear
df$Modeled <- df$wt
df$Observed <- df$mpg
# Function to sort data while maintaining the rows & columns for use in facet later.
dat_sort <- do.call("rbind",
  sapply(list(unique(df$rows), unique(df$columns)),
    FUN = function(x) {
      data.frame(rows = x[[1]],
                 columns = x[[2]],
                 Observed = sort(df$Observed[df$rows == x[[1]] & df$columns == x[[2]]]),
                 Modeled = sort(df$Modeled[df$rows == x[[1]] & df$columns == x[[2]]])
      )
    }
  ))
I don't get an error, but my output is definitely not what I was expecting. My output should look like this: (with correct column names)
rows columns Observed Modeled
   6       4     17.8   2.620
   6       4     19.2   2.875
   6       4     21.0   3.440
   6       4     21.0   3.440
   4       3     21.5   2.465
   8       5     15.0   3.170
   8       5     15.8   3.570
Output from code:
       [,1]   [,2]   [,3]   [,4]
[1,]  6.000  6.000  6.000  6.000
[2,]  4.000  4.000  4.000  4.000
[3,] 17.800 19.200 21.000 21.000
[4,]  2.620  2.875  3.440  3.440
[5,]  4.000  4.000  4.000  4.000
[6,]  3.000  3.000  3.000  3.000
[7,] 21.500 21.500 21.500 21.500
[8,]  2.465  2.465  2.465  2.465
Any help would be most appreciated!
Thanks!
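Answer 1:
We just need to order the dataset in ascending order, and that can be done with order:
df1 <- df[c('rows', 'columns', 'Observed', 'Modeled')]
# Pass the columns unnegated for ascending order; negating them (-df1) sorts in descending order
df2 <- df1[do.call(order, df1), ]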
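Ordering whole rows keeps each Observed value paired with its original Modeled value. The expected output in the question instead sorts the two columns independently within each group. A minimal sketch of one way to do that with base R's ave() follows; it assumes this independent sort is what is wanted, and it is not part of the original answer.

```r
# Rebuild the question's four columns from the built-in mtcars dataset.
df <- mtcars
df1 <- data.frame(rows = df$cyl, columns = df$gear,
                  Observed = df$mpg, Modeled = df$wt)

# Order by the grouping columns so that each group is contiguous.
df1 <- df1[order(df1$rows, df1$columns), ]

# ave() applies sort() within each rows/columns group and writes the
# sorted values back into that group's positions.
df1$Observed <- ave(df1$Observed, df1$rows, df1$columns, FUN = sort)
df1$Modeled  <- ave(df1$Modeled,  df1$rows, df1$columns, FUN = sort)
rownames(df1) <- NULL

head(df1)
```

After this step the Observed and Modeled values in a row are matched by rank within the group, not by the car they came from, which is the pairing a quantile-quantile comparison needs.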
In the OP's code, change sapply to lapply so that a list is returned (sapply defaults to simplify = TRUE, which tries to simplify the result to a vector or matrix):
do.call(rbind, lapply(list(unique(df$rows),
                           unique(df$columns)),
  FUN = function(x) {
    data.frame(rows = x[[1]],
               columns = x[[2]],
               Observed = sort(df$Observed[df$rows == x[[1]] &
                                           df$columns == x[[2]]]),
               Modeled = sort(df$Modeled[df$rows == x[[1]] &
                                         df$columns == x[[2]]])
    )
  }
))
# rows columns Observed Modeled
#1 6 4 17.8 2.620
#2 6 4 19.2 2.875
#3 6 4 21.0 3.440
#4 6 4 21.0 3.440
#5 4 3 21.5 2.465
Because the loop runs over a list of the unique values, each vector is a separate list element, so x[[1]] and x[[2]] pick the first two values of that vector: 6 and 4 for the first element, 4 and 3 for the second. That is why only two groups appear in the output.
list(unique(df$rows), unique(df$columns))
#[[1]]
#[1] 6 4 8
#[[2]]
#[1] 4 3 5
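To make this concrete, the short sketch below prints the pair of values each iteration actually uses and, for comparison, counts the rows/columns combinations present in the data. The names pairs_used and combos are illustrative only.

```r
df <- mtcars
df$rows <- df$cyl
df$columns <- df$gear

groups <- list(unique(df$rows), unique(df$columns))

# Each iteration receives a whole vector; only its first two entries are used.
pairs_used <- lapply(groups, function(x) c(rows = x[[1]], columns = x[[2]]))
print(pairs_used)

# Combinations of rows and columns that actually occur in the data.
combos <- unique(df[c("rows", "columns")])
print(nrow(combos))
```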
Instead, if we need to do this for corresponding elements, use Map or loop over the sequence of unique elements (assuming they have the same length); an easier approach is split.
If the unique values have the same length and we want to subset on corresponding values, use Map:
do.call(rbind, Map(function(x, y) {
    i1 <- df$rows == x & df$columns == y
    data.frame(rows = x, columns = y,
               Observed = sort(df$Observed[i1]),
               Modeled = sort(df$Modeled[i1]))
  },
  unique(df$rows), unique(df$columns)))
# rows columns Observed Modeled
#1 6 4 17.8 2.620
#2 6 4 19.2 2.875
#3 6 4 21.0 3.440
#4 6 4 21.0 3.440
#5 4 3 21.5 2.465
#6 8 5 15.0 3.170
#7 8 5 15.8 3.570
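The split approach mentioned above is not spelled out, so here is a minimal sketch of what it could look like. It covers every rows/columns combination that occurs in the data, not just the positionally matched pairs that Map visits. The name pieces and the final plotting step are assumptions added for illustration; the plot is only built if ggplot2 happens to be installed.

```r
# Recreate the question's dummy data from the built-in mtcars dataset.
df <- mtcars
df$rows <- df$cyl
df$columns <- df$gear
df$Modeled <- df$wt
df$Observed <- df$mpg

# Split on both grouping columns; drop = TRUE discards combinations that
# never occur (for example, 8 cylinders with 4 gears in mtcars).
pieces <- split(df, list(df$rows, df$columns), drop = TRUE)

# Sort Observed and Modeled independently within each piece.
dat_sort <- do.call(rbind, lapply(pieces, function(d) {
  data.frame(rows = d$rows[1],
             columns = d$columns[1],
             Observed = sort(d$Observed),
             Modeled = sort(d$Modeled))
}))
rownames(dat_sort) <- NULL

# Optional: faceted plot of the sorted pairs, one panel per group.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dat_sort, ggplot2::aes(x = Modeled, y = Observed)) +
    ggplot2::geom_point() +
    ggplot2::facet_grid(rows ~ columns, scales = "free")
  print(p)
}
```

Because Observed and Modeled have the same number of values in each group, pairing the sorted vectors row by row gives matching empirical quantiles, which is what each facet then displays.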
Answer 2:
Transpose the matrix returned by the original code:
t(dat_sort)
     [,1] [,2] [,3]  [,4] [,5] [,6] [,7]  [,8]
[1,]    6    4 17.8 2.620    4    3 21.5 2.465
[2,]    6    4 19.2 2.875    4    3 21.5 2.465
[3,]    6    4 21.0 3.440    4    3 21.5 2.465
[4,]    6    4 21.0 3.440    4    3 21.5 2.465
Source: https://stackoverflow.com/questions/61963801/r-faceted-qqplots-with-column-and-row