Question
I have a dataset with 100000 rows, where order_date is the order date and user_id is the user's ID. I am trying to create a new variable that gives each user's total number of orders on the same day. My data looks like this:
order_date=structure(c(15587, 15647, 15734, 15560, 15599, 15778, 15708,
15520, 15592, 15447, 15718, 15787, 15519, 15486, 15514, 15784,
15619, 15705, 15552, 15734, 15493, 15661, 15563, 15600, 15790,
15485, 15546, 15767, 15704, 15726), class = "Date")
user_id=c(22607, 28275, 32238, 20202, 4391, 7983, 29590, 11820, 22956,
3196, 31125, 11709, 6586, 2920, 9698, 36814, 6954, 30368, 19052,
827, 6599, 517, 8761, 20174, 37367, 11647, 18764, 27271, 30302,
14808)
daten = data.frame(order_date = order_date, user_id = user_id)
I am using this code:
daten<-join(daten, count(daten, c("order_date", "user_id")))
It creates a new variable called "freq" and it was working until today. Now it doesn't work and I am getting an error message like this:
Error in mutate_impl(.data, dots) :
  Column c("order_date", "user_id") must be length 100000 (the number of rows) or one, not 2
I checked the structure of both variables using str, and it says both have 100000 rows.
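Answer 1:
I'm not sure which join (e.g. inner_join) you intend to use, but one thing that is certainly wrong in your code is the call to count. The error comes from dplyr's count, which expects bare column names rather than a character vector.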
count(daten, c("order_date", "user_id"))
should be changed to:
count(daten, order_date, user_id)
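Putting that fix together with a join, a minimal sketch might look like the following. It assumes dplyr is the package in use and that a left join is what was intended; the toy data frame and the names toy, per_day, toy_joined and toy_added are made up for illustration, with one repeated pair so that the count is visible.

```r
library(dplyr)

# Hypothetical toy data: user 1 places two orders on the same day.
toy <- data.frame(
  order_date = as.Date(c("2013-01-01", "2013-01-01", "2013-01-02")),
  user_id    = c(1, 1, 2)
)

# Count rows per (order_date, user_id) pair; dplyr names the count column n.
per_day <- count(toy, order_date, user_id)

# Attach the per-day count to every original row.
toy_joined <- left_join(toy, per_day, by = c("order_date", "user_id"))

# add_count() performs the count and the join in one step.
toy_added <- add_count(toy, order_date, user_id)
```

Note that dplyr calls the new column n, not freq. The freq column and the join function in the question suggest the original code may have been written against plyr, whose count does accept a character vector; if that is the case, loading dplyr after plyr would mask count, and calling plyr::count and plyr::join with the explicit namespace is another possible way out.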
Answer 2:
I ran into the same error message when passing a vector of column-name strings as an argument to the group_by function. Building on the clarification by @MKR, here is the solution to my problem, which also seems to solve the problem in the original question:
daten %>%
group_by_at(vars(one_of(c("order_date", "user_id")))) %>%
summarise(n = n())
With the original sample data this doesn't make much sense (every order_date/user_id pair occurs only once), but in other cases it might be useful.
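For readers on a newer dplyr, the same grouping by a character vector can also be sketched with across() and all_of(). This assumes dplyr 1.0.0 or later; the toy data and the names toy, group_cols and counts are hypothetical and only there to make the counts visible.

```r
library(dplyr)

# Hypothetical toy data: user 1 places two orders on the same day.
toy <- data.frame(
  order_date = as.Date(c("2013-01-01", "2013-01-01", "2013-01-02")),
  user_id    = c(1, 1, 2)
)

# Grouping columns held as strings, as in the answer above.
group_cols <- c("order_date", "user_id")

# all_of() selects the columns named in the character vector;
# across() hands that selection to group_by().
counts <- toy %>%
  group_by(across(all_of(group_cols))) %>%
  summarise(n = n(), .groups = "drop")
```

Here counts has one row per distinct pair, with n equal to 2 for the repeated pair and 1 for the other.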
Source: https://stackoverflow.com/questions/48357401/error-in-mutate-impl-data-dots-using-join-code