I am a teacher, and would like to correctly use the data.table package in R to automatically grade student answers in a log file, i.e. add a column
You can use a join:
# initialize to zero
log[, correct := 0L ]
# update to 1 if matched
log[question_table, on=c(question_id = "id", student_answer = "correct_ans"),
    correct := 1L ]
    student question_id student_answer correct
 1:       b           1              2       1
 2:       c           1              4       1
 3:       b           1              1       0
 4:       b           2              3       0
 5:       c           2              2       1
 6:       b           2              4       1
 7:       c           3              4       1
 8:       b           3              5       0
 9:       a           4              2       0
10:       c           4              1       1
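The input tables are not shown here, so the following is a minimal reproducible sketch with made-up data. The table name answers (used instead of log, to avoid masking base R's log function) and all values in both tables are assumptions for illustration, not the original data.

```r
library(data.table)

# Hypothetical data, invented for illustration only
question_table <- data.table(id = 1:3, correct_ans = c(2L, 4L, 1L))
answers <- data.table(
  student        = c("a", "b", "a", "b", "a"),
  question_id    = c(1L, 1L, 2L, 2L, 3L),
  student_answer = c(2L, 3L, 4L, 4L, 5L)
)

# Initialize every row as incorrect
answers[, correct := 0L]

# Update join: rows of answers that match an (id, correct_ans) pair
# in question_table are set to 1; unmatched rows keep their 0
answers[question_table,
        on = c(question_id = "id", student_answer = "correct_ans"),
        correct := 1L]

print(answers)
```

Only the rows whose (question_id, student_answer) pair appears in question_table are updated, and the update happens by reference, so no reassignment of answers is needed.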
How it works. The syntax for an update join is X[Y, on=cols, xvar := z]:

- If the join columns have different names in X and Y, use on=c(xcol = "ycol", xcol2 = "ycol2") or, in version 1.9.7+, on=.(xcol = ycol, xcol2 = ycol2).
- xvar := z will only operate on the rows of X that are matched. Sometimes it is also useful to use by=.EACHI here, depending on how many rows of X are matched by each row in Y and how complicated the expression for z is.

See ?data.table for full documentation on the syntax.
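To illustrate where by=.EACHI can matter, here is a small sketch under assumed data. The tables answers and questions and the column names share_all and share_by_question are hypothetical, chosen only for this example. The expression z (here mean(correct)) is evaluated once over all matched rows without by=.EACHI, and once per row of Y with it.

```r
library(data.table)

# Hypothetical, already-graded data (invented for illustration)
answers <- data.table(
  student     = c("a", "b", "a", "b", "a"),
  question_id = c(1L, 1L, 2L, 2L, 3L),
  correct     = c(1L, 0L, 1L, 1L, 0L)
)
questions <- data.table(id = 1:3)

# Without by = .EACHI: mean(correct) is computed once over all matched rows
answers[questions, on = c(question_id = "id"),
        share_all := mean(correct)]

# With by = .EACHI: mean(correct) is computed separately for the rows of
# answers matched by each row of questions
answers[questions, on = c(question_id = "id"),
        share_by_question := mean(correct), by = .EACHI]

print(answers)
```

In the grading example above, z is the constant 1L, so by=.EACHI would not change the result; it becomes relevant when z depends on the group of matched rows.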