I would like a pure R way to test whether two arbitrary files are different. So, the equivalent to diff -q in Unix, but should work on Windows and without exter
I realize this is not exactly what you're asking for, but I'm posting it for the benefit of others who land on this question wanting to see the full diff and who are willing to tolerate external dependencies. In that case, diffobj will show the differences with a real diff that works on Windows, using the same algorithm as GNU diff. In this example, we compare the Moby Dick text to a version of it with 5 lines modified:
library(diffobj)
diffFile(mob.1.txt, mob.2.txt)  # or `diffChr` if you already have the data in R
Produces:
If you want something faster that still gives you the locations of the differences, you can get the shortest edit script from the same package:
ses(readLines(mob.1.txt), readLines(mob.2.txt))
# [1] "1127c1127" "2435c2435" "6417c6417" "13919c13919"
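Each string in that output looks like the classic diff notation: a line (or line range) in the first file, an operation letter (`a` for add, `c` for change, `d` for delete), then the corresponding line or range in the second file. As a rough sketch, assuming the output keeps that shape, the strings can be split with base R. `parse_ses` is a hypothetical helper written for this illustration, not part of diffobj:

```r
# Hypothetical helper: split edit-script strings such as "1127c1127"
# into the operation letter and the first affected line of the first file.
# Assumes the "<lines><a|c|d><lines>" shape shown in the output above.
parse_ses <- function(script) {
  data.frame(
    op        = sub("^[0-9,]+([acd])[0-9,]+$", "\\1", script),
    from_line = as.integer(sub("^([0-9]+).*$", "\\1", script)),
    stringsAsFactors = FALSE
  )
}

# Strings copied from the output above
script <- c("1127c1127", "2435c2435", "6417c6417", "13919c13919")
res <- parse_ses(script)
print(res)
```

Here every entry is a `c` (change), which matches lines being replaced by their upper-case versions.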
Code to get the Moby Dick data (note that I didn't set a seed, so you'll get different lines):
moby.dick.url <- 'http://www.gutenberg.org/files/2701/2701-0.txt'
moby.dick.raw <- moby.dick.UC <- readLines(moby.dick.url)
to.UC <- sample(length(moby.dick.raw), 5)
moby.dick.UC[to.UC] <- toupper(moby.dick.UC[to.UC])
mob.1.txt <- tempfile()
mob.2.txt <- tempfile()
writeLines(moby.dick.raw, mob.1.txt)
writeLines(moby.dick.UC, mob.2.txt)
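For readers who only need the yes/no answer from the question and no external packages, here is a minimal base-R sketch. `files_differ` is a name made up for this illustration. It compares file sizes first and then the raw bytes, so it reads both files fully into memory and treats any byte difference (including line endings) as a difference:

```r
# Hypothetical helper: TRUE if the two files differ byte-for-byte.
# Uses base R only, so it should behave the same on Windows and Unix.
files_differ <- function(path1, path2) {
  stopifnot(file.exists(path1), file.exists(path2))
  size1 <- file.size(path1)
  # Different sizes imply different contents: cheap early exit
  if (size1 != file.size(path2)) return(TRUE)
  # Same size: read both files as raw bytes and compare
  bytes1 <- readBin(path1, what = "raw", n = size1)
  bytes2 <- readBin(path2, what = "raw", n = size1)
  !identical(bytes1, bytes2)
}

# Small usage example with temporary files
f.a <- tempfile(); f.b <- tempfile(); f.c <- tempfile()
writeLines(c("one", "two"), f.a)
writeLines(c("one", "two"), f.b)
writeLines(c("one", "TWO"), f.c)  # same size as f.a, different bytes

same.result <- files_differ(f.a, f.b)  # FALSE
diff.result <- files_differ(f.a, f.c)  # TRUE
print(c(same.result, diff.result))
```

For very large files, reading in chunks or comparing checksums might be preferable. This version favours simplicity.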