I written an R package, and now a collaborator (recent CS grad who is new to R) is editing and refactoring the code. In the process, he is dividing up my
I cut up my functions under two conditions :
I do include the so-called helper functions in the file of the main function, but only as long as those helper functions are not used in any other function. Actually, I consider them nested within the main function. I do understand your argument for version control, but changing the helper function comes down to changing the performance of the main function, so I see no problem in keeping them in the same file.
Some helper functions might be used in different other functions, and then I save them in their own file. Often I do export those functions, as they might be of interest for the user. Compare this to eg lm
and the underlying lm.fit
, where advanced users could make decent use of lm.fit
for speeding up code etc.
I use the naming convention used in quite some packages (and derived from linux), by preceding every "hidden" function by a dot. So that makes
.helper.function <- function(x, ...){
... some code ...
}
main.function <- function(x, ...){
...some code, including .helper.function(y, ...)
}
I explicitly @export all functions that need exporting, never the helper functions. It's not always easy to judge whether a function might be of interest to an end user, but in most cases it's pretty clear.
To give an example : A few lines of code to cut off NA lines I consider a helper function. A more complex function to convert the dataset to the correct format I export and document.
YMMV