I am trying include a python package (NLTK) with a Hadoop streaming job, but am not sure how to do this without including every file manually via the CLI argument, \"-file\"
I would zip up the package into a .tar.gz or a .zip and pass the entire tarball or archive in a -file option to your hadoop command. I've done this in the past with Perl but not Python.
That said, I would think this would still work for you if you use Python's zipimport at http://docs.python.org/library/zipimport.html, which allows you to import modules directly from a zip.