Using Hadoop to run a jar file - Python
问题 I have an existing Python program that has a sequence of operations that goes something like this: Connect to MySQL DB and retrieve files into local FS. Run a program X that operates on these files. Something like: java -jar X.jar <folder_name> This will open every file in the folder and perform some operation on them and writes out an equal number of transformed files into another folder. Then, run a program Y that operates on these files as: java -jar Y.jar <folder_name> This creates