I'm having a problem using the Python library simplejson in Jython to write a Pig UDF. I need it because jython-standalone-2.5.2.jar doesn't come with a JSON library. I'm using Apache Pig version 0.11.0-cdh4.4.0 (rexported) compiled Sep 03 2013, 20:25:46, and according to the documentation: "You can import Python modules in your Python script. Pig resolves Python dependencies recursively, which means Pig will automatically ship all dependent Python modules to the backend. Python modules should be found in the jython search path: JYTHON_HOME, JYTHON_PATH, or current directory." So I downloaded the library and unzipped it in my working directory, and my script works in local mode (with -x local). In cluster mode, however, I get this error in the failed task logs of the task tracker:

I've tried several things, such as zipping simplejson, registering the zip, and trying to access it with sys.path.append(''). I've also tried:

export JYTHONPATH=$JYTHONPATH:$(pwd)/; pig script.pig

and also

pig -Dmapred.cache.files="" -Dmapred.create.symlink=yes


I don't know if my answer comes too late, but I managed to import simplejson in a UDF.

Here is how I did it:

I downloaded simplejson and put it into a lib folder next to my script, then in my UDF I did this:

import sys
import simplejson as json

I then managed to do a json.loads() without any problem on my cluster.
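
For reference, here is a minimal sketch of what a complete UDF file along these lines might look like. Several things in it are assumptions rather than part of my original setup: the './lib' path passed to sys.path.append, the function name get_field, its schema string, and the two fallback blocks, which exist only so the file can also be run outside Pig for a quick test.

```python
import sys

# Assumption: simplejson was unpacked into a "lib" folder next to this script.
# Adjust the path to wherever the package actually lives.
sys.path.append('./lib')

try:
    import simplejson as json
except ImportError:
    # Fallback for testing outside the cluster on Python 2.6+;
    # Jython 2.5.2 has no built-in json module, so there simplejson is required.
    import json

try:
    # Pig makes the outputSchema decorator available to Jython UDF scripts.
    outputSchema
except NameError:
    # Outside Pig the name does not exist, so define a no-op stand-in.
    def outputSchema(schema):
        def decorator(func):
            return func
        return decorator


@outputSchema('value:chararray')
def get_field(line, key):
    """Hypothetical UDF: return one top-level field of a JSON object, or None."""
    if line is None:
        return None
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed JSON: return null instead of failing the whole task.
        return None
    if not isinstance(record, dict):
        return None
    return record.get(key)
```

In a Pig script, a file like this (say udf.py, a hypothetical name) would be registered with something like REGISTER 'udf.py' USING jython AS myudf; and then called as myudf.get_field(line, 'name'). Returning None on bad input is a design choice: a single malformed record then yields a null rather than killing the job.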

