问题
While searching python client for Hadoop, I found two modules pydoop and hadoopy. It seems both are good enough to work with, but not sure which one has more advantages than the other to install one.
回答1:
The most comprehensive documentation of this I think is http://blog.cloudera.com/blog/2013/01/a-guide-to-python-frameworks-for-hadoop/
Recently, I really think that mrjob has come out ahead as a clear frontrunner. It has a very active mailing list and it seems to be relatively stable and up to date. It also has nice integration with Amazon EMR.
来源:https://stackoverflow.com/questions/21754728/pydoop-vs-hadoopy-hadoop-python-client