Unable to get parquet-tools working from the command-line

Posted by 不羁岁月 on 2019-12-04 09:05:57

On macOS with Homebrew, this is the easiest way to get started:

$ brew install parquet-tools

You can also bundle the Hadoop dependencies into the target jar:

mvn clean package -Plocal -DskipTests -Dhadoop.scope=compile

If you have Hadoop installed, run the jar through hadoop instead:

hadoop jar parquet-tools-1.7.0-incubating-SNAPSHOT.jar meta --debug part-r-00000.gz.parquet
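The choice between the two launchers can be sketched as a small shell helper. This is only an illustration: the function name `parquet_tools_cmd` is made up, and the `java -jar` fallback assumes the jar was built with the Hadoop dependencies bundled (for example with -Plocal or -Dhadoop.scope=compile).

```shell
# Hypothetical helper: print the launcher to use for a parquet-tools jar.
# Uses `hadoop jar` when a hadoop executable is on PATH; otherwise falls
# back to `java -jar`, which assumes the jar bundles the Hadoop classes.
parquet_tools_cmd() {
  jar="$1"
  if command -v hadoop >/dev/null 2>&1; then
    printf 'hadoop jar %s\n' "$jar"
  else
    printf 'java -jar %s\n' "$jar"
  fi
}

# Print the full command line instead of running it, so it can be checked first.
echo "$(parquet_tools_cmd parquet-tools-1.7.0-incubating-SNAPSHOT.jar) meta --debug part-r-00000.gz.parquet"
```

Printing the command rather than executing it keeps the sketch harmless on a machine where neither launcher is set up yet.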

This set of steps from the parquet-mr issues list fixed the same issue for me:

mvn install
cd parquet-tools
mvn clean package -Plocal
mvn install
mvn dependency:copy-dependencies
# replace 1.8.2 in the next step with the version you're using
cp target/parquet-tools-1.8.2-SNAPSHOT.jar target/dependency/
mkdir -p ~/local/bin/lib
cp target/dependency/* ~/local/bin/lib/
cp src/main/scripts/* ~/local/bin/
# single quotes keep $PATH literal so it expands at login, not now
echo 'export PATH="$PATH:$HOME/local/bin"' >> ~/.profile
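The last step appends a new line to the profile every time it is run. A minimal sketch of a guard against duplicates follows; the function name `add_line_once` is hypothetical and not part of parquet-tools or the original steps.

```shell
# Hypothetical helper: append a line to a file only if that exact line is
# not already present, so the install steps can be re-run safely.
add_line_once() {
  line="$1"
  file="$2"
  # -F: fixed string, -x: match the whole line, -q: no output.
  # A missing file makes grep fail, so the line is appended and the file created.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For the PATH entry it could be called as add_line_once 'export PATH="$PATH:$HOME/local/bin"' ~/.profile, with single quotes so the variables are expanded when the profile is read rather than when the line is written.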

I ran into a similar issue and fixed it by specifying the "local" profile:

mvn clean package -Plocal

I had originally missed this paragraph in the README. It explains that the "local" profile bundles the Hadoop dependencies into the jar, whereas the default build expects you to run the tool somewhere Hadoop is already installed and present on your classpath:

https://github.com/Parquet/parquet-mr/tree/master/parquet-tools
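One way to tell which kind of jar a build produced is to look for Hadoop classes in its listing. The sketch below assumes that bundled Hadoop classes appear as entries under org/apache/hadoop/ (not relocated by the build); the function name `bundles_hadoop` and the sample listing are made up for illustration.

```shell
# Hypothetical helper: read a jar listing (one entry per line) on stdin and
# succeed if any entry lives under org/apache/hadoop/.
bundles_hadoop() {
  grep -q '^org/apache/hadoop/'
}

# Example with a made-up listing; a real one would come from `jar tf <jar>`.
if printf '%s\n' 'META-INF/MANIFEST.MF' 'org/apache/hadoop/fs/Path.class' | bundles_hadoop; then
  echo "Hadoop classes are bundled"
else
  echo "Hadoop classes are expected on the classpath"
fi
```

Against a real build, the listing would be piped in, for example jar tf target/parquet-tools-1.8.2-SNAPSHOT.jar | bundles_hadoop.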
