Datastax Cassandra PIG Running only one MAP

Submitted by 我只是一个虾纸丫 on 2019-12-11 07:25:17

Question


I am using Datastax Cassandra 3.1.4 with two nodes. I am running Pig with CqlStorage() against a table with 12 million rows, but only one map task runs for a simple Pig command.

I tried changing split_size in my Pig relation, but it didn't work.

Here is my sample query.

x = load 'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
y = limit x 500;
dump y;

I couldn't find the input.split.size property in my mapred-site.xml, so I am assuming the default split size is 64*1024.

I then tried set pig.splitCombination false;

Now it runs 513 maps regardless of the number of records. I tried the same thing from Hive.

I connected to Cassandra from Hive and ran a simple select-all query with where col1>value. This table has only 10 records, but the query still runs 513 maps.

Please help me on this

Thanks


Answer 1:


Try this setting:

set pig.splitCombination false;

By default, Pig combines what it considers small input splits into a single map task, which is why only one map runs.
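As a minimal sketch, the setting can be placed at the top of the script from the question, before the load statement. The keyspace, column family, and split_size value are the asker's own. Note that, per the question, turning combination off led to 513 map tasks on the asker's cluster, so this trades a single map for many rather than tuning the count.

```pig
-- Turn off Pig's combining of small input splits (the setting suggested above).
set pig.splitCombination false;

-- Same load as in the question; split_size is passed in the CqlStorage URL.
x = load 'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
y = limit x 500;
dump y;
```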



Source: https://stackoverflow.com/questions/22911149/datastax-cassandra-pig-running-only-one-map
