hortonworks-data-platform

Spark on YARN: too few vcores used

匆匆过客 submitted on 2019-11-27 03:33:49
Question: I'm using Spark in a YARN cluster (HDP 2.4) with the following settings:
1 master node: 64 GB RAM (50 GB usable), 24 cores (19 cores usable)
5 slave nodes: 64 GB RAM (50 GB usable) each, 24 cores (19 cores usable) each
YARN settings:
memory of all containers (of one host): 50 GB
minimum container size = 2 GB
maximum container size = 50 GB
vcores = 19
minimum #vcores/container = 1
maximum #vcores/container = 19
When I run my Spark application with the command spark-submit --num-executors 30 -
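To make the settings above concrete, here is a rough sketch of the arithmetic for how many executor containers could fit on one node. The executor size (8 GB, 3 cores) is a hypothetical value chosen for illustration, since the actual spark-submit arguments are cut off in the excerpt. The sketch assumes Spark's default memory overhead of max(384 MB, 10% of executor memory) and that YARN rounds each request up to a multiple of the minimum container size. It ignores the application master's container, and the vcore limit only matters if the scheduler is configured to account for vcores.

```python
import math

def container_mb(executor_mb, min_alloc_mb, overhead_fraction=0.10, min_overhead_mb=384):
    """Memory YARN would reserve for one executor, in MB (assumed rounding rule)."""
    overhead = max(min_overhead_mb, int(executor_mb * overhead_fraction))
    requested = executor_mb + overhead
    # Round the request up to the next multiple of the minimum container size.
    return math.ceil(requested / min_alloc_mb) * min_alloc_mb

def executors_per_node(node_mb, node_vcores, executor_mb, executor_cores, min_alloc_mb):
    """Executors that fit on one node, limited by both memory and vcores."""
    by_memory = node_mb // container_mb(executor_mb, min_alloc_mb)
    by_vcores = node_vcores // executor_cores
    return min(by_memory, by_vcores)

# Node values from the question: 50 GB usable, 19 vcores, 2 GB minimum container.
# Executor values are hypothetical: 8 GB of memory, 3 cores.
per_node = executors_per_node(50 * 1024, 19, 8 * 1024, 3, 2 * 1024)
print("executors per node:", per_node)
print("executors on 5 worker nodes:", per_node * 5)
```

With these assumed values, memory rather than vcores is the limiting factor on each node, so a request such as --num-executors 30 might not be fully satisfied.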

Requests hang when using Hiveserver2 Thrift Java client

醉酒当歌 submitted on 2019-11-27 02:49:32
Question: This is a follow-up to an earlier question where I asked what the HiveServer2 Thrift Java client API is. This question should be able to stand alone without that background if you don't need any more context. Unable to find any documentation on how to use the HiveServer2 Thrift API, I put this together. The best reference I could find was the Apache JDBC implementation.
TSocket transport = new TSocket("hive.example.com", 10002);
transport.setTimeout(999999999);
TBinaryProtocol protocol =

Sqoop import : composite primary key and textual primary key

夙愿已清 submitted on 2019-11-27 01:48:14
Question: Stack: Installed HDP-2.3.2.0-2950 using Ambari 2.1. The source DB schema is on SQL Server and contains several tables whose primary key is one of:
A varchar
Composite: two varchar columns, or one varchar + one int column, or two int columns
There is a large table with ? rows which has three columns in the PK: one int + two varchar columns.
As per the Sqoop documentation: Sqoop cannot currently split on multi-column indices. If your table has no index column, or has a multi-column key, then you must also manually choose a splitting column. The first question is: What is expected by
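As background for why a single splitting column is needed: Sqoop retrieves the low and high values of the splitting column and gives each map task an evenly sized slice of that range. The following is only an illustrative sketch of that idea for an integer column, not Sqoop's actual code; the function name split_ranges and the sample bounds are hypothetical.

```python
def split_ranges(lo, hi, num_splits):
    """Divide the inclusive integer range [lo, hi] into contiguous, near-equal ranges."""
    total = hi - lo + 1
    # Never create more splits than there are distinct values.
    num_splits = max(1, min(num_splits, total))
    base, extra = divmod(total, num_splits)
    ranges = []
    start = lo
    for i in range(num_splits):
        # The first `extra` ranges take one additional value each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

# Hypothetical example: key values 1..10 divided among 4 mappers.
for low, high in split_ranges(1, 10, 4):
    print(f"WHERE id >= {low} AND id <= {high}")
```

A composite key has no single low and high value to divide this way, and a textual column may not divide evenly, which is why a splitting column has to be chosen by hand in those cases.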

Sqoop import : composite primary key and textual primary key

末鹿安然 submitted on 2019-11-26 12:29:14
Question: Stack: Installed HDP-2.3.2.0-2950 using Ambari 2.1. The source DB schema is on SQL Server and contains several tables whose primary key is one of:
A varchar
Composite: two varchar columns, or one varchar + one int column, or two int columns
There is a large table with ? rows which has three columns in the PK: one int + two varchar columns.
As per the Sqoop documentation: Sqoop cannot currently split on multi-column indices. If your table has no index column, or has a multi-column key,

ERROR 1066: Unable to open iterator for alias in Pig, Generic solution

妖精的绣舞 submitted on 2019-11-25 22:20:58
Question: A very common error message in Apache Pig is: ERROR 1066: Unable to open iterator for alias. There are several questions where this error is mentioned, but none of them gives a generic approach for dealing with it. Hence this question: What should you do when you get an ERROR 1066: Unable to open iterator for alias?
Answer 1: The message "ERROR 1066: Unable to open iterator for alias myAlias" suggests that something is going wrong in the line where you use myAlias. However, usually you will see this