cloudera-cdh

Class com.hadoop.compression.lzo.LzoCodec not found for Spark on CDH 5?

泪湿孤枕 submitted on 2019-11-26 20:41:11
Question: I have been working on this problem for two days and still have not found a solution. Problem: our Spark installation, set up via the newest CDH 5, always complains that the LzoCodec class is missing, even after I install HADOOP_LZO through Parcels in Cloudera Manager. We are running MR1 on CDH 5.0.0-1.cdh5.0.0.p0.47. Attempted fix: I also added the configuration from the official CDH documentation on 'Using the LZO Parcel', but the problem is still there. Most of the posts I found on Google give similar advice to the

Cannot Read a file from HDFS using Spark

▼魔方 西西 submitted on 2019-11-26 15:44:12
Question: I have installed Cloudera CDH 5 using Cloudera Manager. I can easily run
    hadoop fs -ls /input/war-and-peace.txt
    hadoop fs -cat /input/war-and-peace.txt
The second command prints the whole text file to the console. Now I start the Spark shell and enter
    val textFile = sc.textFile("hdfs://input/war-and-peace.txt")
    textFile.count
and I get an error:
    Spark context available as sc.
    scala> val textFile = sc.textFile("hdfs://input/war-and-peace.txt")
    2014-12-14 15:14:57,874 INFO [main] storage
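
The error log above is cut off, so the cause cannot be confirmed from this excerpt. One thing worth checking, though, is how the URI passed to sc.textFile is parsed. The sketch below uses only java.net.URI (the class name HdfsUriCheck and the helper describe are hypothetical, made up for illustration) to show that with two slashes, "input" is read as the authority (the host) rather than as the first directory of the path. The three-slash variant leaves the authority empty, which would only work under the assumption that the cluster's default filesystem is configured.

```java
import java.net.URI;

public class HdfsUriCheck {
    // Hypothetical helper: shows how a URI string splits into scheme, authority and path.
    static String describe(String raw) {
        URI uri = URI.create(raw);
        return "scheme=" + uri.getScheme()
             + " authority=" + uri.getAuthority()
             + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Two slashes: "input" is parsed as the authority, not as a directory.
        System.out.println(describe("hdfs://input/war-and-peace.txt"));
        // Three slashes: empty authority, so the full path /input/... is preserved.
        // Assumption: the default filesystem is configured for this form to resolve.
        System.out.println(describe("hdfs:///input/war-and-peace.txt"));
    }
}
```

If the parsing is indeed the issue, passing the plain path /input/war-and-peace.txt or a URI that names the NameNode host explicitly are the usual alternatives to try.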

Spark : how to run spark file from spark shell

别来无恙 submitted on 2019-11-26 11:53:55
Question: I am using CDH 5.2 and can run commands in spark-shell. How can I run a file (file.spark) that contains Spark commands? Is there any way to run or compile Scala programs in CDH 5.2 without sbt? Thanks in advance.
Answer 1: To load an external file from spark-shell, simply run :load PATH_TO_FILE. This executes everything in your file. I don't have a solution for your sbt question, though, sorry :-)
Answer 2: On the command line, you can use spark-shell -i file.scala to run code which is written

Error when connecting to Impala with JDBC under Kerberos authentication

≡放荡痞女 submitted on 2019-11-26 04:00:30
Question: I created a class SecureImpalaDataSource that extends DriverManagerDataSource and uses UserGroupInformation.doAs() to get a Connection to Impala with a keytab file, but I get the following error: java.sql.SQLException: [Simba]ImpalaJDBCDriver Error initialized or created transport for authentication: [Simba]ImpalaJDBCDriver Unable to connect to server: null. However, the connection succeeds when I use the Kerberos ticket cache in a test demo. Can anyone help me?
Answer 1: Forget about the Hadoop