hdinsight

Hive LLAP doesn't work with Parquet format

Submitted by 萝らか妹 on 2020-01-02 22:40:41
Question: After finding out about Hive LLAP, I really want to use it. I started an Azure HDInsight cluster with LLAP enabled. However, it doesn't seem to perform any better than normal Hive. My data is stored in Parquet files, and I only see ORC files mentioned in LLAP-related docs and talks. Does LLAP also support the Parquet format?

Answer 1: Answering my own question. We reached out to Azure support: Hive LLAP only works with the ORC file format (as of May 2017). So with Parquet we either have to use Apache Impala for fast …
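Given the answer that LLAP (at that time) only reads ORC, one workaround is to copy a Parquet table into an ORC table with a CREATE TABLE AS SELECT. A minimal Python sketch that builds such a statement; the table names `parquet_logs` and `orc_logs` are hypothetical, not from the thread:

```python
def ctas_to_orc(source_table: str, target_table: str) -> str:
    """Build a HiveQL CTAS statement that rewrites a table in ORC format."""
    return (
        f"CREATE TABLE {target_table} STORED AS ORC "
        f"AS SELECT * FROM {source_table}"
    )

# Hypothetical table names; the resulting statement would be run in Hive/beeline.
print(ctas_to_orc("parquet_logs", "orc_logs"))
```

The copy doubles storage while both tables exist, but the ORC copy is what LLAP can cache and scan.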

“Not a file” exception on select after successful insert

Submitted by 三世轮回 on 2019-12-31 05:48:06
Question: I have created a table:

DROP TABLE IF EXISTS sampleout;
CREATE EXTERNAL TABLE sampleout(
  id bigint,
  LNG FLOAT,
  LAT FLOAT,
  GMTDateTime TIMESTAMP,
  calculatedcolumn FLOAT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'wasb://sampleout@xxxxxx.blob.core.windows.net/';

I then got a success from this query:

INSERT INTO TABLE sampleout SELECT *, 0 AS calculatedcolumn FROM sampletable

sampleout is the same as sampletable except for the extra column calculatedcolumn.

Double Quotes in Hadoop Hive Query

Submitted by 柔情痞子 on 2019-12-25 05:33:46
Question: I am able to use double quotes in the following query:

$subscriptionName = "***"
$clusterName = "***"
$queryString = "SELECT city FROM logs WHERE city =""New York"";"
Use-AzureHDInsightCluster $clusterName
Invoke-Hive -Query $queryString

But I am not able to use quotes in the following PowerShell commands:

$subscriptionName = "***"
$storageAccountName = "***"
$containerName = "***"
$clusterName = "***"
$queryString = "SELECT city FROM logs WHERE city =""New York"";"
$hiveJobDefinition = New…
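The working query relies on PowerShell's rule that a literal double quote inside a double-quoted string is written as two double quotes (`""`). A small Python sketch of that escaping rule, just to show what the embedded Hive query ends up looking like:

```python
def powershell_escape(value: str) -> str:
    """Escape a value for embedding inside a PowerShell double-quoted string."""
    return value.replace('"', '""')

# The Hive query as it should arrive at the server:
inner = 'SELECT city FROM logs WHERE city ="New York";'

# What you would type between the outer double quotes in PowerShell:
print(powershell_escape(inner))
```

Both snippets in the question use the same `""New York""` escaping, so the difference in behavior would lie in how each cmdlet forwards the string, not in the quoting itself.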

Can we use HDInsight Service for ATS?

Submitted by 喜欢而已 on 2019-12-24 16:25:32
Question: We have a logging system called Xtrace. We use this system to dump logs, exceptions, traces, etc. into a SQL Azure database. The Ops team then uses this data for debugging and SCOM purposes. Considering the 150 GB limit that SQL Azure has, we are thinking of using the HDInsight (Big Data) Service. If we dump the data into Azure Table Storage, will the HDInsight Service work against ATS? Or will it work only against blob storage, which would mean the log records need to be written as files on blob storage?

The subscription being used exceeds the cpu cores quota

Submitted by 不想你离开。 on 2019-12-24 10:39:23
Question: I keep getting the above message when I try to set up a new HDInsight HBase cluster. However, I am only trying to use 1 core in the new cluster, and according to the Azure portal (when I go to Settings > Usage; see screenshot below) I am using 0% of 40 cores. Does anybody know how to resolve this?

Answer 1: HDInsight core limits are calculated separately from the cores shown in the Settings > Usage tab. If you click on one of your existing HDInsight clusters, you should see a graphic …

In Hive, how can I add a column only if that column does not exist?

Submitted by ↘锁芯ラ on 2019-12-23 07:55:58
Question: I would like to add a new column to a table, but only if that column does not already exist. This works if the column does not exist:

ALTER TABLE MyTable ADD COLUMNS (mycolumn string);

But when I execute it a second time, I get an error:

Column 'mycolumn' exists

When I try to use the "IF NOT EXISTS" syntax that is supported for CREATE TABLE and ADD PARTITION, I get a syntax error:

ALTER TABLE MyTable ADD IF NOT EXISTS COLUMNS (mycolumn string);
FAILED: ParseException line 3:42 required (...)+ …
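Since ADD COLUMNS has no IF NOT EXISTS clause here, a common client-side workaround is to fetch the table's column list first (e.g. by parsing DESCRIBE output) and only issue the ALTER when the column is missing. A hedged Python sketch of that check; where the column list comes from is up to the caller:

```python
def alter_if_absent(table, column, col_type, existing_columns):
    """Return an ALTER TABLE statement, or None if the column already exists.

    existing_columns would be parsed from `DESCRIBE <table>` in practice;
    Hive column names are case-insensitive, so compare lowercased.
    """
    if column.lower() in (c.lower() for c in existing_columns):
        return None
    return f"ALTER TABLE {table} ADD COLUMNS ({column} {col_type})"

# Only emits the statement when 'mycolumn' is not already present.
print(alter_if_absent("MyTable", "mycolumn", "string", ["id", "name"]))
```

This moves the existence check out of HiveQL and into the driver script, which is where most Hive clients end up handling it.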

How to set up a custom Spark parameter in an HDInsight cluster with Data Factory

Submitted by 谁说胖子不能爱 on 2019-12-22 18:17:29
Question: I am creating an HDInsight cluster on Azure according to this description. Now I would like to set a custom Spark parameter, for example spark.yarn.appMasterEnv.PYSPARK3_PYTHON or spark_daemon_memory, at cluster provisioning time. Is it possible to set this up using Data Factory/Automation Account? I cannot find any example of doing this. Thanks

Answer 1: You can use SparkConfig in Data Factory to pass these configurations to Spark. For example:

"typeProperties": {
  ...
  "sparkConfig": {
    "spark.submit…
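A sketch of what the answer's truncated typeProperties fragment might look like once filled in. The surrounding property names and values here are assumptions for illustration, not confirmed by the thread; only the sparkConfig key itself comes from the answer:

```python
import json

# Hypothetical Data Factory Spark activity properties with custom settings.
type_properties = {
    "rootPath": "adfspark",       # assumed storage layout
    "entryFilePath": "main.py",   # assumed entry point
    "sparkConfig": {
        "spark.yarn.appMasterEnv.PYSPARK3_PYTHON": "/usr/bin/python3",
        "spark.submit.deployMode": "cluster",
    },
}

# The fragment as it would appear in the activity's JSON definition.
print(json.dumps({"typeProperties": type_properties}, indent=2))
```

Note this applies the settings per submitted job; settings that must be in place at cluster provisioning time (such as daemon memory) would need to go through the cluster's configuration instead.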

AzureException: Unable to access container using anonymous credentials, and no credentials found for them in the configuration

Submitted by 不问归期 on 2019-12-21 12:32:00
Question: I am trying to use Hadoop on Azure HDInsight. I log into the cluster via ssh and run the following:

hadoop jar jar_name class_name wasb://container@storagename.core.windows.net/inputdir wasb://container@storagename.core.windows.net/outputdir

But I get the following exception:

Exception in thread "main" org.apache.hadoop.fs.azure.AzureException: org.apache.hadoop.fs.azure.AzureException: Unable to access container xxx in account yyy.core.windows.net using anonymous credentials, and no …
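As the message says, the WASB driver found no credentials for that account in the configuration and fell back to anonymous access. The storage account key is looked up under a per-account property in core-site.xml; a small sketch building that property name (the account name is a placeholder):

```python
def wasb_key_property(account: str) -> str:
    """core-site.xml property under which the WASB driver looks up the
    storage account access key (hadoop-azure convention)."""
    return f"fs.azure.account.key.{account}.blob.core.windows.net"

# Placeholder account name; the real account key goes in as the property value.
print(wasb_key_property("storagename"))
```

Worth checking in the question's command: the wasb:// URIs use `storagename.core.windows.net` without the `.blob` segment, which would not match the account the cluster is configured for.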

How to connect Hive to an ASP.NET project

Submitted by 旧城冷巷雨未停 on 2019-12-19 11:27:13
Question: Hi, I'm very new to Hadoop. I have installed Microsoft HDInsight on my local system. Now I want to connect to Hive and HBase, but for the Hive connection I have to specify a connection string, port, username, and password, and I'm not able to figure out where to get these values. I have tried localhost with 8085 as the port, but this doesn't work. I have also tried my localhost IP and my system IP. Please help with this, and let me know how I should proceed with HBase connectivity.

Answer 1: Your best …

Read a JSON file with 12 nested levels into Hive in Azure HDInsight

Submitted by 一世执手 on 2019-12-14 01:05:09
Question: I tried to create a schema for the JSON file manually and create a Hive table from it, and I am getting "column type name length 10888 exceeds max allowed length 2000". I am guessing I have to change the metastore details, but I am not sure where the config is located in Azure HDInsight. The other way I tried: I got the schema from a Spark dataframe and tried to create the table from that view, but I still get the same error. These are the steps I tried in Spark:

val tne1 = sc.wholeTextFiles("wasb…