Hadoop

How to create a HIVE table to read semicolon separated values

懵懂的女人 submitted on 2020-12-30 17:08:14
Question: I want to create a Hive table that reads in semicolon-separated values, but my code keeps giving me errors. Does anyone have any suggestions?

CREATE TABLE test_details(Time STRING, Vital STRING, sID STRING)
PARTITIONED BY (Country STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ';'
STORED AS TEXTFILE;

Answer 1: For me nothing worked except this:

FIELDS TERMINATED BY '\u0059'

Edit: after updating Hive:

FIELDS TERMINATED BY '\u003B'

so in full: CREATE TABLE test_details(Time STRING, Vital
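The escapes in the answer are worth sanity-checking. The sketch below is plain Python, not Hive: it shows that U+003B is the semicolon, while U+0059 is actually the letter "Y" (59 is the semicolon's *decimal* code point, which presumably is why '\u0059' happened to work on the older Hive version — that interpretation is an assumption, not something stated in the excerpt).

```python
# What the two Unicode escapes denote under standard Unicode rules.
semicolon = "\u003B"  # hex 0x3B == decimal 59
letter_y = "\u0059"   # hex 0x59 == decimal 89

print(semicolon)   # ;
print(letter_y)    # Y
print(ord(";"))    # 59 -- decimal 59 is 0x3B in hex
```

So '\u003B' is the correct escape for a semicolon delimiter in any tool that reads the escape as hexadecimal.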

JD's urban spatio-temporal data engine JUST debuts at the China Database Technology Conference

点点圈 submitted on 2020-12-30 16:59:04
Due to the pandemic, the 11th China Database Technology Conference (DTCC 2020) was postponed from its original date in May to August, and then again to December. Even so, enthusiasm for database technology did not wane. From December 21 to 23, 2020, the Beijing International Convention Center was packed, with the major vendors vying for attention. In the NoSQL track, Dr. Li Ruiyuan of JD Intelligent Cities Research gave a keynote titled "Architecture Design and Application Practice of JUST, JD's Urban Spatio-Temporal Data Engine," which attracted wide attention. About Dr. Li Ruiyuan: Li Ruiyuan, Ph.D., leads the spatio-temporal data group at JD iCity, is a researcher at JD Intelligent Cities Research, and is a data scientist in the JD Intelligent Cities business unit. He is responsible for the architecture of the spatio-temporal data platform, research combining spatio-temporal indexing with distributed computing, development of spatio-temporal data products, and applying spatio-temporal data mining to urban scenarios. Before joining JD, he interned and worked for four years in the Urban Computing group at Microsoft Research Asia. His research interests include spatio-temporal data management and mining, distributed computing, and urban computing. He has published more than 20 papers in high-level journals and international conferences, including KDD, Artificial Intelligence, ICDE, AAAI, TKDE, WWW, UbiComp, and the Journal of Software, and has filed more than 20 patent applications. He is a member of the China Computer Federation (CCF), a communications member of the CCF Database Technical Committee, and an IEEE member, and has served as a reviewer for several top international conferences and journals. About JUST: spatio-temporal data is rich in information and can power a wide range of urban applications, but its high update frequency, large volume, and complex structure make it difficult to store, manage, and analyze efficiently

How to find optimal number of mappers when running Sqoop import and export?

别来无恙 submitted on 2020-12-30 07:50:33
Question: I'm using Sqoop version 1.4.2 and an Oracle database. When running a Sqoop command, for example:

./sqoop import \
  --fs <name node> \
  --jt <job tracker> \
  --connect <JDBC string> \
  --username <user> --password <password> \
  --table <table> --split-by <cool column> \
  --target-dir <where> \
  --verbose --m 2

we can specify --m, i.e. how many parallel tasks we want Sqoop to run (they may also be accessing the database at the same time). The same option is available for ./sqoop export <...>. Is there some
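The excerpt cuts off before any answer, but the trade-off it raises (parallelism vs. load on the database) can be sketched as a simple heuristic. Everything below is an illustrative assumption, not Sqoop's actual logic: cap the requested mapper count both by how many reasonably sized splits the data supports and by how many concurrent connections the database can tolerate.

```python
def suggest_num_mappers(table_rows, min_rows_per_mapper=100_000,
                        max_db_connections=8, requested=4):
    """Illustrative heuristic (NOT from Sqoop): bound the requested
    mapper count by data volume and by the DB's connection budget."""
    # No point running more mappers than there are sizable splits.
    by_volume = max(1, table_rows // min_rows_per_mapper)
    # Each mapper opens its own JDBC connection, so respect the DB limit.
    return max(1, min(requested, by_volume, max_db_connections))

print(suggest_num_mappers(1_000_000, requested=16))  # 8: DB cap wins
print(suggest_num_mappers(50_000, requested=4))      # 1: too little data to split
```

The thresholds (100k rows per mapper, 8 connections) are made-up placeholders; in practice they would come from measuring the table and asking the DBA.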

Integrating Hive with Kerberos

ぃ、小莉子 submitted on 2020-12-29 17:14:33
1. Generating the ticket

On the KDC server, generate the principal used for Hive authentication.

1.1. Create the principal:

# kadmin.local -q "addprinc -randkey hive/yjt"

1.2. Create the keytab file:

# kadmin.local -q "xst -norandkey -k /etc/hive.keytab hive/yjt"

Copy the keytab file to the cluster (as root, or as a regular user with root privileges):

# scp /etc/hive.keytab 192.168.0.230:/data1/hadoop/hive/conf

Connect to the cluster and fix the file's ownership and permissions:

# chown hduser:hadoop /data1/hadoop/hive/conf/hive.keytab
# chmod 400 /data1/hadoop/hive/conf/hive.keytab

1.3. Modify the configuration file

Add the following to hive-site.xml:

<property>
  <name>hive.server2.authentication</name>
  <value>KERBEROS</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
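The excerpt stops mid-configuration. For orientation only, a typical HiveServer2 Kerberos block looks like the sketch below; the principal value, realm (HADOOP.COM), and keytab path are placeholders inferred from the commands above, not values given by the post (`_HOST` is Hive's placeholder that expands to the local hostname).

```xml
<property>
  <name>hive.server2.authentication</name>
  <value>KERBEROS</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
  <value>hive/_HOST@HADOOP.COM</value>  <!-- placeholder realm -->
</property>
<property>
  <name>hive.server2.authentication.kerberos.keytab</name>
  <value>/data1/hadoop/hive/conf/hive.keytab</value>  <!-- path from the scp step -->
</property>
```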

Accessing a Kerberos-enabled Hive from Java

浪尽此生 submitted on 2020-12-29 16:30:54
1. Preparation

1.1. Generate a keytab file for the principal hive/hive in the target directory /var/keytab/hive.keytab:

[root@fan102 ~]# kadmin.local -q "xst -k /var/keytab/hive.keytab hive/hive@HADOOP.COM"

1.2. Inspect the keytab's contents:

[root@fan102 ~]# cd /var/keytab
[root@fan102 keytab]# klist -e -k hive.keytab
Keytab name: FILE:hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   3 hive/hive@HADOOP.COM (aes128-cts-hmac-sha1-96)
   3 hive/hive@HADOOP.COM (des3-cbc-sha1)
   3 hive/hive@HADOOP.COM (arcfour-hmac)
   3 hive/hive@HADOOP.COM (camellia256-cts-cmac)
   3 hive/hive@HADOOP.COM (camellia128-cts-cmac)
   3 hive/hive@HADOOP
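The excerpt is cut off before the Java code itself. A minimal sketch of what such a client usually looks like is below; it assumes the keytab and principal from step 1.1, a HiveServer2 listening on host fan102 port 10000, and the standard hadoop-common and hive-jdbc libraries on the classpath — none of these specifics come from the excerpt, and the sketch needs a live, Kerberized cluster to actually run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHiveClient {
    public static void main(String[] args) throws Exception {
        // Tell the Hadoop security layer to expect Kerberos.
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Log in from the keytab generated in step 1.1 (paths assumed).
        UserGroupInformation.loginUserFromKeytab(
                "hive/hive@HADOOP.COM", "/var/keytab/hive.keytab");

        // The JDBC URL carries the HiveServer2 principal after the semicolon.
        String url = "jdbc:hive2://fan102:10000/default;"
                   + "principal=hive/hive@HADOOP.COM";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW DATABASES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

The key step is `loginUserFromKeytab` before opening the JDBC connection; without it the Hive driver has no Kerberos credentials to present.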

Hive Utf-8 Encoding number of characters supported?

◇◆丶佛笑我妖孽 submitted on 2020-12-29 12:13:06
Question: The problem is as follows: the data I want to insert into a Hive table contains Latin (accented) characters and is UTF-8 encoded, but Hive still does not display it properly. (The original post shows screenshots comparing the actual data with the data as inserted in Hive.) I changed the table's encoding to UTF-8 as well, but the issue persists. Below are the Hive DDL and commands:

CREATE TABLE IF NOT EXISTS test6 (
  CONTACT_RECORD_ID string,
  ACCOUNT string,
  CUST string,
  NUMBER string,
  NUMBER1 string,
  NUMBER2 string,
  NUMBER3 string,
  NUMBER4 string,
  NUMBER5 string
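The symptom described (UTF-8 data rendered wrong despite the source file being valid UTF-8) is the classic signature of an encoding mismatch somewhere in the read path. As a plain-Python illustration (not Hive-specific), decoding UTF-8 bytes as Latin-1 produces exactly this kind of garbling:

```python
text = "café"                    # contains a non-ASCII Latin character
raw = text.encode("utf-8")       # the bytes actually stored on disk
garbled = raw.decode("latin-1")  # what a Latin-1 reader displays
print(garbled)  # cafÃ©
```

Each accented character becomes a two-character sequence because its UTF-8 form is two bytes, and a Latin-1 reader shows one character per byte. That pattern in the Hive output would point at the reader's charset rather than the stored data.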

Number of reducers in hadoop

旧巷老猫 submitted on 2020-12-29 10:01:51
Question: While learning Hadoop, I found the number of reducers very confusing:

1) The number of reducers is the same as the number of partitions.
2) The number of reducers is 0.95 or 1.75 multiplied by (no. of nodes) * (no. of maximum containers per node).
3) The number of reducers is set by mapred.reduce.tasks.
4) The number of reducers is closest to: a multiple of the block size * a task time between 5 and 15 minutes * creates the fewest files possible.

I am very confused. Do we explicitly set the number of reducers, or is it
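Rule 2 above is the well-known sizing heuristic from the Hadoop MapReduce documentation, and it is just arithmetic, so it can be sketched directly. The cluster numbers below are made-up inputs for illustration:

```python
def reducer_slots(nodes, max_containers_per_node, factor=0.95):
    """Rule-2 heuristic: 0.95 lets all reducers launch at once as maps
    finish; 1.75 lets faster nodes finish a first wave of reducers and
    start a second, improving load balancing."""
    return round(factor * nodes * max_containers_per_node)

# A hypothetical 10-node cluster with 8 containers per node:
print(reducer_slots(10, 8))               # 76
print(reducer_slots(10, 8, factor=1.75))  # 140
```

Rules 1 and 3 are not alternatives to this: whatever count the heuristic suggests is what you set (via `mapred.reduce.tasks`, or `Job.setNumReduceTasks` in newer APIs), and the number of reduce-side partitions then equals that count.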