shark-sql

Accessing Shark tables (Hive) from Scala (shark-shell)

孤者浪人 submitted on 2020-01-21 14:10:06
Question: I have shark-0.8.0, which runs on hive-0.9.0. I can work with Hive by invoking shark. I created a few tables and loaded them with data. Now I am trying to access the data in these tables from Scala. I started the Scala shell using shark-shell, but when I try to select, I get an error saying that the table is not present. scala> val artists = sc.sql2rdd("select artist from default.lastfm") Hive history file=/tmp/hduser2/hive_job_log_hduser2_201405091617_1513149542.txt 151.738: [GC

Is LIMIT clause in HIVE really random?

落花浮王杯 submitted on 2019-12-21 04:25:15
Question: The Hive documentation notes that the LIMIT clause returns rows chosen at random. I have been running a SELECT with LIMIT 1 on a table with more than 800,000 records, but it always returns the same record. I'm using the Shark distribution, and I am wondering whether that has anything to do with this unexpected behavior. Any thoughts would be appreciated. Thanks, Visakh Answer 1: Even though the documentation states that it returns rows at random, that is not actually true. It returns

Is LIMIT clause in HIVE really random?

穿精又带淫゛_ submitted on 2019-12-03 13:19:55
The Hive documentation notes that the LIMIT clause returns rows chosen at random. I have been running a SELECT with LIMIT 1 on a table with more than 800,000 records, but it always returns the same record. I'm using the Shark distribution, and I am wondering whether that has anything to do with this unexpected behavior. Any thoughts would be appreciated. Thanks, Visakh Even though the documentation states that it returns rows at random, that is not actually true. It returns "chosen rows at random" as they appear in the database without any WHERE/ORDER BY clause. This means that it's not
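
The behavior described in this answer can be illustrated with a small sketch. It is an analogy only: it uses Python's built-in sqlite3 module rather than Hive or Shark, and the table name `records` and its columns are hypothetical. It shows that a bare LIMIT 1 tends to return the same row on every run, because the engine simply hands back the first row it scans, while an explicit random ordering is needed to vary the result. In Hive, the comparable idea would presumably be ordering by its `rand()` function before applying LIMIT.

```python
import sqlite3

# In-memory SQLite database standing in for the Hive table (an assumption
# made for illustration; this is not Hive or Shark).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, artist TEXT)")
conn.executemany(
    "INSERT INTO records (artist) VALUES (?)",
    [(f"artist_{i}",) for i in range(1000)],
)


def first_row():
    # No WHERE or ORDER BY: the engine returns whichever row it scans first.
    return conn.execute("SELECT artist FROM records LIMIT 1").fetchone()[0]


def random_row():
    # Explicit random ordering, so a different row can come back each time.
    query = "SELECT artist FROM records ORDER BY random() LIMIT 1"
    return conn.execute(query).fetchone()[0]


# Collect the distinct results of 20 repeated queries of each kind.
plain_results = {first_row() for _ in range(20)}
shuffled_results = {random_row() for _ in range(20)}

print(len(plain_results))     # distinct rows seen with a bare LIMIT 1
print(len(shuffled_results))  # distinct rows seen with random ordering
```

Sorting the whole table randomly is expensive on a table with hundreds of thousands of rows, so this is a sketch of the idea rather than a tuned query.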