How to create a unix script to loop a Hive SELECT query by taking table names as input from a file?

与世无争的帅哥 submitted on 2020-01-05 04:07:07

Question


It's pretty straightforward what I'm trying to do. I just need to count the records in multiple Hive tables.

I want to create a very simple hql script that takes a file.txt containing table names as input and counts the total number of records in each table:

SELECT COUNT(*) from <tablename>

Output should be like:

table1 count1
table2 count2
table3 count3

I'm new to Hive and not very well versed in Unix scripting, and I'm unable to figure out how to create a script to perform this.

Can someone please help me in doing this? Thanks in advance.


Answer 1:


Simple working shell script:

 db=mydb

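 # Loop over every table name returned by "show tables"
 for table in $(hive -S -e "use $db; show tables;") 
 do 
 #echo "$table"
 # Print the table name next to its row count
 hive -S -e "use $db; select '$table' as table_name, count(*) as cnt from $table;"
 done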

This script starts a new Hive session for every table, which is slow. You can improve it by generating a file of select commands, or even a single select combined with union all, and then executing that file once instead of calling Hive for each table.
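The single-query idea could be sketched roughly as follows. The helper name build_count_query and the file names tables.txt and counts.hql are placeholders invented for illustration, not part of the original answer, and the sketch assumes a Hive version that accepts a top-level UNION ALL.

```shell
#!/bin/sh
# Sketch only: build_count_query is a hypothetical helper.
# $1 = file with one table name per line, $2 = database name.
build_count_query() {
    list_file=$1
    db=$2
    first=1
    printf 'USE %s;\n' "$db"
    while IFS= read -r table || [ -n "$table" ]; do
        [ -z "$table" ] && continue          # skip blank lines
        if [ "$first" -eq 1 ]; then
            first=0
        else
            printf 'UNION ALL\n'             # join the per-table selects
        fi
        printf "SELECT '%s' AS table_name, COUNT(*) AS cnt FROM %s\n" "$table" "$table"
    done < "$list_file"
    # Terminate the statement only if at least one table was listed
    if [ "$first" -eq 0 ]; then
        printf ';\n'
    fi
}

# Example use (requires a working hive CLI):
# build_count_query tables.txt mydb > counts.hql
# hive -S -f counts.hql
```

Hive is then started once for the generated file rather than once per table.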

If you want to read table names from file, use this:

for table in $(cat filename)
do 
...
done
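One possible way to fill in that loop, as a sketch rather than a definitive script: it reads the list line by line with while read instead of for. The function name count_tables and the arguments file.txt and mydb are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: count_tables is a hypothetical helper.
# $1 = file with one table name per line, $2 = database name.
count_tables() {
    list_file=$1
    db=$2
    while IFS= read -r table || [ -n "$table" ]; do
        [ -z "$table" ] && continue    # skip blank lines
        # </dev/null keeps hive from reading the rest of the table list on stdin
        hive -S -e "use $db; select '$table' as table_name, count(*) as cnt from $table;" </dev/null
    done < "$list_file"
}

# Example call (requires a working hive CLI):
# count_tables file.txt mydb
```

Each hive call prints one line with the table name and its count, which matches the output format asked for in the question.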


Source: https://stackoverflow.com/questions/57089696/how-to-create-a-unix-script-to-loop-a-hive-select-query-by-taking-table-names-as
