Question
It's pretty straightforward what I'm trying to do. I just need to count the records in multiple Hive tables.
I want to create a very simple HQL script that takes a file.txt containing table names as input and counts the total number of records in each table:
SELECT COUNT(*) from <tablename>
Output should be like:
table1 count1
table2 count2
table3 count3
I'm new to Hive and not very well versed in Unix scripting, and I'm unable to figure out how to create a script to perform this.
Can someone please help me in doing this? Thanks in advance.
Answer 1:
Simple working shell script:
db=mydb
for table in $(hive -S -e "use $db; show tables;")
do
#echo "$table"
hive -S -e "use $db; select '$table' as table_name, count(*) as cnt from $table;"
done
You can improve this script by generating a file of select commands, or even a single select with union all, and then executing that file instead of calling Hive once per table.
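As a rough sketch of the single-query idea, the function below turns a list of table names on stdin into one union all statement and writes it to a file. The names build_count_query, counts.hql, mydb and table1..table3 are made up for illustration, and it assumes one table name per line. Very old Hive versions may need the union wrapped in a subquery.

```shell
#!/bin/sh
# Build one UNION ALL query that counts rows in every table named on stdin.
# Assumption: one table name per line; blank lines are skipped.
build_count_query() {
    db=$1
    printf 'use %s;\n' "$db"
    first=1
    while IFS= read -r table || [ -n "$table" ]; do
        [ -z "$table" ] && continue
        # Join every select after the first one with "union all".
        [ "$first" -eq 1 ] || printf 'union all\n'
        printf "select '%s' as table_name, count(*) as cnt from %s\n" "$table" "$table"
        first=0
    done
    # Terminate the statement only if at least one select was written.
    [ "$first" -eq 1 ] || printf ';\n'
}

# Example with placeholder names; writes the query to counts.hql.
printf 'table1\ntable2\ntable3\n' | build_count_query mydb > counts.hql
```

The generated file can then be run in a single Hive session with hive -S -f counts.hql, which avoids paying Hive's startup cost once per table.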
If you want to read table names from a file, use this:
for table in $(cat filename)
do
...
done
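Putting the pieces together, a fuller version of the file-driven loop might look like the sketch below. The function name count_tables and the names mydb and file.txt are placeholders, and it assumes one table name per line. Hive prints the two columns separated by a tab.

```shell
#!/bin/sh
# Print the table name and row count for every table listed in a file.
# Assumption: one table name per line; blank lines are skipped.
count_tables() {
    db=$1
    list=$2
    while IFS= read -r table || [ -n "$table" ]; do
        [ -z "$table" ] && continue
        # -S suppresses Hive's log output; -e runs the quoted query.
        # stdin is redirected so hive cannot swallow the rest of the list.
        hive -S -e "use $db; select '$table', count(*) from $table;" < /dev/null
    done < "$list"
}

# Usage (placeholder names):
# count_tables mydb file.txt
```

Reading with while read instead of for ... in $(cat ...) keeps each line intact, which is the safer habit even though Hive table names do not normally contain spaces.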
Source: https://stackoverflow.com/questions/57089696/how-to-create-a-unix-script-to-loop-a-hive-select-query-by-taking-table-names-as