Validate a Sqoop with use of QUERY and WHERE clauses

Submitted by 戏子无情 on 2019-12-10 18:15:19

Question


I am operationalizing a data import process that takes data from an existing database and partitions it within a scheme in HDFS. By default, the job is split into four map tasks, and right now I have the job configured to run daily through Apache Oozie.

Since Oozie is DAG-oriented, is it possible to create a validationStep within the Oozie workflow that does the following:

  • Run a Hive query on the newly imported data to return its row count
  • Run a SQL query to return the row count in the original data source
  • Compare the two values
  • If they do not match, return FAIL and kill the job; if they match, return TRUE and OK
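
The comparison step above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function names `parse_count` and `validation_exit_code` are hypothetical, and it assumes the two row counts have already been captured as text (for example, from the output of a Hive query and a SQL client). How the counts are obtained is left out.

```python
def parse_count(raw: str) -> int:
    """Parse a row count from command output (hypothetical helper).

    Assumes the count is the last non-empty line of the output.
    """
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no count found in output")
    return int(lines[-1])


def validation_exit_code(source_count: int, hive_count: int) -> int:
    """Return 0 when the counts match and 1 when they differ.

    The value is meant to be used as a process exit code.
    """
    return 0 if source_count == hive_count else 1


# Example: counts captured as text from the two queries (made-up values).
source_rows = parse_count("count\n1500\n")
hive_rows = parse_count("1500\n")
status = validation_exit_code(source_rows, hive_rows)
```

A wrapper script could pass `status` to `sys.exit`. If that script runs inside an Oozie shell action, a non-zero exit code should make the action fail, so the workflow can route its error transition to a kill node.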

I understand that Sqoop has a validation option, but my understanding is that it does not apply here because I am not running the import against a single table (each of my Sqoop imports is partitioned by a specific date).

Is this possible?

Source: https://stackoverflow.com/questions/25432612/validate-a-sqoop-with-use-of-query-and-where-clauses