Oozie Sqoop action fails to import data to Hive

Posted by 与世无争的帅哥 on 2019-12-05 18:50:20

This looks like a typical Sqoop import-to-Hive job: Sqoop appears to have imported the data into HDFS successfully and is failing when it loads that data into Hive.

Here's some background on what's happening. Oozie launches a separate job (which can execute on any node in your Hadoop cluster) to run the Sqoop command. The Sqoop command starts another job to load data into HDFS. Then, at the end of the Sqoop job, Sqoop runs a Hive script to load that data into Hive.

Since this can run on any node in your Hadoop cluster, the Hive CLI needs to be available on every node and must talk to the same metastore. The Hive metastore therefore needs to run in remote mode.

The most common problem is that Sqoop cannot talk to the correct metastore. The usual causes are:

  1. The Hive metastore service is not running. It should run in remote mode, which means a separate service has to be started. Here's a quick way to check whether it's running:

    service hive-metastore status

  2. hive-site.xml does not contain hive.metastore.uris. Here's an example hive-site.xml with hive.metastore.uris set:

    <configuration>
    ...
      <property>
        <name>hive.metastore.uris</name>
        <value>thrift://sqoop2.example.com:9083</value>
      </property>
    ...
    </configuration>
    
  3. hive-site.xml is not included in your Sqoop action (or its properties). Try adding your hive-site.xml to a <file> element in your Sqoop action. Here's an example workflow.xml with <file> in it:

    <workflow-app name="sqoop-to-hive" xmlns="uri:oozie:workflow:0.4">
        ...
        <action name="sqoop2hive">
            ...
            <sqoop xmlns="uri:oozie:sqoop-action:0.2">
                ...
                <file>/tmp/hive-site.xml#hive-site.xml</file>
            </sqoop>
            ...
        </action>
        ...
    </workflow-app>
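
Causes 2 and 3 can be checked from a shell before resubmitting the workflow. The following is a minimal sketch and not part of the original answer: `metastore_uris` is a hypothetical helper name, and it assumes hive-site.xml uses the conventional layout in which `<value>` follows `<name>` inside the same `<property>`. It prints the configured hive.metastore.uris value, or nothing if the property is missing.

```shell
# Hypothetical helper: print the value of hive.metastore.uris from a
# local hive-site.xml. Prints nothing if the property is not set.
# Assumes <value> appears on the same line as, or on a line after,
# the matching <name> element.
metastore_uris() {
    # $1: path to a local hive-site.xml
    awk '
        /<name>[ \t]*hive\.metastore\.uris[ \t]*<\/name>/ { found = 1 }
        found && /<value>/ {
            sub(/.*<value>[ \t]*/, "")     # drop everything up to <value>
            sub(/[ \t]*<\/value>.*/, "")   # drop </value> and what follows
            print
            exit
        }
    ' "$1"
}
```

For example, `metastore_uris /etc/hive/conf/hive-site.xml` should print a thrift:// URI (the local path is an assumption and varies by distribution). If it does, the same file can be copied to the HDFS path referenced by the <file> element with `hdfs dfs -put -f /etc/hive/conf/hive-site.xml /tmp/hive-site.xml`.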
    

This seems to be a bug in Sqoop. I am not sure of the JIRA number. Hortonworks mentioned that the issue is still unresolved even in HDP 2.2.

@abeaamase - I want to try your solution.

I just want to check whether the solution below works for a Sqoop + Hive import in a single Oozie job?

    ... ... ... <file>/tmp/hive-site.xml#hive-site.xml</file> ... ...

If you are using CDH, the problem may be due to Hive metastore JAR dependency conflicts.
