How to point to a single file with external table

血红的双手。 Submitted on 2021-01-04 07:21:50

Question


I'm trying to load HDFS data as an external table, but I get the following error.

The folder ml-100k contains several different datasets, so I need to load only one particular file, u.data.

hive> create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data'
    > ;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data is not a directory or unable to create one)

Answer 1:


You cannot create a table that points to a file, only to a directory, but there is a feature/bug that allows you to alter the location to a specific file.

create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k';

alter table movie_ratings set location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data';
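To confirm where the table points after the ALTER, a quick check along these lines may help. This is only a sketch: `describe formatted` and `select ... limit` are standard Hive statements, but the exact output layout depends on the Hive version.

```sql
-- Show the table metadata; the "Location:" row should now end in /u.data.
describe formatted movie_ratings;

-- Read a few rows to confirm the table returns data from that file.
select * from movie_ratings limit 5;
```

If every column comes back NULL, the likely cause is a field-delimiter mismatch rather than the location, since the table above is declared without a `row format` clause.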



Answer 2:


You cannot create a Hive table over a specific file; you need to give it a directory. So you can create a subdirectory under ml-100k/, place the file in it, and use it like this:

create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfilder/';

The bug mentioned by @Dudu may solve a specific case, but it's not safe for general use, because inserting into such a table will create new files and will never append to the specified one!
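The subdirectory approach could be sketched end to end as follows, run from the Hive CLI (which forwards `dfs` commands to the HDFS shell). Two things here are assumptions not stated in the answer: the file is copied rather than moved, and the `row format` clause is added on the assumption that u.data is tab-separated. The directory name is simply the example name used above.

```sql
-- Create the subdirectory that will hold only the one file.
dfs -mkdir -p hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfilder;

-- Copy u.data into it (use -mv instead of -cp to avoid keeping two copies).
dfs -cp hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfilder/;

-- Assumption: u.data is tab-separated, so the delimiter is declared explicitly.
create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int)
row format delimited fields terminated by '\t'
location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfilder/';
```

Because the table is external, dropping it later removes only the metadata and leaves the files in the subdirectory in place.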



Source: https://stackoverflow.com/questions/42583106/how-to-point-to-a-single-file-with-external-table
