Reading a file from HDFS using Spring Batch

Posted by 你离开我真会死。 on 2019-12-04 12:37:06

The FlatFileItemReader in Spring Batch works with any Spring Framework Resource implementation:

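import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.core.io.Resource;

@Bean
public FlatFileItemReader<String> itemReader(Resource resource) { // resource is injected here (or obtained some other way)
    return new FlatFileItemReaderBuilder<String>()
            .name("itemReader") // a name is required while saveState is true (the default)
            .resource(resource)
            .lineMapper(new PassThroughLineMapper()) // hands each line back as a plain String
            // set other reader properties
            .build();
}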

So if you manage to get a Resource handle pointing to an HDFS file, you are done.

Now, in order to get an HDFS resource, you can:

  • Use Spring for Hadoop. Once the HDFS file system is configured, you would be able to get the resource from the application context with applicationContext.getResource("hdfs:data.csv");
  • Implement your own Resource using Hadoop APIs (as shown in the answer by Michael Simons). I see that some folks already did this here
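
For the second option, a minimal sketch of such a Resource might look like the following. It assumes spring-core and hadoop-client are on the classpath. The class name HdfsResource is hypothetical (it is not shipped by Spring or Hadoop), and this is not necessarily how the linked implementations do it. It only wraps a Hadoop FileSystem and Path behind Spring's AbstractResource.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.springframework.core.io.AbstractResource;

// Hypothetical read-only Resource backed by a Hadoop FileSystem.
public class HdfsResource extends AbstractResource {

    private final FileSystem fileSystem;
    private final Path path;

    public HdfsResource(FileSystem fileSystem, Path path) {
        this.fileSystem = fileSystem;
        this.path = path;
    }

    @Override
    public boolean exists() {
        try {
            return fileSystem.exists(path);
        } catch (IOException e) {
            // Treat an unreachable file system as "resource not there".
            return false;
        }
    }

    @Override
    public String getFilename() {
        return path.getName();
    }

    @Override
    public long contentLength() throws IOException {
        // Ask the file system for the length instead of reading the whole stream.
        return fileSystem.getFileStatus(path).getLen();
    }

    @Override
    public String getDescription() {
        return "HDFS resource [" + path + "]";
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // Opens a fresh stream on every call; the caller is expected to close it.
        return fileSystem.open(path);
    }
}
```

An instance could then be created with something like new HdfsResource(FileSystem.get(conf), new Path("/data/data.csv")), where conf is your Hadoop Configuration and the path is only a placeholder, and passed to .resource(...) on the builder.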

Hope this helps.
