Spring Batch - Read files from AWS S3

萌比男神i 2021-01-05 07:09

I am trying to read files from AWS S3 and process them with Spring Batch.

Can a Spring ItemReader handle this task? If so, how do I pass the credentials to the S3 client?

3 Answers
  • 2021-01-05 07:17

    The simpler steps are:

    1. Create an AmazonS3 client bean.
    2. Create a ResourceLoader bean.
    3. Use the ResourceLoader to set the S3 resource on the reader.

    First, create the AmazonS3 client and ResourceLoader beans in your AWS configuration class, like this:

    @Configuration
    @EnableContextResourceLoader
    public class AWSConfiguration {

        @Bean
        @Primary
        public AmazonS3 getAmazonS3Client() {
            ClientConfiguration config = new ClientConfiguration();

            // Timeouts are in milliseconds
            config.setConnectionTimeout(5000 * 10);
            config.setSocketTimeout(5000 * 10);

            return AmazonS3ClientBuilder.standard()
                    .withClientConfiguration(config)
                    .build();
        }

        @Bean
        public static ResourceLoaderBeanPostProcessor resourceLoaderBeanPostProcessor(
                AmazonS3 amazonS3Client) {
            return new ResourceLoaderBeanPostProcessor(amazonS3Client);
        }
    }
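
    On the credentials part of the question: with AmazonS3ClientBuilder.standard(), the SDK picks up credentials from the default provider chain (environment variables, system properties, the ~/.aws/credentials profile, or the instance/container role), so often nothing extra is needed. If you have to pass explicit keys, a minimal sketch of the client bean could look like this (the key values and region are placeholders, not something from the original answer):

    @Bean
    @Primary
    public AmazonS3 getAmazonS3Client() {
        // Explicit credentials; in practice resolve these from configuration rather than hard-coding them
        BasicAWSCredentials credentials = new BasicAWSCredentials("accessKey", "secretKey");

        return AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(credentials))
                .withRegion(Regions.US_EAST_1)
                .build();
    }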
    

    Then use the ResourceLoader bean in your ItemReader to set the S3 resource:

    @Autowired
    private ResourceLoader resourceLoader;

    @Bean
    public FlatFileItemReader<Map<String, Object>> fileItemReader() {
        FlatFileItemReader<Map<String, Object>> reader = new FlatFileItemReader<>();
        reader.setLineMapper(new JsonLineMapper()); // Change the line mapper as per your need
        // amazonS3Bucket is your bucket name, file is the object key
        reader.setResource(resourceLoader.getResource("s3://" + amazonS3Bucket + "/" + file));
        return reader;
    }
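
    For completeness, here is one way this reader could be wired into a Spring Batch step; the step name, chunk size, and the trivial writer are illustrative only:

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Step readS3FileStep() {
        return stepBuilderFactory.get("readS3FileStep")
                .<Map<String, Object>, Map<String, Object>>chunk(100)
                .reader(fileItemReader())
                .writer(items -> items.forEach(System.out::println)) // replace with your real writer
                .build();
    }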
    
  • 2021-01-05 07:28

    Update: with Spring Cloud AWS you would still use the FlatFileItemReader, but you no longer need to make a custom extended Resource.

    Instead you set up an aws-context and give it your AmazonS3 client bean:

        <aws-context:context-resource-loader amazon-s3="amazonS3Client"/>
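
    If you are on Java config rather than XML, the rough equivalent (assuming spring-cloud-aws is on the classpath, as in the other answer) is to enable the resource loader on a configuration class:

    @Configuration
    @EnableContextResourceLoader
    public class S3ResourceConfig {
        // Spring Cloud AWS registers a protocol resolver so that "s3://bucket/key"
        // locations can be resolved through the injected ResourceLoader,
        // using the AmazonS3 bean defined in the context.
    }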
    

    The reader would be set up like any other reader - the only thing that's unique here is that you would now autowire your ResourceLoader

    @Autowired
    private ResourceLoader resourceLoader;
    

    and then use that ResourceLoader when setting the resource:

    @Bean
    public FlatFileItemReader<Map<String, Object>> awsItemReader() {
        FlatFileItemReader<Map<String, Object>> reader = new FlatFileItemReader<>();
        reader.setLineMapper(new JsonLineMapper());
        reader.setRecordSeparatorPolicy(new JsonRecordSeparatorPolicy());
        reader.setResource(resourceLoader.getResource("s3://" + amazonS3Bucket + "/" + file));
        return reader;
    }
    

    I would use the FlatFileItemReader, and the customization that needs to take place is making your own S3 Resource object. Extend Spring's AbstractResource to create your own AWS resource that holds the AmazonS3 client, bucket, and file path info, etc.

    For the getInputStream use the Java SDK:

            S3Object object = s3Client.getObject(new GetObjectRequest(bucket, awsFilePath));
            return object.getObjectContent();
    

    Then for contentLength:

    return s3Client.getObjectMetadata(bucket, awsFilePath).getContentLength();
    

    and for lastModified use:

    return s3Client.getObjectMetadata(bucket, awsFilePath).getLastModified().getTime();
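
    Putting those pieces together, a minimal sketch of such a custom resource could look like the following (the class name AmazonS3Resource is illustrative, and error handling is omitted):

    import java.io.IOException;
    import java.io.InputStream;

    import org.springframework.core.io.AbstractResource;

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.GetObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;

    public class AmazonS3Resource extends AbstractResource {

        private final AmazonS3 s3Client;
        private final String bucket;
        private final String awsFilePath;

        public AmazonS3Resource(AmazonS3 s3Client, String bucket, String awsFilePath) {
            this.s3Client = s3Client;
            this.bucket = bucket;
            this.awsFilePath = awsFilePath;
        }

        @Override
        public String getDescription() {
            return "s3://" + bucket + "/" + awsFilePath;
        }

        @Override
        public InputStream getInputStream() throws IOException {
            // Stream the object content directly from S3
            S3Object object = s3Client.getObject(new GetObjectRequest(bucket, awsFilePath));
            return object.getObjectContent();
        }

        @Override
        public boolean exists() {
            return s3Client.doesObjectExist(bucket, awsFilePath);
        }

        @Override
        public long contentLength() {
            return s3Client.getObjectMetadata(bucket, awsFilePath).getContentLength();
        }

        @Override
        public long lastModified() {
            return s3Client.getObjectMetadata(bucket, awsFilePath).getLastModified().getTime();
        }
    }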
    

    The Resource you make will hold the AmazonS3 client, which contains all the info your Spring Batch app needs to communicate with S3. Here's what it could look like with Java config:

        reader.setResource(new AmazonS3Resource(amazonS3Client, amazonS3Bucket, inputFile));
    
  • 2021-01-05 07:31

    Another way to read from S3 through a FlatFileItemReader is to set the resource as an InputStreamResource, and then use the S3 client's putObject when you need to upload a stream.

    reader.setResource(new InputStreamResource(inputstream));
    

    Once the stream is populated, you can upload it to S3 with:

    s3client.putObject(bucketname,key,inputstream,metadata);
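
    For the reading direction, a brief sketch of how the input stream might be obtained (assuming the same AmazonS3 client as in the other answers):

    // Fetch the object and wrap its content stream in a Spring Resource.
    // Note: an InputStreamResource can only be read once.
    S3Object object = s3client.getObject(bucketname, key);
    InputStream inputstream = object.getObjectContent();

    FlatFileItemReader<Map<String, Object>> reader = new FlatFileItemReader<>();
    reader.setLineMapper(new JsonLineMapper());
    reader.setResource(new InputStreamResource(inputstream));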
    