Better approach to call external API in apache beam

Submitted by 这一生的挚爱 on 2021-02-11 14:39:47

Question


I have 2 approaches to initialize the HttpClient in order to make an API call from a ParDo in Apache Beam.

Approach 1:

Initialise the HttpClient object in the StartBundle and close the HttpClient in FinishBundle. The code is as follows:

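import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

public class ProcessNewIncomingRequest extends DoFn<String, KV<String, String>> {

        // Fields rather than locals, so processElement and finishBundle can reach them.
        // transient: HttpClient is not Serializable, and a DoFn gets serialized.
        private transient HttpClient client;
        private transient HttpRequest request;

        @StartBundle
        public void startBundle() {
            client = HttpClient.newHttpClient();
            request = HttpRequest.newBuilder()
                                       .uri(URI.create(<Custom_URL>))
                                       .build();
        }

        @ProcessElement
        public void processElement(){
            // Use the client and do an external API call
        }

        @FinishBundle
        public void finishBundle(){
            // HttpClient.close() exists only on Java 21+; on older JDKs, drop the reference instead.
            client.close();
        }
}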

Approach 2:

Have a separate class in which all connections are managed through a connection pool.

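import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

public class ExternalConnection{

       private final HttpClient client = HttpClient.newHttpClient();
       private final HttpRequest request = HttpRequest.newBuilder()
                                       .uri(URI.create(<Custom_URL>))
                                       .build();

       // HttpResponse<String> stands in for the undefined Response type (an assumption)
       public HttpResponse<String> getResponse() throws IOException, InterruptedException {
             // use the client, send request and get response
             return client.send(request, HttpResponse.BodyHandlers.ofString());
       }

}

public class ProcessNewIncomingRequest extends DoFn<String, KV<String, String>> {

        @ProcessElement
        public void processElement() throws IOException, InterruptedException {
             // Note: as written, this builds a new ExternalConnection (and HttpClient) per element
             HttpResponse<String> response = new ExternalConnection().getResponse();
        }
}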

Which of the two approaches above is better in terms of performance and coding design standards?


Answer 1:


Either approach would work; the @StartBundle/@FinishBundle one is more contained IMHO, but it has the disadvantage of not working well if your bundles are very small, since a new client is then created and discarded for every few elements. An even better approach might be to use the DoFn's @Setup/@Teardown methods, which can span an arbitrary number of bundles and are tied to the lifetime of the DoFn instance (leveraging the pooling of DoFn instances that the Beam SDKs already do).
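
For illustration, the @Setup/@Teardown variant might look roughly like the sketch below. It is only a sketch under some assumptions: the class name CallExternalApiFn, the endpoint constructor parameter, and the hasClient() helper are made up for this example, the response is assumed to be a plain string body, and the Java 11+ java.net.http client is assumed as in the question.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Hypothetical DoFn: one HttpClient per DoFn instance, reused across bundles.
class CallExternalApiFn extends DoFn<String, KV<String, String>> {

    // Hypothetical endpoint supplied by the pipeline author.
    private final String endpoint;

    // HttpClient is not Serializable, so it is kept out of the serialized DoFn.
    private transient HttpClient client;

    CallExternalApiFn(String endpoint) {
        this.endpoint = endpoint;
    }

    @Setup
    public void setup() {
        // Called once per DoFn instance, before it processes any bundle.
        client = HttpClient.newHttpClient();
    }

    @ProcessElement
    public void processElement(@Element String element,
                               OutputReceiver<KV<String, String>> out)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(endpoint))
                .build();
        // Blocking call; the same client is reused for every element.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        out.output(KV.of(element, response.body()));
    }

    @Teardown
    public void teardown() {
        // Drop the reference; on Java 21+ client.close() could be called first.
        client = null;
    }

    // Illustration-only helper so the lifecycle can be checked without a runner.
    boolean hasClient() {
        return client != null;
    }
}
```

It would be applied like any other DoFn, e.g. ParDo.of(new CallExternalApiFn("https://example.com/api")), where the URL is a placeholder. Keep in mind that Beam calls @Teardown on a best-effort basis, so the cleanup there should not be something correctness depends on.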



Source: https://stackoverflow.com/questions/62778595/better-approach-to-call-external-api-in-apache-beam
