How to correctly read Flux<DataBuffer> and convert it to a single InputStream

旧巷少年郎 2020-12-05 07:13

I'm using WebClient and a custom BodyExtractor class for my Spring Boot application.

WebClient webClient = WebClient.create();
webClie         


        
6 Answers
  • 2020-12-05 07:38

    This is really not as complicated as other answers imply.

    The only way to stream the data without buffering it all in memory is to use a pipe, as @jin-kwon suggested. However, it can be done very simply by using Spring's BodyExtractors and DataBufferUtils utility classes.

    Example:

    private InputStream readAsInputStream(String url) throws IOException {
        PipedOutputStream osPipe = new PipedOutputStream();
        PipedInputStream isPipe = new PipedInputStream(osPipe);
    
        ClientResponse response = webClient.get().uri(url)
            .accept(MediaType.APPLICATION_XML)
            .exchange()
            .block();
        final int statusCode = response.rawStatusCode();
        // check HTTP status code, can throw exception if needed
        // ....
    
        Flux<DataBuffer> body = response.body(BodyExtractors.toDataBuffers())
            .doOnError(t -> {
                log.error("Error reading body.", t);
                // close pipe to force InputStream to error,
                // otherwise the returned InputStream will hang forever if an error occurs
                try(isPipe) {
                  //no-op
                } catch (IOException ioe) {
                    log.error("Error closing streams", ioe);
                }
            })
            .doFinally(s -> {
                try(osPipe) {
                  //no-op
                } catch (IOException ioe) {
                    log.error("Error closing streams", ioe);
                }
            });
    
        DataBufferUtils.write(body, osPipe)
            .subscribe(DataBufferUtils.releaseConsumer());
    
        return isPipe;
    }
    

    If you don't care about checking the response code or throwing an exception for a failure status code, you can skip the block() call and intermediate ClientResponse variable by using

    flatMap(r -> r.body(BodyExtractors.toDataBuffers()))
    

    instead.
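
    The pipe mechanics can be tried without Spring at all. The following is a minimal sketch using only JDK classes (Java 9+ assumed); PipedStreamSketch, streamOf and readAll are hypothetical names, and a plain thread stands in for the DataBufferUtils.write(...) subscription. It illustrates the two properties the answer relies on: the writer blocks when the reader falls behind, and closing the write end is what lets the reader see end-of-stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class PipedStreamSketch {

    /** Returns an InputStream fed by a background thread that writes the chunks one by one. */
    static InputStream streamOf(List<byte[]> chunks) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);
            Thread producer = new Thread(() -> {
                // try-with-resources closes the write end, which signals EOF to the reader
                try (out) {
                    for (byte[] chunk : chunks) {
                        out.write(chunk); // blocks while the pipe's internal buffer is full
                    }
                } catch (IOException e) {
                    // the reader closed its end early; nothing left to deliver
                }
            });
            producer.setDaemon(true);
            producer.start();
            return in;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Drains a stream into a String; used only to make the result visible. */
    static String readAll(InputStream in) {
        try (in) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> chunks = List.of(
                "<a>".getBytes(StandardCharsets.UTF_8),
                "hello".getBytes(StandardCharsets.UTF_8),
                "</a>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(streamOf(chunks)));
    }
}
```

    The reader and the writer must run on different threads, as they do here; reading and writing a piped stream pair from a single thread can deadlock once the internal buffer fills up.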

  • 2020-12-05 07:38

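    You can use pipes. The snippets below assume static imports of Mono.using, Mono.just, Mono.fromFuture, DataBufferUtils.write, DataBufferUtils.releaseConsumer and CompletableFuture.supplyAsync, plus a logger named log.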

    static <R> Mono<R> pipeAndApply(
            final Publisher<DataBuffer> source, final Executor executor,
            final Function<? super ReadableByteChannel, ? extends R> function) {
        return using(Pipe::open,
                     p -> {
                         executor.execute(() -> write(source, p.sink())
                                 .doFinally(s -> {
                                     try {
                                         p.sink().close();
                                     } catch (final IOException ioe) {
                                         log.error("failed to close pipe.sink", ioe);
                                         throw new RuntimeException(ioe);
                                     }
                                 })
                                 .subscribe(releaseConsumer()));
                         return just(function.apply(p.source()));
                     },
                     p -> {
                         try {
                             p.source().close();
                         } catch (final IOException ioe) {
                             log.error("failed to close pipe.source", ioe);
                             throw new RuntimeException(ioe);
                         }
                     });
    }
    

    Or using CompletableFuture,

    static <R> Mono<R> pipeAndApply(
            final Publisher<DataBuffer> source,
            final Function<? super ReadableByteChannel, ? extends R> function) {
        return using(Pipe::open,
                     p -> fromFuture(supplyAsync(() -> function.apply(p.source())))
                             .doFirst(() -> write(source, p.sink())
                                     .doFinally(s -> {
                                         try {
                                             p.sink().close();
                                         } catch (final IOException ioe) {
                                             log.error("failed to close pipe.sink", ioe);
                                             throw new RuntimeException(ioe);
                                         }
                                     })
                                     .subscribe(releaseConsumer())),
                     p -> {
                         try {
                             p.source().close();
                         } catch (final IOException ioe) {
                             log.error("failed to close pipe.source", ioe);
                             throw new RuntimeException(ioe);
                         }
                     });
    }
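
    For readers unfamiliar with java.nio.channels.Pipe, here is a rough JDK-only sketch of the same shape: one thread writes into the sink while the caller's function reads from the source. NioPipeSketch and its method names are hypothetical, a List<byte[]> stands in for the Publisher<DataBuffer>, and Java 9+ is assumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

final class NioPipeSketch {

    /** Writes the chunks into a pipe on a background thread and applies the function to the read end. */
    static <R> R pipeAndApply(List<byte[]> chunks,
                              Function<? super ReadableByteChannel, ? extends R> function) {
        try {
            Pipe pipe = Pipe.open();
            Thread writer = new Thread(() -> {
                // closing the sink is what makes the source report end-of-stream
                try (Pipe.SinkChannel sink = pipe.sink()) {
                    for (byte[] chunk : chunks) {
                        ByteBuffer buffer = ByteBuffer.wrap(chunk);
                        while (buffer.hasRemaining()) {
                            sink.write(buffer);
                        }
                    }
                } catch (IOException e) {
                    // the source was closed before everything was written
                }
            });
            writer.setDaemon(true);
            writer.start();
            // the source is closed once the function returns, like the cleanup lambda in using(...)
            try (Pipe.SourceChannel source = pipe.source()) {
                return function.apply(source);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Example function: view the channel as an InputStream and read it to the end. */
    static String readAll(ReadableByteChannel channel) {
        try {
            InputStream in = Channels.newInputStream(channel);
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> chunks = List.of(
                "first ".getBytes(StandardCharsets.UTF_8),
                "second".getBytes(StandardCharsets.UTF_8));
        System.out.println(pipeAndApply(chunks, NioPipeSketch::readAll));
    }
}
```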
    
  • 2020-12-05 07:39

    Here is another variant of the approach in the other answers. It is still not memory-friendly, because the reduce only completes after the last buffer has arrived.

    static Mono<InputStream> asStream(WebClient.ResponseSpec response) {
        return response.bodyToFlux(DataBuffer.class)
            .map(b -> b.asInputStream(true))
            .reduce(SequenceInputStream::new);
    }
    
    static void doSome(WebClient.ResponseSpec response) {
        asStream(response)
            .doOnNext(stream -> {
                // do some with stream
            })
            .block();
    }
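
    What reduce(SequenceInputStream::new) does can be seen with plain JDK streams. This is only an illustration under stated assumptions: SequenceReduceSketch is a hypothetical class, a List<byte[]> stands in for the Flux<DataBuffer>, and java.util.stream.Stream.reduce stands in for Flux.reduce. As in the reactive version, no combined stream exists until every chunk has been seen, and an empty input yields no stream at all.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

final class SequenceReduceSketch {

    /** Chains one InputStream per chunk into a single stream; empty if there are no chunks. */
    static Optional<InputStream> asStream(List<byte[]> chunks) {
        return chunks.stream()
                .<InputStream>map(ByteArrayInputStream::new)
                .reduce(SequenceInputStream::new);
    }

    /** Drains a stream into a String; used only to make the result visible. */
    static String readAll(InputStream in) {
        try (in) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> chunks = List.of(
                "one,".getBytes(StandardCharsets.UTF_8),
                "two,".getBytes(StandardCharsets.UTF_8),
                "three".getBytes(StandardCharsets.UTF_8));
        asStream(chunks).map(SequenceReduceSketch::readAll).ifPresent(System.out::println);
    }
}
```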
    
  • 2020-12-05 07:47

    I was able to make it work by using Flux#collect and SequenceInputStream

    @Override
    public Mono<T> extract(ClientHttpResponse response, BodyExtractor.Context context) {
      Flux<DataBuffer> body = response.getBody();
      // collect(...) yields a Mono<InputStreamCollector>, so no next() call is needed
      return body.collect(InputStreamCollector::new,
              (collector, dataBuffer) -> collector.collectInputStream(dataBuffer.asInputStream()))
        .map(collector -> {
          try {
            JAXBContext jc = JAXBContext.newInstance(SomeClass.class);
            Unmarshaller unmarshaller = jc.createUnmarshaller();

            return (T) unmarshaller.unmarshal(collector.getInputStream());
          } catch (Exception e) {
            // map() must not return null, so surface the failure as an error signal
            throw new IllegalStateException("Failed to unmarshal response body", e);
          }
      });
    }
    

    InputStreamCollector.java

    public class InputStreamCollector {
      private InputStream is;
    
      public void collectInputStream(InputStream is) {
        if (this.is == null) {
          this.is = is;
        } else {
          // chain each later buffer behind what has been collected so far
          this.is = new SequenceInputStream(this.is, is);
        }
      }
    
      public InputStream getInputStream() {
        return this.is;
      }
    }
    
  • 2020-12-05 07:52

    A slightly modified version of Bk Santiago's answer makes use of reduce() instead of collect(). Very similar, but doesn't require an extra class:

    Java:

    body.reduce(new InputStream() {
        public int read() { return -1; }
      }, (InputStream s, DataBuffer d) -> new SequenceInputStream(s, d.asInputStream())
    ).flatMap(inputStream -> /* do something with single InputStream */);
    

    Or Kotlin:

    body.reduce(object : InputStream() {
      override fun read() = -1
    }) { s: InputStream, d -> SequenceInputStream(s, d.asInputStream()) }
      .flatMap { inputStream -> /* do something with single InputStream */ }
    

    The only benefit of this approach over collect() is that you don't need a separate class to gather the buffers.

    I created a new empty InputStream(), but if that syntax is confusing, you can use an empty ByteArrayInputStream as the initial value instead: ByteArrayInputStream("".toByteArray()) in Kotlin, or new ByteArrayInputStream(new byte[0]) in Java.
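
    To check that the choice of empty seed makes no difference, here is a small JDK-only sketch. EmptySeedSketch and its methods are hypothetical names, a plain loop stands in for Flux.reduce(initial, accumulator), and InputStream.nullInputStream() (available from Java 11) is included as a third option that the answer does not mention.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class EmptySeedSketch {

    /** Seed written as in the answer: an InputStream that is at end-of-stream from the start. */
    static InputStream anonymousEmpty() {
        return new InputStream() {
            @Override
            public int read() {
                return -1;
            }
        };
    }

    /** Appends each chunk behind the seed, the way the reduce accumulator does. */
    static InputStream fold(InputStream seed, List<byte[]> chunks) {
        InputStream result = seed;
        for (byte[] chunk : chunks) {
            result = new SequenceInputStream(result, new ByteArrayInputStream(chunk));
        }
        return result;
    }

    /** Drains a stream into a String; used only to make the result visible. */
    static String readAll(InputStream in) {
        try (in) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<byte[]> chunks = List.of(
                "ab".getBytes(StandardCharsets.UTF_8),
                "cd".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(fold(anonymousEmpty(), chunks)));
        System.out.println(readAll(fold(new ByteArrayInputStream(new byte[0]), chunks)));
        System.out.println(readAll(fold(InputStream.nullInputStream(), chunks)));
    }
}
```

    One practical difference from the seedless reduce: with a seed, an empty body still produces a (zero-length) InputStream rather than no value at all.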

  • 2020-12-05 08:03

    There's a much cleaner way to do this using the underlying reactor-netty HttpClient directly, instead of using WebClient. The composition hierarchy is like this:

    WebClient -uses-> HttpClient -uses-> TcpClient
    

    Easier to show code than explain:

    HttpClient.create()
        .get()
        .responseContent() // ByteBufFlux
        .aggregate() // ByteBufMono
        .asInputStream() // Mono<InputStream>
        .block() // We got an InputStream, yay!
    

    However, as I've pointed out already, reading from an InputStream is a blocking operation, which defeats the purpose of using a non-blocking HTTP client, not to mention that aggregate() buffers the whole response in memory. See this for a Java NIO vs. IO comparison.
