Catch-all servlet filter that should capture ALL HTML input content for manipulation, works only intermittently

不思量自难忘° 2020-12-15 12:36

I need a servlet filter that will capture all input, then mangle that input, inserting a special token in every form. Imagine that the filter is tied to all requests (E.g.

1 answer
  • 2020-12-15 13:40

    Your initial approach failed because PrintWriter wraps the given ByteArrayOutputStream in a BufferedWriter, which has an internal character buffer of 8192 characters, and you never flush() that buffer before getting the bytes from the ByteArrayOutputStream. In other words, when fewer than about 8192 characters (~8KB) are written to the getWriter() of the response, the wrapped ByteArrayOutputStream never actually gets filled; everything is still sitting in that internal character buffer, waiting to be flushed.

    A fix would be to perform a flush() call before toByteArray() in your MyPrintWriter:

    byte[] toByteArray() {
        pw.flush();
        return baos.toByteArray();
    }
    

    This way the internal character buffer is flushed (i.e. everything is actually written to the wrapped stream). This also explains why it works when you write to getOutputStream(): that path doesn't use the PrintWriter, so nothing gets held back in an internal buffer.
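    The buffering effect can be reproduced outside a servlet container. The following is only a minimal sketch using plain java.io; the class name PrintWriterBufferDemo and the method capturedSizes are hypothetical names chosen for illustration and are not part of the code in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

class PrintWriterBufferDemo {

    /**
     * Writes the text through a PrintWriter that wraps a ByteArrayOutputStream
     * and returns { bytes visible before flush(), bytes visible after flush() }.
     */
    static int[] capturedSizes(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // This constructor does not enable auto-flush and buffers internally.
        PrintWriter pw = new PrintWriter(baos);
        pw.write(text);
        int before = baos.size(); // still 0 for short input: the data sits in the writer's buffer
        pw.flush();
        int after = baos.size();  // now the bytes have reached the stream
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] sizes = capturedSizes("<form></form>");
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

    For short input the first number is 0, which is exactly the "empty capture" symptom described above.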


    Unrelated to the concrete problem: this approach has some severe problems. It doesn't respect the response character encoding when constructing the PrintWriter; it relies on the platform default instead. You should wrap the ByteArrayOutputStream in an OutputStreamWriter, which can take a character encoding. As it stands, any written Unicode characters may end up as Mojibake, so this approach isn't ready for World Domination.
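    To illustrate the encoding point, here is a small sketch in plain java.io that captures text with an explicit charset and then decodes it with the same and with a different charset. The class EncodingAwareCaptureDemo and its capture method are hypothetical names, and UTF-8 merely stands in for whatever the response character encoding happens to be.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingAwareCaptureDemo {

    /** Captures the text as bytes, encoded with the given charset. */
    static byte[] capture(String text, Charset charset) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // OutputStreamWriter takes the charset explicitly instead of the platform default.
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(baos, charset));
        pw.write(text);
        pw.flush(); // without this the bytes would stay in the writer's buffer
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // e with acute accent
        byte[] bytes = capture(original, StandardCharsets.UTF_8);
        // Decoding with the charset used for encoding restores the text.
        boolean roundTrip = new String(bytes, StandardCharsets.UTF_8).equals(original);
        // Decoding with a different charset yields Mojibake.
        boolean garbled = !new String(bytes, StandardCharsets.ISO_8859_1).equals(original);
        System.out.println("round trip ok: " + roundTrip + ", mismatch garbles: " + garbled);
    }
}
```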

    Also, this approach makes it possible to call both getWriter() and getOutputStream() on the same response, while that's considered an illegal state (precisely to avoid this kind of buffering and encoding trouble).


    Update: as per the comment, here's a full rewrite of the response wrapper, showing the right way, hopefully in a more self-explanatory way than the code you have so far:

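    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;

    // javax.servlet.* is assumed here; newer containers use the jakarta.servlet.* namespace instead.
    import javax.servlet.ServletOutputStream;
    import javax.servlet.http.HttpServletResponse;
    import javax.servlet.http.HttpServletResponseWrapper;

    public class CapturingResponseWrapper extends HttpServletResponseWrapper {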
    
        private final ByteArrayOutputStream capture;
        private ServletOutputStream output;
        private PrintWriter writer;
    
        public CapturingResponseWrapper(HttpServletResponse response) {
            super(response);
            capture = new ByteArrayOutputStream(response.getBufferSize());
        }
    
        @Override
        public ServletOutputStream getOutputStream() {
            if (writer != null) {
                throw new IllegalStateException("getWriter() has already been called on this response.");
            }
    
            if (output == null) {
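                // Note: on Servlet 3.1+ ServletOutputStream also declares the abstract methods
                // isReady() and setWriteListener(WriteListener), which must be implemented here too.
                output = new ServletOutputStream() {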
                    @Override
                    public void write(int b) throws IOException {
                        capture.write(b);
                    }
                    @Override
                    public void flush() throws IOException {
                        capture.flush();
                    }
                    @Override
                    public void close() throws IOException {
                        capture.close();
                    }
                };
            }
    
            return output;
        }
    
        @Override
        public PrintWriter getWriter() throws IOException {
            if (output != null) {
                throw new IllegalStateException("getOutputStream() has already been called on this response.");
            }
    
            if (writer == null) {
                writer = new PrintWriter(new OutputStreamWriter(capture, getCharacterEncoding()));
            }
    
            return writer;
        }
    
        @Override
        public void flushBuffer() throws IOException {
            super.flushBuffer();
    
            if (writer != null) {
                writer.flush();
            }
            else if (output != null) {
                output.flush();
            }
        }
    
        public byte[] getCaptureAsBytes() throws IOException {
            if (writer != null) {
                writer.close();
            }
            else if (output != null) {
                output.close();
            }
    
            return capture.toByteArray();
        }
    
        public String getCaptureAsString() throws IOException {
            return new String(getCaptureAsBytes(), getCharacterEncoding());
        }
    
    }
    

    Here's how you're supposed to use it:

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        CapturingResponseWrapper capturingResponseWrapper = new CapturingResponseWrapper((HttpServletResponse) response);
        chain.doFilter(request, capturingResponseWrapper);
        String content = capturingResponseWrapper.getCaptureAsString(); // This uses response character encoding.
        String replacedContent = content.replaceAll("(?i)</form(\\s)*>", "<input type=\"hidden\" name=\"zval\" value=\"fromSiteZ123\"/></form>");
        response.getWriter().write(replacedContent); // Don't ever use String#getBytes() without specifying character encoding!
    }
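
    The replacement step can be tried in isolation, without a container. The sketch below is one possible variant, not the code from the filter above: the class FormTokenInjector and the method injectToken are hypothetical names, the pattern is precompiled, and Matcher.quoteReplacement is used so that a token value containing $ or \ is inserted literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormTokenInjector {

    // Matches a closing form tag in any letter case, with optional whitespace before '>'.
    private static final Pattern FORM_CLOSE = Pattern.compile("(?i)</form\\s*>");

    /** Inserts a hidden input with the given name and value before every closing form tag. */
    static String injectToken(String html, String name, String value) {
        String hidden = "<input type=\"hidden\" name=\"" + name + "\" value=\"" + value + "\"/></form>";
        // quoteReplacement keeps '$' and '\' in the token from being read as group references.
        return FORM_CLOSE.matcher(html).replaceAll(Matcher.quoteReplacement(hidden));
    }

    public static void main(String[] args) {
        System.out.println(injectToken("<form><input/></FORM >", "zval", "fromSiteZ123"));
    }
}
```

    Note that the sketch does not HTML-escape the name or value; it assumes they are trusted constants, as in the filter above.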
    