Using zlib filter with a socket pair

断了今生、忘了曾经 提交于 2019-12-04 06:17:53

Looking through the C source code, the problem is that the filter always lets zlib's deflate() function decide how much data to accumulate before producing compressed output. The deflate filter does not create a new data bucket to pass on unless deflate() outputs some data (see line 235) or the PSFS_FLAG_FLUSH_CLOSE flag bit is set (line 250). That's why you only see the header bytes until you close $in; the first call to deflate() outputs the two header bytes, so data->strm.avail_out is 2 and a new bucket is created for these two bytes to pass on.

Note that fflush() does not work because of a known issue with the zlib filter. See: Bug #48725 Support for flushing in zlib stream.

Unfortunately, there does not appear to be a nice work-around to this. I started writing a filter in PHP by extending php_user_filter, but quickly ran into the problem that php_user_filter does not expose the flag bits, only whether flags & PSFS_FLAG_FLUSH_CLOSE (the fourth parameter to the filter() method, a boolean argument commonly named $closing). You would need to modify the C sources yourself to fix Bug #48725. Alternatively, re-write it.

Personally I would consider re-writing it because there seems to be a few eyebrow-raising issues with the code:

  • status = deflate(&(data->strm), flags & PSFS_FLAG_FLUSH_CLOSE ? Z_FULL_FLUSH : (flags & PSFS_FLAG_FLUSH_INC ? Z_SYNC_FLUSH : Z_NO_FLUSH)); seems odd because when writing, I don't know why flags would be anything other than PSFS_FLAG_NORMAL. Is it possible to write & flush at the same time? In any case, handling the flags should be done outside of the while loop through the "in" bucket brigade, like how PSFS_FLAG_FLUSH_CLOSE is handled outside of this loop.
  • Line 221, the memcpy to data->strm.next_in seems to ignore the fact that data->strm.avail_in may be non-zero, so the compressed output might skip some data of a write. See, for example, the following text from the zlib manual:

    If not all input can be processed (because there is not enough room in the output buffer), next_in and avail_in are updated and processing will resume at this point for the next call of deflate().

    In other words, it is possible that avail_in is non-zero.

  • The if statement on line 235, if (data->strm.avail_out < data->outbuf_len) should probably be if (data->strm.avail_out) or perhaps if (data->strm.avail_out > 2).
  • I'm not sure why *bytes_consumed = consumed; isn't *bytes_consumed += consumed;. The example streams at http://www.php.net/manual/en/function.stream-filter-register.php all use += to update $consumed.

EDIT: *bytes_consumed = consumed; is correct. The standard filter implementations all use = rather than += to update the size_t value pointed to by the fifth parameter. Also, even though $consumed += ... on the PHP side effectively translates to += on the size_t (see lines 206 and 231 of ext/standard/user_filters.c), the native filter function is called with either a NULL pointer or a pointer to a size_t set to 0 for the fifth argument (see lines 361 and 452 of main/streams/filter.c).

mop

I had worked on the PHP source code and found a fix.

To understand what happens I had traced the code during a

....
for ($i = 0 ; $i < 3 ; $i++) {
    fwrite($s[0], ...);
    fread($s[1], ...);
    fflush($s[0], ...);
    fread($s[1], ...);
    }

loop and I found that the deflate function is never called with the Z_SYNC_FLUSH flag set because no new data are present into the backets_in brigade.

My fix is to manage the (PSFS_FLAG_FLUSH_INC flag is set AND no iterations are performed on deflate function case) extending the

if (flags & PSFS_FLAG_FLUSH_CLOSE) {

managing FLUSH_INC too:

if (flags & PSFS_FLAG_FLUSH_CLOSE || (flags & PSFS_FLAG_FLUSH_INC && to_be_flushed)) {

This downloadable patch is for debian squeeze version of PHP but the current git version of the file is closer to it so I suppose to port the fix is simply (few lines).

If some side effect arises please contact me.

You need to close the stream after the write to flush it before the data will come in from the read.

list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
                                     STREAM_SOCK_STREAM,
                                     STREAM_IPPROTO_IP);

$params = array('level' => 6, 'window' => 15, 'memory' => 9);

stream_filter_append($out, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($out, 0);
stream_set_blocking($in, 0);

fwrite($out, 'Some big long string.');
fclose($out);
$compressed = fread($in, 1024);
echo "Compressed:" . bin2hex($compressed) . "<br>\n";


list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
                                     STREAM_SOCK_STREAM,
                                     STREAM_IPPROTO_IP);

$params = array('level' => 6, 'window' => 15, 'memory' => 9);

stream_filter_append($out, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($out, 0);
stream_set_blocking($in, 0);


fwrite($out, 'Some big long string, take two.');
fclose($out);
$compressed = fread($in, 1024);
echo "Compressed:" . bin2hex($compressed) . "<br>\n";

list($in, $out) = stream_socket_pair(STREAM_PF_UNIX,
                                     STREAM_SOCK_STREAM,
                                     STREAM_IPPROTO_IP);

$params = array('level' => 6, 'window' => 15, 'memory' => 9);

stream_filter_append($out, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
stream_set_blocking($out, 0);
stream_set_blocking($in, 0);

fwrite($out, 'Some big long string - third time is the charm?');
fclose($out);
$compressed = fread($in, 1024);
echo "Compressed:" . bin2hex($compressed) . "<br>\n";

That produces: Compressed:789c0bcecf4d5548ca4c57c8c9cf4b57282e29cacc4bd70300532b079c Compressed:789c0bcecf4d5548ca4c57c8c9cf4b57282e29cacc4bd7512849cc4e552829cfd70300b1b50b07 Compressed:789c0bcecf4d5548ca4c57c8c9cf4b57282e29ca0452ba0a25199945290a259940c9cc62202f55213923b128d71e008e4c108c

Also I switched the $in and $out because writing to $in confused me.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!