Convert streamed buffers to a UTF-8 string

执笔经年 2020-11-30 17:34

I want to make an HTTP request using Node.js to load some text from a web server. Since the response can contain a lot of text (several megabytes), I want to process each text chunk s

2 Answers
  •  悲&欢浪女
    2020-11-30 18:09

    Single Buffer

    If you have a single Buffer you can use its toString method that will convert all or part of the binary contents to a string using a specific encoding. It defaults to utf8 if you don't provide a parameter, but I've explicitly set the encoding in this example.

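    var http = require('http');

    // reqOptions is defined elsewhere (host, path, and so on).
    var req = http.request(reqOptions, function(res) {
        // ...

        res.on('data', function(chunk) {
            // Decode this chunk on its own as UTF-8.
            var textChunk = chunk.toString('utf8');
            // process utf8 text chunk
        });
    });
    req.end(); // send the request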
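
To see the "all or part" behaviour in isolation, here is a minimal sketch that needs no network. The sample text and variable names are made up for illustration; `Buffer.from` and the `toString(encoding, start, end)` form are standard Node.js. Note that `start` and `end` are byte offsets, not character offsets.

```javascript
// Hypothetical sample text; 'é' and 'ö' each take two bytes in UTF-8.
var buf = Buffer.from('héllo wörld', 'utf8');

var whole = buf.toString();                 // encoding defaults to 'utf8'
var explicit = buf.toString('utf8');        // same result, encoding spelled out
var firstWord = buf.toString('utf8', 0, 6); // bytes 0..5: h, é (2 bytes), l, l, o

console.log(whole);
console.log(firstWord);
console.log(buf.length);                    // byte length, not character count
```

Because the offsets count bytes, a range that ends in the middle of a multi-byte character will not decode cleanly, which is the same problem that streamed chunks run into.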
    

    Streamed Buffers

    If you have streamed buffers, as in the question above, a multi-byte UTF-8 character may be split across chunks: its first byte can sit at the end of one Buffer (chunk) and the next byte at the start of the following Buffer. In that case you should use a StringDecoder:

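    var http = require('http');
    var StringDecoder = require('string_decoder').StringDecoder;

    var req = http.request(reqOptions, function(res) {
        // ...
        // One decoder per response, so leftover bytes carry over between chunks.
        var decoder = new StringDecoder('utf8');

        res.on('data', function(chunk) {
            // Returns only complete characters; trailing partial bytes are held back.
            var textChunk = decoder.write(chunk);
            // process utf8 text chunk
        });
    });
    req.end(); // send the request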
    

    This way the bytes of an incomplete character are buffered by the StringDecoder until all of the bytes that character requires have been written to the decoder.
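
The difference is easy to reproduce without a server. The sketch below splits the three bytes of the euro sign (0xE2 0x82 0xAC) across two hand-made chunks; the chunk contents and the helper names `decodeNaively` and `decodeWithDecoder` are invented for this illustration. It also calls `decoder.end()`, which returns whatever incomplete input is still buffered once no more chunks will arrive.

```javascript
var StringDecoder = require('string_decoder').StringDecoder;

// Hypothetical chunks: "a€b", with "€" (E2 82 AC) split after its second byte.
var chunks = [Buffer.from([0x61, 0xE2, 0x82]), Buffer.from([0xAC, 0x62])];

// Decodes every chunk independently, as chunk.toString('utf8') would.
function decodeNaively(list) {
    return list.map(function(chunk) {
        return chunk.toString('utf8');
    }).join('');
}

// Feeds every chunk through one shared StringDecoder.
function decodeWithDecoder(list) {
    var decoder = new StringDecoder('utf8');
    var text = '';
    list.forEach(function(chunk) {
        text += decoder.write(chunk); // incomplete trailing bytes are held back
    });
    return text + decoder.end();      // flush anything still buffered
}

console.log(JSON.stringify(decodeNaively(chunks)));     // contains U+FFFD replacement characters
console.log(JSON.stringify(decodeWithDecoder(chunks))); // "a€b"
```

In a real response handler, the natural place for the `decoder.end()` call would be the response's `'end'` event, so that a stream cut off mid-character still yields its remaining bytes.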
