nodejs encoding using request

面向向阳花 2020-12-09 18:40

I am trying to get the correct encoding with request.

request.get({
    "uri":'http://www.bold.dk/tv/',
    "encoding": "text/html;charset='charset=u


        
3 Answers
  • 2020-12-09 19:20

    You can use iconv-lite to convert this. You also need to stop request from decoding the body as UTF-8 (its default) by setting the encoding property to null, so that the body arrives as a raw Buffer. Therefore you should do:

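    var request = require('request');
    var iconv = require('iconv-lite');

    request.get({
        uri: 'http://www.bold.dk/tv/',
        encoding: null // return the body as a raw Buffer instead of a UTF-8 string
      },
      function(err, resp, body){
        if (err) { return console.error(err); }
        // Decode the raw bytes with the charset the page actually uses
        var bodyWithCorrectEncoding = iconv.decode(body, 'iso-8859-1');
        console.log(bodyWithCorrectEncoding);
      }
    );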
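To see why the default UTF-8 decoding garbles such a page, here is a minimal sketch that needs no network access. It uses Node's built-in 'latin1' Buffer encoding as a stand-in for iconv-lite's 'iso-8859-1'; the sample word is only an illustration and is not taken from the page in the question.

```javascript
// Sample Danish text, written with escapes so the file's own encoding does not matter.
const sample = 'bl\u00e5b\u00e6rgr\u00f8d';

// Encode it the way an ISO-8859-1 server would: one byte per character.
const raw = Buffer.from(sample, 'latin1');

// Decoding those bytes as UTF-8 (request's default) cannot interpret the
// non-ASCII bytes, so they become U+FFFD replacement characters.
const wrong = raw.toString('utf8');

// Decoding with the charset that was actually used recovers the text.
const right = raw.toString('latin1');

console.log(wrong);
console.log(right);
```

This is the reason for encoding: null in the answer above: once request has converted the bytes to a UTF-8 string, the original byte values are lost and cannot be re-decoded.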
    
  • 2020-12-09 19:28

    Maybe the trouble is the 'Accept-Encoding' header. Suppose you are sending a header like 'Accept-Encoding': 'gzip,deflate'.

    In that case the server may return a compressed body, and there are two ways to fix this:

    1. Remove this header
    2. Use the following code to unzip the data:

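      const zlib = require('zlib')

      // Assumption: the callback receives the raw response stream, as with Node's http.request / https.request
      const req = request(options, res => {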
          let buffers = []
          let bufferLength = 0
          let strings = []
      
          const getData = chunk => {
              if (!Buffer.isBuffer(chunk)) {
                  strings.push(chunk)
              } else if (chunk.length) {
                  bufferLength += chunk.length
                  buffers.push(chunk)
              }
          }
      
          const endData = () => {
              let response = {code: 200, body: ''}
              if (bufferLength) {
                  response.body = Buffer.concat(buffers, bufferLength)
                  if (options.encoding !== null) {
                      response.body = response.body.toString(options.encoding)
                  }
                  buffers = []
                  bufferLength = 0
              } else if (strings.length) {
                  if (options.encoding === 'utf8' && strings[0].length > 0 && strings[0][0] === '\uFEFF') {
                      strings[0] = strings[0].substring(1)
                  }
                  response.body = strings.join('')
              }
              console.log('response', response)
          };
      
          switch (res.headers['content-encoding']) {
              // or, just use zlib.createUnzip() to handle both cases
              case 'gzip':
                  res.pipe(zlib.createGunzip())
                      .on('data', getData)
                      .on('end', endData)
                  break;
              case 'deflate':
                  res.pipe(zlib.createInflate())
                      .on('data', getData)
                      .on('end', endData)
                  break;
              default:
                  // no Content-Encoding: the body is not compressed, so read it directly
                  res.on('data', getData)
                      .on('end', endData)
                  break;
          }
      });
      req.end() // finish sending the request (required when using http.request)
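The comment in the switch mentions that zlib can handle both compressed cases with a single decoder. A minimal sketch of that idea for a body that has already been fully buffered, using the synchronous zlib.unzipSync; the helper name decompressBody is hypothetical and not part of request or zlib.

```javascript
const zlib = require('zlib');

// Hypothetical helper: decompress a buffered body according to the value of
// the response's Content-Encoding header.
function decompressBody(body, contentEncoding) {
  switch ((contentEncoding || '').toLowerCase()) {
    case 'gzip':
    case 'deflate':
      // unzipSync auto-detects gzip vs. zlib-wrapped deflate from the header bytes.
      return zlib.unzipSync(body);
    default:
      // No (or unrecognised) Content-Encoding: assume the body is not compressed.
      return body;
  }
}

// Simulate the three kinds of response body without touching the network.
const page = Buffer.from('<html>hello</html>', 'utf8');
const viaGzip = decompressBody(zlib.gzipSync(page), 'gzip').toString('utf8');
const viaDeflate = decompressBody(zlib.deflateSync(page), 'deflate').toString('utf8');
const viaPlain = decompressBody(page, undefined).toString('utf8');

console.log(viaGzip, viaDeflate, viaPlain);
```

Note that the uncompressed case has to bypass zlib entirely; feeding plain bytes into an inflate stream raises an error instead of passing them through.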
      
  • 2020-12-09 19:38

    I had the same problem with request v2.88.0.

    Based on woolfi makkinan's answer, I found a simple way to solve the problem.

    request.get({
        "uri": 'http://www.bold.dk/tv/',
        "encoding": "text/html;charset='charset=utf-8'",
        "gzip": true // notice this config
      },
      function(err, resp, body){    
        console.log(body);
      }
    );
    

    Adding gzip: true to the request options makes request decompress the gzip response itself, so the body can then be converted to a string correctly.
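If it is unclear whether a garbled body is a charset problem or a compression problem, one rough check is to look at the first two bytes of the raw body: gzip data always starts with 0x1f 0x8b. A small sketch under that assumption; looksGzipped is a hypothetical helper name, and the sample text is illustrative only.

```javascript
const zlib = require('zlib');

// Hypothetical helper: gzip streams always begin with the magic bytes 0x1f 0x8b.
function looksGzipped(body) {
  return Buffer.isBuffer(body) && body.length >= 2 &&
    body[0] === 0x1f && body[1] === 0x8b;
}

const text = 'Hej verden';
const compressed = zlib.gzipSync(Buffer.from(text, 'utf8'));

// Converting the still-compressed bytes to a string yields unreadable output...
const garbled = compressed.toString('utf8');

// ...while decompressing first gives back the original text.
const fixed = zlib.gunzipSync(compressed).toString('utf8');

console.log(looksGzipped(compressed), looksGzipped(Buffer.from(text, 'utf8')));
console.log(fixed);
```

To run this check against a real response, the body has to be requested as a Buffer (encoding: null), since a body already converted to a string no longer holds the original bytes.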
