Question:

I'm trying to load a UTF8 json file from disk using node.js (0.10.29) on Windows 8.1. The following is the code that runs:

var http = require('http');
var utils = require('util');
var path = require('path');
var fs = require('fs');

var myconfig;
fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) {
        console.log("ERROR: Configuration load - " + err);
        throw err;
    } else {
        try {
            myconfig = JSON.parse(data);
            console.log("Configuration loaded successfully");
        }
        catch (ex) {
            // Log the parse exception (ex); err is null on this branch.
            console.log("ERROR: Configuration parse - " + ex);
        }
    }
});

I get the following error when I run this:

Now, when I change the file encoding (using Notepad++) to ANSI, it works without a problem.

Any ideas why this is the case? Development is being done on Windows, but the final solution will be deployed to a variety of non-Windows servers. I'm worried that I'll run into issues on the server end if I deploy an ANSI file to Linux, for example.

According to my searches here and via Google the code should work on Windows as I am specifically telling it to expect a UTF-8 file.

Sample config I am reading:

{
    "ListenIP4": "10.10.1.1",
    "ListenPort": 8080
}

Answer 1:

Per the Node.js issue "fs.readFileSync(filename, 'utf8') doesn't strip BOM markers #1918", fs.readFile is working as designed: the BOM, if present, is not stripped from the start of a UTF-8 file. Handling it is at the discretion of the developer.
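To make the behavior concrete, here is a minimal sketch that decodes the bytes of a BOM-prefixed file in memory, without touching the disk. It assumes a Node.js version that provides Buffer.from (on 0.10.x, new Buffer([...]) would be the equivalent); the variable names are illustrative only.

```javascript
// Bytes of a UTF-8 file that starts with a BOM (EF BB BF) followed by "{}".
var bytes = Buffer.from([0xEF, 0xBB, 0xBF, 0x7B, 0x7D]);

// Decoding as UTF-8 keeps the BOM as the single character U+FEFF.
var text = bytes.toString('utf8');
var firstCodeUnit = text.charCodeAt(0).toString(16);

console.log(text.length);    // 3: U+FEFF, "{", "}"
console.log(firstCodeUnit);  // feff
```

The decoded string is one character longer than the visible JSON, and that extra leading character is what JSON.parse later sees.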

Possible workarounds:

data = data.replace(/^\uFEFF/, ''); per https://github.com/joyent/node/issues/1918#issuecomment-2480359
Transform the incoming stream to remove the BOM header with the NPM module bomstrip per https://github.com/joyent/node/issues/1918#issuecomment-38491548

What you are getting is the byte order mark (BOM) at the start of the UTF-8 file. When JSON.parse sees this, it throws a syntax error (read: an "unexpected character" error). You must strip the byte order mark from the data before passing it to JSON.parse:

fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) throw err;
    // note: with the 'utf8' argument, data is already a string, not a Buffer
    myconfig = JSON.parse(data.replace(/^\uFEFF/, ''));
});
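If the same fix is needed in more than one place, it could be wrapped in a small helper. The sketch below is one way to do that; stripBom and parseJsonText are hypothetical names, not part of Node.js, and the input string stands in for what fs.readFile would return for a BOM-prefixed copy of the sample config.

```javascript
// Hypothetical helper: remove a single leading U+FEFF, if present.
function stripBom(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}

// Hypothetical helper: parse JSON text that may or may not start with a BOM.
function parseJsonText(text) {
    return JSON.parse(stripBom(text));
}

// Simulated file contents: BOM followed by the sample config.
var withBom = '\uFEFF{ "ListenIP4": "10.10.1.1", "ListenPort": 8080 }';

// Parsing the raw text fails, because U+FEFF is not valid JSON whitespace.
var rawParseFailed = false;
try {
    JSON.parse(withBom);
} catch (ex) {
    rawParseFailed = ex instanceof SyntaxError;
}

// Parsing after stripping the BOM succeeds.
var config = parseJsonText(withBom);
console.log(rawParseFailed);
console.log(config.ListenPort);
```

Because stripBom leaves BOM-free input untouched, the same call works whether or not the editor saved the file with a BOM.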

Answer 2:

To get this to work without code changes, I had to change the file encoding from "UTF-8" to "UTF-8 without BOM" using Notepad++ (I assume any decent text editor, though not Notepad, can save with this encoding).

This solution meant that the deployment guys could deploy to Unix without a hassle, and I could develop without errors during the reading of the file.
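To confirm that a file really was saved without a BOM before it is deployed, the first three bytes can be inspected directly. This is a rough sketch, assuming a Node.js version with Buffer.from; hasUtf8Bom and fileHasUtf8Bom are hypothetical helper names.

```javascript
var fs = require('fs');

// Hypothetical helper: true if the buffer starts with the UTF-8 BOM bytes EF BB BF.
function hasUtf8Bom(buffer) {
    return buffer.length >= 3 &&
        buffer[0] === 0xEF && buffer[1] === 0xBB && buffer[2] === 0xBF;
}

// Hypothetical helper: read a file as raw bytes (no encoding argument) and check it.
function fileHasUtf8Bom(filePath) {
    return hasUtf8Bom(fs.readFileSync(filePath));
}

// In-memory samples standing in for files saved with and without a BOM.
var sampleWithBom = Buffer.from([0xEF, 0xBB, 0xBF, 0x7B, 0x7D]);
var sampleWithoutBom = Buffer.from('{}', 'utf8');

console.log(hasUtf8Bom(sampleWithBom));
console.log(hasUtf8Bom(sampleWithoutBom));
```

For example, fileHasUtf8Bom('./myconfig.json') could be called from a pre-deployment check, and a true result would mean the file needs to be re-saved as "UTF-8 without BOM".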

In terms of reading the file, the other result I sometimes got while trying various encoding options was a question mark prepended to the file contents. Naturally, with a question mark or stray ANSI characters prepended, JSON.parse fails.
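One plausible explanation for those stray leading characters (an assumption, since the answer does not say which encodings were tried) is that the three BOM bytes were decoded with a single-byte encoding. The sketch below decodes the same bytes as 'latin1', an encoding name available in newer Node.js releases ('binary' is the older name).

```javascript
// BOM bytes (EF BB BF) followed by "{}", as they would sit on disk.
var bomBytes = Buffer.from([0xEF, 0xBB, 0xBF, 0x7B, 0x7D]);

// A single-byte decoding turns each BOM byte into its own character.
var asLatin1 = bomBytes.toString('latin1');

// Collect the code points of the three leading characters.
var codes = [];
for (var i = 0; i < 3; i++) {
    codes.push(asLatin1.charCodeAt(i).toString(16));
}

console.log(asLatin1.length);
console.log(codes.join(' '));
```

Here the BOM shows up as three separate characters (U+00EF, U+00BB, U+00BF) in front of the JSON, which JSON.parse rejects just as it rejects U+FEFF.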

Hope this helps someone!

Source: node.js readfile error with utf8 encoded file on windows