encoding

Convert TXT File of Unknown Encoding to String

女生的网名这么多〃 submitted on 2020-01-11 09:18:47
Question: How can I convert plain text (.txt) files to a string if the encoding is unknown? I'm working on a feature that lets users import .txt files into my app. This means the file could have been created in any number of apps, using any of the encodings considered valid for a plain text file. My understanding is that this could include ASCII, UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, or even EBCDIC. Things had been going well using the
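
A common first step for text of unknown encoding is to look for a byte order mark (BOM) and only then fall back to guessing. The following is a minimal Python sketch of that idea, not the asker's code: the function name `decode_unknown` and the `cp1252` fallback are assumptions chosen for illustration, and EBCDIC is not handled.

```python
import codecs

# BOMs are checked with UTF-32 first, because the UTF-32LE BOM
# (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_unknown(data: bytes, fallback: str = "cp1252") -> str:
    """Decode bytes of unknown encoding (hypothetical helper)."""
    # 1. A BOM, if present, identifies the encoding; these codecs strip it.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return data.decode(name)
    # 2. No BOM: strict UTF-8 also covers plain ASCII.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # 3. Last resort: an assumed single-byte fallback encoding.
        return data.decode(fallback, errors="replace")
```

Text without a BOM that is not valid UTF-8 cannot be identified with certainty, so the fallback step is a guess that an app would probably want to make configurable.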

How to save pdf in proper encoding via nodejs

a 夏天 submitted on 2020-01-11 06:28:06
Question: I'm trying to download a PDF file from a website with my script, but the file gets corrupted in the process, and I'm fairly sure the cause is that the wrong encoding is being used. I'm using the request library to download the file, and I've set the Content-type to application-pdf. My code is pretty simple: var fs = require('fs'); var request = require("request"); request({uri: 'xxxxxxxxxxxxxx.pdf', headers: { 'Content-type' : 'applcation/pdf' }} , function (error, response, body) { if (

Encoding conversion of a fetch response

僤鯓⒐⒋嵵緔 submitted on 2020-01-11 05:58:08
Question: Inside a React Native method I'm fetching an XML file encoded in ISO-8859-1. Once the fetch completes, I try to convert it to UTF-8. Here is the code: const iconv = require('iconv-lite'); fetch('http://www.band.uol.com.br/rss/colunista_64.xml', { headers: { "Content-type": "text/xml; charset=ISO-8859-1" } }) .then(res => res.text()) .then(text => { const decodedText = iconv.decode(Buffer.from(text, 'latin1'), 'latin1') , output = iconv.encode(decodedText, 'utf8') console.log(output

URL to URI encoding changes a “%3D” to “%253D”

浪子不回头ぞ submitted on 2020-01-10 18:17:15
Question: I'm having trouble encoding a URL to a URI: mUrl = "A string url that needs to be encoded for use in a new HttpGet()"; URL url = new URL(mUrl); URI uri = new URI(url.getProtocol(), url.getAuthority(), url.getPath(), url.getQuery(), null); This does not do what I expect for the following URL. Passing in the String: http://m.bloomingdales.com/img?url=http%3A%2F%2Fimages.bloomingdales.com%2Fis%2Fimage%2FBLM%2Fproducts%2F3%2Foptimized%2F1140443_fpx.tif%3Fwid%3D52%26qlt%3D90%2C0%26layer%3Dcomp
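
The `%3D` to `%253D` change in the title is the usual signature of double encoding: an encoder that escapes every `%` turns the `%` of an existing escape into `%25`. The Python sketch below illustrates the effect with `urllib.parse` rather than the Java `URI` class from the question. The helper name `encode_once` and the `images.example.com` value are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_once(value: str) -> str:
    """Percent-encode a value that may already be encoded (hypothetical helper).

    Decoding first means existing escapes are not escaped a second time.
    """
    return quote(unquote(value), safe="")


# A query value that has already been percent-encoded once.
already_encoded = "http%3A%2F%2Fimages.example.com%2Fa.tif%3Fwid%3D52"

# Encoding it again escapes each '%' as '%25', so '%3D' becomes '%253D'.
double_encoded = quote(already_encoded, safe="")

# Decoding before encoding leaves the single-encoded value unchanged.
single_encoded = encode_once(already_encoded)
```

The general rule this suggests is to pass either fully decoded components to an encoder that escapes `%`, or an already-encoded string to a parser that takes it verbatim, but not to mix the two.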

Nodejs: convert string to buffer

痴心易碎 submitted on 2020-01-10 17:53:07
Question: I'm trying to write a string to a socket (the socket is called "response"). Here is the code I have so far (I'm trying to implement a byte-caching proxy...): var http = require('http'); var sys=require('sys'); var localHash={}; http.createServer(function(request, response) { var proxy = http.createClient(80, request.headers['host']) var proxy_request = proxy.request(request.method, request.url, request.headers); proxy_request.addListener('response', function (proxy_response) { proxy_response

c# Detect xml encoding from Byte Array?

一笑奈何 submitted on 2020-01-10 01:59:12
Question: I have a byte array, and I know it contains an XML-serialized object. Is there any way to get the encoding from it? I'm not going to deserialize it, but I'm saving it in an XML field on a SQL Server, so I need to convert it to a string. Answer 1: You could look at the first 40-ish bytes. They should contain the document declaration (assuming it has a document declaration), which should either contain the encoding or you can assume it's UTF-8 or UTF-16, which should be obvious
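
The approach in the answer (inspect the first bytes for a BOM or an XML declaration, otherwise assume UTF-8 or UTF-16) can be sketched as follows. This is an illustrative Python version, not the C# from the thread. The function name `sniff_xml_encoding`, the 200-byte window, and the regular expression are assumptions, and declarations that are themselves stored as UTF-16 without a BOM are not handled.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration in
# an ASCII-compatible byte stream.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._\-]*)["\']'
)


def sniff_xml_encoding(data: bytes, default: str = "utf-8") -> str:
    """Guess the encoding of serialized XML from its leading bytes."""
    # A UTF-16 BOM settles the question; a UTF-8 BOM is skipped so the
    # declaration that follows it can still be matched.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # Only the start of the document is inspected.
    match = _DECL.match(data[:200])
    if match:
        return match.group(1).decode("ascii").lower()
    # No declaration: fall back to the assumed default.
    return default
```

The returned name can then be used to decode the whole array, for example `data.decode(sniff_xml_encoding(data))`.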

How to remove non-printable/invisible characters in ruby?

时间秒杀一切 submitted on 2020-01-10 01:58:12
Question: Sometimes I have evil non-printable characters in the middle of a string. These strings are user input, so my program must handle them gracefully instead of trying to fix the source of the problem. For example, they can contain a zero-width no-break space in the middle of the string. While parsing a .po file, one problematic part was the string "he is a man of god" in the middle of the file. While everything seems correct, inspecting it with irb shows: "he is a man of god"
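
One way to handle characters such as the zero-width no-break space (U+FEFF) is to filter by Unicode general category. The sketch below is in Python rather than the Ruby of the question. The helper name `strip_invisible` and the choice to drop categories `Cc` and `Cf` while keeping tabs and newlines are assumptions.

```python
import unicodedata

# Control characters (Cc) and format characters (Cf) are treated as
# invisible; tab and newline are kept because they are usually intended.
_DROP = {"Cc", "Cf"}
_KEEP = {"\t", "\n"}


def strip_invisible(text: str) -> str:
    """Remove control and format characters from text (hypothetical helper)."""
    return "".join(
        ch
        for ch in text
        if ch in _KEEP or unicodedata.category(ch) not in _DROP
    )


# U+FEFF hidden in the middle of an otherwise ordinary string.
cleaned = strip_invisible("he is a man of\ufeff god")
```

Dropping all of `Cf` also removes joiners such as U+200D that some scripts and emoji sequences rely on, so a narrower list may be preferable for multilingual input.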

invalid byte 2 of 2-byte UTF-8 sequence

别来无恙 submitted on 2020-01-09 19:17:31
Question: I am trying to parse an XML file with <?version = 1.0, encoding = UTF-8> but ran into the error message "invalid byte 2 of 2-byte UTF-8 sequence". Does anybody know what caused this problem? Answer 1: Most commonly it's due to feeding in ISO-8859-x (Latin-x, like Latin-1) while the parser thinks it is getting UTF-8. Certain sequences of Latin-1 characters (two consecutive characters with accents or umlauts) form something that is invalid as UTF-8, and specifically such that based on first byte, second
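
The failure mode described in the answer can be reproduced in a few lines. In Latin-1, "Ü" is the single byte 0xDC, which UTF-8 reads as the lead byte of a 2-byte sequence; the byte that follows must then be a continuation byte (0x80 to 0xBF), and an ordinary letter is not. The Python sketch below shows this, plus one possible repair. The helper name `to_utf8` and the assumption that the real source encoding is Latin-1 are illustrative.

```python
def to_utf8(raw: bytes, source_encoding: str = "latin-1") -> bytes:
    """Return raw re-encoded as UTF-8 if it is not already valid UTF-8.

    source_encoding is an assumption about where the bytes came from.
    """
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


# "Übung" in Latin-1: 0xDC followed by the ASCII letter 'b'.
latin1_bytes = "Übung".encode("latin-1")

try:
    latin1_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    # 0xDC announces a 2-byte sequence, but 'b' is not a continuation byte.
    decoded_ok = False

fixed = to_utf8(latin1_bytes)
```

The alternative repair is to leave the bytes alone and correct the declared encoding in the XML declaration so the parser decodes them as Latin-1.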

PowerShell out-file: prevent encoding changes

我们两清 submitted on 2020-01-09 13:09:30
Question: I'm currently working on a search-and-replace operation that I'm trying to automate using PowerShell. Unfortunately, I realized yesterday that we have different file encodings in our codebase (UTF-8 and ASCII). Because we're doing these search-and-replace operations in a different branch, I can't change the file encodings at this stage. If I run the following lines, it changes all files to UCS-2 Little Endian, even though my default PowerShell encoding is set to iso-8859-1 (Western