windows-1255

How to import from a mixed-encoding file to a PostgreSQL table

让人想犯罪 __ submitted on 2019-12-12 01:53:02
Question: I have a 30 GB text file. The file is encoded as UTF-8, but it also contains some Windows-1252 characters, so when I try to import it I get the following error: ERROR: invalid byte sequence for encoding "UTF8": 0x9b How can I fix this? The file is already in UTF-8 format: when I run the 'file' command on it, it reports the encoding as UTF-8. But it also contains some byte sequences that are not valid UTF-8. For example, when I run the \copy command, after a while it gives the above-mentioned error for
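One way to approach a file like this is to repair it before running \copy. The following is a minimal Python sketch, not something taken from the original question. It assumes the stray bytes really are Windows-1252, and the names `cp1252_fallback`, `repair_line`, and `repair_file` are made up for illustration. Valid UTF-8 sequences pass through untouched; only the bytes the UTF-8 decoder rejects are reinterpreted.

```python
import codecs


def cp1252_fallback(err):
    """Error handler: reinterpret bytes that are not valid UTF-8 as Windows-1252."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # A few bytes (for example 0x81) are undefined in cp1252; those become U+FFFD.
    return bad.decode("cp1252", errors="replace"), err.end


# Register the handler under a hypothetical name so bytes.decode() can use it.
codecs.register_error("cp1252_fallback", cp1252_fallback)


def repair_line(raw: bytes) -> str:
    """Decode one line as UTF-8, falling back to cp1252 for invalid bytes only."""
    return raw.decode("utf-8", errors="cp1252_fallback")


def repair_file(src: str, dst: str) -> None:
    """Stream src line by line so a very large file never has to fit in memory."""
    with open(src, "rb") as fin, open(dst, "w", encoding="utf-8", newline="") as fout:
        for raw in fin:
            fout.write(repair_line(raw))
```

After a call such as `repair_file("dump.txt", "dump.utf8.txt")` (hypothetical file names), the cleaned copy could be loaded with \copy in the usual way. Falling back per byte rather than per line keeps legitimate multi-byte UTF-8 characters intact when they share a line with a stray Windows-1252 byte.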

Converting from Windows-1255 to UTF-8 in Node JS

不问归期 submitted on 2019-12-11 10:27:11
Question: I'm extracting text from a Windows-1255-encoded webpage using Node.js. I'm trying to decode the text using the windows-1255 package. After installing it with NPM and requiring it in the relevant file, I tried using it like this: var title = windows1255.decode('#title').text()); This doesn't seem to have any effect. Any idea why? Thanks! Morgan Answer 1: I don't know if you're still waiting for an answer to this issue, but the following worked for me... When fetching the data (a file), I set the get options

How to convert Windows-1255 To UTF-8 in Classic ASP?

与世无争的帅哥 submitted on 2019-12-11 01:55:59
Question: How can I convert a windows-1255 string to UTF-8 in classic ASP? My database is windows-1255 and I want to move my site to UTF-8. Answer 1: Does the code in this answer do what you need? Answer 2: Are you sure you need to do any conversion? Whilst your database may store the string in a particular encoding, ordinarily ADODB/OLEDB will deliver the string to VBScript/JScript running in an ASP page as Unicode (since the script languages actually only support Unicode, it's actually possible to have any

Why does Perl's LWP give me a different encoding than the original website?

风格不统一 submitted on 2019-12-07 16:08:38
Question: Let's say I have this code: use strict; use LWP qw ( get ); my $content = get ( "http://www.msn.co.il" ); print STDERR $content; The error log shows something like "\xd7\x9c\xd7\x94\xd7\x93\xd7\xa4\xd7\xa1\xd7\x94", which I'm guessing is UTF-16? The website declares its encoding with <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1255">, so why do these characters appear and not the windows-1255 characters? Another weird thing is that I have two servers: the first server returning

Character encoding issue when using Google Apps Script to extract data from web page

回眸只為那壹抹淺笑 submitted on 2019-12-07 12:32:21
Question: I have written a script using Google Apps Script to extract text from a web page into Google Sheets. I only need this script to work with one specific web page, so it does not need to be versatile. The script works almost exactly as I want it to, except that I have run into a character encoding problem. I am extracting both Hebrew and English text. The meta tag in the HTML has charset=Windows-1255. The English extracts perfectly, but the Hebrew displays as black diamonds containing a question

Why does Perl's LWP give me a different encoding than the original website?

廉价感情. submitted on 2019-12-05 20:01:56
Let's say I have this code: use strict; use LWP qw ( get ); my $content = get ( "http://www.msn.co.il" ); print STDERR $content; The error log shows something like "\xd7\x9c\xd7\x94\xd7\x93\xd7\xa4\xd7\xa1\xd7\x94", which I'm guessing is UTF-16? The website declares its encoding with <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1255">, so why do these characters appear and not the windows-1255 characters? Another weird thing is that I have two servers: the first server returns CP1255 characters, which I can simply convert to UTF-8, while the current server gives me these characters and I can
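The escaped bytes quoted from the error log can be checked directly. The short Python sketch below is an illustration added here, not part of the original thread; the variable names are arbitrary. It suggests the bytes are UTF-8 rather than UTF-16: they decode cleanly as UTF-8 into six Hebrew letters, and the same text takes only six bytes in Windows-1255.

```python
# The byte string quoted from the error log in the question.
logged = b"\xd7\x9c\xd7\x94\xd7\x93\xd7\xa4\xd7\xa1\xd7\x94"

# It decodes cleanly as UTF-8: each Hebrew letter takes two bytes.
text = logged.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in text]

# The same text in Windows-1255 uses one byte per letter.
as_cp1255 = text.encode("cp1255")

print(code_points)
print(as_cp1255.hex())
```

All six code points fall in the Hebrew block (U+05D0 to U+05EA), which is consistent with a server or client that has already transcoded the page to UTF-8 somewhere along the way.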

Encoding issues … windows-1255 to UTF-8?

孤者浪人 submitted on 2019-12-01 14:40:56
I know that converting from windows-1255 to UTF-8 has been asked before, but I'm still getting different results and I can't solve it. The first issue is: do PHP's iconv() or mb_convert_encoding() support windows-1255? While testing, I get several different outputs (playing with //IGNORE and //TRANSLIT), but it's not working well at all. I looked at the mb_list_encodings() output and it doesn't include windows-1255... and testing mb_detect_encoding() with a windows-1255 input (crawled from the net) doesn't return the right charset... You should be able to just use strtr with an associative
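The thread is about PHP, so the following is only a point of comparison: a minimal sketch of the same conversion in Python, whose standard library ships a cp1255 codec. It is not the strtr approach mentioned above, and `cp1255_to_utf8` is a hypothetical helper name. The sample bytes are an assumed test input (the Hebrew word "shalom"), not data from the question.

```python
import codecs


def cp1255_to_utf8(raw: bytes) -> bytes:
    """Transcode Windows-1255 bytes to UTF-8 (hypothetical helper)."""
    # errors="strict" raises on bytes cp1255 leaves undefined instead of guessing.
    return raw.decode("cp1255", errors="strict").encode("utf-8")


# Python resolves the IANA-style name "windows-1255" to its own cp1255 codec.
codec_name = codecs.lookup("windows-1255").name

sample = bytes.fromhex("f9ece5ed")  # assumed input: "shalom" in Windows-1255
converted = cp1255_to_utf8(sample)

print(codec_name)
print(converted.hex())
```

Decoding with an explicitly named source encoding avoids relying on detection, which the question reports as unreliable for this charset.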

Encoding issues … windows-1255 to UTF-8?

可紊 submitted on 2019-12-01 12:24:07
Question: I know that converting from windows-1255 to UTF-8 has been asked before, but I'm still getting different results and I can't solve it. The first issue is: do PHP's iconv() or mb_convert_encoding() support windows-1255? While testing, I get several different outputs (playing with //IGNORE and //TRANSLIT), but it's not working well at all. I looked at the mb_list_encodings() output and it doesn't include windows-1255... and testing mb_detect_encoding() with a windows-1255 input (crawled