character-encoding

Problems reading a CSV correctly due to UnicodeDecodeError in Python 3

空扰寡人 submitted on 2020-05-17 14:44:50
Question: I create a CSV file in which I put some song lyrics, using this:

```python
with io.open('songs.csv', 'a+', encoding='utf-8') as file:
    writer = csv.writer(file, dialect='excel')
    writer.writerow(input_row)
```

The CSV (opened with Excel) looks quite strange - I don't know how to upload files here, so please excuse the pic. As you can see, the delimiters for the CSV are commas (the columns should be Artist, Album, Title, Lyric). I noticed that I had some Spanish and Italian lyrics, and characters like
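A likely cause of the garbled view: Excel guesses a legacy code page when a UTF-8 CSV carries no byte-order mark. A minimal sketch of the usual fix, reusing the question's `songs.csv` name (the sample row is invented for illustration):

```python
import csv
import io

# Excel often guesses a legacy code page for a BOM-less UTF-8 CSV, which
# garbles accented characters. Writing with 'utf-8-sig' prepends a BOM so
# Excel detects UTF-8. Note: mode 'w' here - appending ('a+') with
# 'utf-8-sig' would emit a BOM on every open, corrupting the file.
with io.open('songs.csv', 'w', encoding='utf-8-sig', newline='') as f:
    writer = csv.writer(f, dialect='excel')
    writer.writerow(['Artist', 'Album', 'Title', 'Lyric'])
    writer.writerow(['Mecano', 'Descanso Dominical', 'Hijo de la Luna',
                     'Tonto el que no entienda...'])
```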

PostgreSQL psycopg2 Python3.7.4 UnicodeDecodeError: 'ascii' codec can't decode byte

吃可爱长大的小学妹 submitted on 2020-05-17 05:54:12
Question: I'm trying to query a PostgreSQL database with ANSI drivers, but for some queries it fails with the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xfd in position 10: ordinal not in range(128)

Here is the function that sets up the connection and runs the query:

```python
import psycopg2
import pandas as pd

def query_cdk_database(query):
    conn = psycopg2.connect(host="some_host", port=xxx, database="xxx",
                            user="xxxx", password="xxx", client_encoding='auto')
    cur = conn.cursor()
    cur
```
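The traceback means the driver is decoding server bytes as ASCII, and byte 0xfd lies outside ASCII's 0-127 range. A minimal demonstration of that failure mode; the psycopg2 calls in the comments (`client_encoding` in `connect()` and `conn.set_client_encoding()`) are real API, but which encoding to pick depends on what the database actually stores, so `latin1` here is an assumption:

```python
# Byte 0xfd is invalid in ASCII but maps to 'ý' in Latin-1/Windows-1252,
# a typical sign the database stores text in a single-byte legacy encoding.
raw = b'Compa\xfd'
try:
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    print(exc)                 # 'ascii' codec can't decode byte 0xfd ...
print(raw.decode('latin-1'))   # Compaý

# Possible psycopg2 fix (needs a live server, hence commented out):
#   conn = psycopg2.connect(..., client_encoding='latin1')  # or 'utf8'
#   conn.set_client_encoding('LATIN1')
```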

What is CharsetDecoder.decode(ByteBuffer, CharBuffer, endOfInput)

烈酒焚心 submitted on 2020-05-14 19:56:06
Question: I have a problem with the CharsetDecoder class. First example of code (which works):

```java
final CharsetDecoder dec = Charset.forName("UTF-8").newDecoder();
final ByteBuffer b = ByteBuffer.allocate(3);
final byte[] tab = new byte[]{(byte) -30, (byte) -126, (byte) -84}; // char €
for (int i = 0; i < tab.length; i++) {
    b.put(tab, i, 1);
}
try {
    b.flip();
    System.out.println("a" + dec.decode(b).toString() + "a");
} catch (CharacterCodingException e1) {
    e1.printStackTrace();
}
```

The result is a€a. But when I execute
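The three-argument `decode(in, out, endOfInput)` exists for streaming: `endOfInput = false` tells the decoder more bytes may follow, so an incomplete multi-byte sequence at the end of the buffer is held back instead of being reported as malformed. Python's incremental decoder follows the same contract, which this sketch uses to show the behavior with the same € bytes as the question:

```python
import codecs

# Feed the three UTF-8 bytes of '€' in two chunks. With final=False
# (Java's endOfInput=false) the partial sequence is buffered, not rejected.
dec = codecs.getincrementaldecoder('utf-8')()
euro = '\u20ac'.encode('utf-8')                 # b'\xe2\x82\xac'
print(repr(dec.decode(euro[:2], final=False)))  # '' - bytes buffered
print(dec.decode(euro[2:], final=True))         # € - sequence completed
```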

Laravel localization to German and special letters

折月煮酒 submitted on 2020-04-17 15:14:24
Question: I have a problem with some German special letters (ö, ü, ...) in my Laravel application. My encoding is set to UTF-8. Everything works fine with content from the database (which uses utf8_general_ci). When I hardcode some text in Blade view files, that's fine too. But I'm using localization files (/app/lang/de/myFile.php) with an associative array, and the German characters from that array are displayed as � � �. What is strange: when I var_dump(trans('myFile.key')) in Blade, the special
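The � replacement character usually means bytes stored in a legacy encoding are being decoded as UTF-8, which points at the localization file itself being saved in the wrong encoding (the usual fix is re-saving myFile.php as UTF-8). A small Python sketch of the mechanism:

```python
# 'ö' saved by an editor as Latin-1 is the single byte 0xf6. That byte is
# not valid UTF-8 on its own, so a lenient decoder substitutes U+FFFD ('�').
latin1_bytes = '\u00f6'.encode('latin-1')              # b'\xf6'
print(latin1_bytes.decode('utf-8', errors='replace'))  # �
print(latin1_bytes.decode('latin-1'))                  # ö
```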

Java - Count exactly 60 characters from a string with a mixture of UTF-8 and non-UTF-8 characters

落花浮王杯 submitted on 2020-04-11 06:01:26
Question: I have a string which I want to save in a database that only supports UTF-8 characters. If the string is longer than 60 characters I want to truncate it and store only the first 60. The Oracle database in use only supports UTF-8 characters. Using String.substring(0, 59) in Java returns 60 characters, but when I save the result the database rejects it, claiming the string is > 60 characters. Is there a way to find out if a particular string contains non-UTF-8 characters?
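The mismatch suggests the Oracle column limit is measured in bytes (as with VARCHAR2(60 BYTE)) while substring counts UTF-16 chars, so 60 characters with accents can exceed 60 encoded bytes. A hedged sketch of byte-aware truncation (Python; the function name is mine):

```python
def truncate_utf8(s: str, max_bytes: int) -> str:
    """Trim s so its UTF-8 encoding fits max_bytes without splitting a char."""
    encoded = s.encode('utf-8')
    if len(encoded) <= max_bytes:
        return s
    # Cut at the byte limit; errors='ignore' drops any trailing partial
    # multi-byte sequence left behind by the cut.
    return encoded[:max_bytes].decode('utf-8', errors='ignore')

print(len(truncate_utf8('é' * 40, 60)))   # 30 - each 'é' is 2 bytes in UTF-8
print(truncate_utf8('aé', 2))             # a  - the cut landed mid-'é'
```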

How to get the Unicode code point for a character in Javascript?

柔情痞子 submitted on 2020-04-08 10:19:37
Question: I'm using a barcode scanner to read barcodes on my website (the website is made in OpenUI5). The scanner works like a keyboard that types the characters it reads. At the beginning and the end of the input it types a special character. These characters differ for every type of scanner; some possibilities are: █ ▄ – — In my code I use if (oModelScanner.oData.scanning && oEvent.key == "\u2584") to check whether the input from the scanner is ▄. Is there any way to get the code from that
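In JavaScript the standard lookup is String.prototype.codePointAt (e.g. oEvent.key.codePointAt(0)). For consistency with the other sketches in this digest, the check is shown below in Python, using the same U+2584 character as the question:

```python
# '▄' is LOWER HALF BLOCK, U+2584 - the same value compared against
# "\u2584" in the question's if-statement.
ch = '▄'
cp = ord(ch)
print(hex(cp))          # 0x2584
print(f'\\u{cp:04x}')   # \u2584
```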

Converting String from One Charset to Another

£可爱£侵袭症+ submitted on 2020-03-18 03:02:05
Question: I am working on converting a string from one charset to another. I have read many examples and finally found the code below, which looks good to me. As a newbie to charset encoding, I want to know whether it is the right way to do it.

```java
public static byte[] transcodeField(byte[] source, Charset from, Charset to) {
    return new String(source, from).getBytes(to);
}
```

To convert a String from ASCII to EBCDIC, I have to do:

System.out.println(new String(transcodeField(ebytes, Charset.forName("US-ASCII"),
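The decode-then-encode pattern above is sound as long as every character in the source exists in the target charset. The same round trip in Python, using cp500 as the EBCDIC code page (the question doesn't say which EBCDIC variant is needed, so cp500 is an assumption):

```python
def transcode_field(source: bytes, from_enc: str, to_enc: str) -> bytes:
    # Same shape as the Java helper: decode with the source charset,
    # then re-encode with the target one.
    return source.decode(from_enc).encode(to_enc)

ebcdic = transcode_field(b'HELLO', 'ascii', 'cp500')
print(ebcdic.hex())                               # c8c5d3d3d6
print(transcode_field(ebcdic, 'cp500', 'ascii'))  # b'HELLO'
```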

Remove all hexadecimal characters before loading string into XML Document Object?

Deadly submitted on 2020-02-26 11:57:08
Question: I have an XML string that is posted to an .ashx handler on the server. The XML string is built on the client side from a few different entries made on a form. Occasionally users copy and paste content from other sources into the web form. When I try to load the XML string into an XmlDocument object using xmldoc.LoadXml(xmlStr), I get the following exception:

System.Xml.XmlException = {"'', hexadecimal value 0x0B, is an invalid character. Line 2, position 1."}

In debug mode I
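XML 1.0 permits only #x9, #xA, and #xD among the C0 control characters, so 0x0B has to be removed (or the input rejected) before the document is parsed. A sketch of the stripping step in Python; in C# the same character class works with Regex.Replace before LoadXml:

```python
import re
import xml.etree.ElementTree as ET

# C0 controls disallowed by XML 1.0; tab/newline/carriage-return stay legal.
_INVALID_XML_CHARS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')

def sanitize_for_xml(text: str) -> str:
    return _INVALID_XML_CHARS.sub('', text)

dirty = '<note>pasted\x0btext</note>'
root = ET.fromstring(sanitize_for_xml(dirty))
print(root.text)    # pastedtext
```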