character-encoding

Is there any reason not to use UTF-8, 16, etc. for everything?

柔情痞子 submitted on 2020-01-03 08:14:11
Question: I know the web has mostly been standardizing on UTF-8 lately, and I was wondering whether there is any place where using UTF-8 would be a bad thing. I've heard the argument that UTF-8, UTF-16, etc. may use more space, but in the end the difference has been negligible. Also, what about Windows programs, the Linux shell, and things of that nature: can you safely use UTF-8 there?

Answer 1: If UTF-32 is available, prefer that over the other versions for processing. If your platform supports UTF-32/UCS-4 Unicode natively
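The space trade-off mentioned in the question is easy to measure. A minimal sketch (plain Python; the sample strings are illustrative only) comparing how many bytes the same text occupies in each encoding:

for text in ("hello", "grüße", "日本語"):
    sizes = {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(text, sizes)

# ASCII-heavy text is smallest in UTF-8, CJK text is often smaller in UTF-16,
# and UTF-32 always spends four bytes per code point.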

Encoding errors with StringIO and read_csv pandas

荒凉一梦 submitted on 2020-01-03 05:29:08
Question: I am using an API to get some data. The data returned is Unicode text (not a dictionary / JSON object).

Get the data:

data = []
for urls in api_call_list:
    data.append(requests.get(urls))

The data looks like this:

>>> data[0].text
u'Country;Celebrity;Song Volume;CPP;Index\r\nus;Taylor Swift;33100;0.83;0.20\r\n'
>>> data[1].text
u'Country;Celebrity;Song Volume;CPP;Index\r\nus;Rihanna;28100;0.76;0.33\r\n'

I use this code to convert this to a dataframe:

from io import StringIO
import pandas as pd
pd
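A minimal sketch of the conversion step under Python 3 (the literal text below is copied from the first response shown above; note that on Python 2, io.StringIO accepts only unicode input, a common source of the encoding errors mentioned in the title):

from io import StringIO
import pandas as pd

text = u'Country;Celebrity;Song Volume;CPP;Index\r\nus;Taylor Swift;33100;0.83;0.20\r\n'
# read_csv accepts any file-like object, so wrap the text in StringIO
# and pass the non-default semicolon separator.
df = pd.read_csv(StringIO(text), sep=';')
print(df)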

Before Action on Import from CSV

老子叫甜甜 submitted on 2020-01-03 05:00:07
Question: I have a simple CSV import where the provided file is already broken (UTF characters, German). For example, the list has G%C3%B6tterbote where the right name should be Götterbote. I'm trying to force the encoding when importing the CSV.

My import action:

def import
  Player.import(params[:file])
  redirect_to players_path, notice: "Players Imported successfully"
end

My import method:

def self.import(file)
  SmarterCSV.process(file.path) do |row|
    Player.create(row.first)
  end
end

I found out that this
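Note that G%C3%B6tterbote is not an encoding error so much as URL percent-encoding: %C3%B6 is the UTF-8 byte pair for ö. A minimal sketch of the decoding step (shown in Python for illustration; in Ruby, CGI.unescape does the same job):

from urllib.parse import unquote

# unquote turns the percent escapes back into bytes and decodes them as UTF-8.
print(unquote("G%C3%B6tterbote"))  # -> Götterbote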

Why is the table's index storage size bigger after changing the charset from utf8mb4 to utf8?

房东的猫 submitted on 2020-01-03 04:50:10
Question: I executed:

alter table device_msg convert to character set 'utf8' COLLATE 'utf8_unicode_ci';

As I expected, the table data size became smaller. But at the same time, the table index size became bigger. What happened, and why?

PS: table data size and index size are calculated from information_schema.TABLES. DB engine: InnoDB.

Table before:

CREATE TABLE `device_msg` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `sn` varchar(30) COLLATE utf8_unicode_ci NOT NULL,
  `time` datetime(3) NOT NULL,
  `msg` json NOT
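A minimal sketch of the measurement the question relies on (Python with the mysql-connector-python package; host, credentials, and schema name here are hypothetical), reading data_length and index_length from information_schema.TABLES before and after the ALTER:

import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root",
                               password="secret", database="mydb")
cur = conn.cursor()
cur.execute(
    "SELECT table_name, data_length, index_length "
    "FROM information_schema.TABLES "
    "WHERE table_schema = %s AND table_name = %s",
    ("mydb", "device_msg"),
)
for name, data_len, idx_len in cur.fetchall():
    print(name, data_len, idx_len)  # both values are reported in bytes
conn.close()

Note that for InnoDB these figures are estimates derived from sampled statistics, so before/after comparisons around a rebuild are approximate.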

utf8_encode does not produce the right result

我是研究僧i submitted on 2020-01-03 04:34:49
Question: My problem is the following: I store an array which has keys like "e", "f", etc. At some point I have to get the value for a key. This works well. But if I want to store "í", "é", etc. as the keys, it won't produce the right result (it results in �). My page has to be in UTF-8. Looking into the problem, I found that utf8_encode should help. It didn't: although it produced a more readable character, it still differed entirely from what I want. If it matters, phpinfo gives: Directive
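PHP's utf8_encode assumes its input is ISO-8859-1, so feeding it text that is already UTF-8 double-encodes it, which matches the "more readable but still wrong" symptom. A minimal sketch of that failure mode (shown in Python for illustration):

# 'é' encoded as UTF-8 is the byte pair 0xC3 0xA9. Re-interpreting those
# bytes as ISO-8859-1 and encoding to UTF-8 again (what utf8_encode does
# to already-UTF-8 input) produces mojibake instead of the original text.
raw = "é".encode("utf-8")                       # b'\xc3\xa9'
double = raw.decode("latin-1").encode("utf-8")  # b'\xc3\x83\xc2\xa9'
print(double.decode("utf-8"))                   # -> Ã©, not é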

Can nginx re-encode XML documents, or alter XML headers?

耗尽温柔 submitted on 2020-01-03 03:35:07
Question: I have a problem ultimately caused by a third-party XML document whose actual encoding (ISO 8859-1 or Windows-1252; I can't tell which) doesn't match its declared encoding (UTF-8). I'm looking for creative workarounds. We already use nginx proxies for various content, so perhaps there is a way to either:

Re-encode the document contents on the fly from ISO 8859-1 to UTF-8; or
Alter the document header on the fly, from UTF-8 to ISO 8859-1.

Are either of these possible with nginx? If not, a similar tool
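A minimal sketch of the first workaround as it would look outside nginx (plain Python; the byte string is illustrative). Decoding as Windows-1252 also covers the ISO 8859-1 case here, since the two differ only in the 0x80-0x9F range:

# Bytes from the third party: the declaration says UTF-8, the content is not.
body = b'<?xml version="1.0" encoding="UTF-8"?><name>G\xf6tterbote</name>'

# Re-encode so the payload finally matches its declared encoding.
fixed = body.decode("cp1252").encode("utf-8")
print(fixed.decode("utf-8"))  # the lone 0xF6 byte is now a proper UTF-8 'ö'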

Spring Boot MVC wrongly encoded POST request

旧街凉风 submitted on 2020-01-03 03:32:14
Question: I cannot get request encoding to work correctly. To make the encoding work, I added a filter to Spring Security:

@Bean
public CharacterEncodingFilter characterEncodingFilter() {
    CharacterEncodingFilter filter = new CharacterEncodingFilter();
    filter.setEncoding("UTF-8");
    filter.setForceEncoding(true);
    return filter;
}

@Override
protected void configure(HttpSecurity http) throws Exception {
    http.addFilterBefore(characterEncodingFilter(), CsrfFilter.class);
    ...
}

I added a meta tag to my pages: <html xmlns=
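The usual failure mode behind such a filter: the browser sends the form body as UTF-8 bytes, but the servlet container decodes request parameters with its default charset (often ISO-8859-1), producing mojibake. A minimal sketch of the mismatch (plain Python for illustration):

# The browser posts 'héllo' as UTF-8 bytes in the request body.
body = "héllo".encode("utf-8")      # b'h\xc3\xa9llo'

print(body.decode("iso-8859-1"))    # hÃ©llo  <- wrong default decoding
print(body.decode("utf-8"))         # héllo   <- what forcing UTF-8 restores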

Why is unicode included as an Encoding in Swift's String API?

大兔子大兔子 submitted on 2020-01-03 03:17:06
Question: I read this very informative blog post about string encodings. After reading it, I realized that Unicode is a standard that maps characters to code points, which are integers. How those integers are stored in memory is an entirely different concept; this is where .utf8, .utf16, etc. come into play, defining the way we store these integers in memory. In the Swift String API there is a method which gives us the data bytes used to represent the String in various encodings:

func data(using encoding:
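A minimal sketch of the code-point versus encoding distinction described above (plain Python for illustration; Swift's .utf8 and .utf16 views expose the same idea):

ch = "é"
print(ord(ch))                 # 233: the Unicode code point, a single integer
print(ch.encode("utf-8"))      # b'\xc3\xa9': two bytes under UTF-8
print(ch.encode("utf-16-le"))  # b'\xe9\x00': two bytes under UTF-16 (little-endian)
# Same code point, different byte layouts: the encoding is a storage choice.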

VB.NET serial port character encoding

做~自己de王妃 submitted on 2020-01-03 03:03:49
Question: I'm working with VB.NET and I need to send and receive bytes of data over serial. This is all well and good until I need to send something like 173. I'm using a subroutine to send a byte; it takes an integer as input and just converts it to a character to write it:

Private Sub PrintByte(ByVal input As Integer)
    serialPort.Write(Chr(input))
End Sub

If I try PrintByte(173), or really anything above 127, it sends 63. I thought that was a bit odd, so I looked up the ASCII table and it appears 63
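The 63 is the byte value of '?': .NET's SerialPort writes strings through an ASCII encoder by default, and every character above 127 gets replaced by a question mark. A minimal sketch of that substitution (shown in Python for illustration; on the VB.NET side the fix is to write raw bytes instead of characters):

# Character 173 has no 7-bit ASCII representation, so an ASCII encoder with
# 'replace' semantics substitutes '?' (byte 63) -- the value seen on the wire.
print(chr(173).encode("ascii", errors="replace"))  # b'?'
print(ord("?"))                                    # 63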