utf-8

File.listFiles() crashes for invalid UTF-8 characters

爱⌒轻易说出口 submitted on 2020-02-24 05:07:25
Question: A folder in my phone's storage contains files with names like these: dzG럫saᡑῑ.sg 존Ὣ 졼).sg When I try to read the files in this folder with File.listFiles(), my app crashes: JNI DETECTED ERROR IN APPLICATION: input is not valid Modified UTF-8: illegal start byte ...... string: 'dzG럫saᡑῑ.sg' I found out which app creates them, but that doesn't matter: if other users have similar files in their phone memory, I can't just ask them to remove them. I just want to keep the app from crashing. Even
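For background on the error message, here is a minimal sketch of how Modified UTF-8 (the encoding JNI expects) differs from standard UTF-8. It is not a fix for the crash, and it assumes the offending names contain characters outside the Basic Multilingual Plane, which the excerpt does not confirm. The class name ModifiedUtf8Demo and its helpers are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
final class ModifiedUtf8Demo {

    // Encodes the string as standard UTF-8.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes the string as Modified UTF-8 via DataOutputStream.writeUTF,
    // then drops the 2-byte length prefix that writeUTF puts in front.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buffer.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDE00"; // U+1F600, a character outside the BMP
        System.out.println("standard: " + hex(standardUtf8(s)));
        System.out.println("modified: " + hex(modifiedUtf8(s)));
    }
}
```

Standard UTF-8 encodes such a character as one 4-byte sequence starting with 0xF0, while Modified UTF-8 encodes each surrogate separately as 3 bytes. A 4-byte start byte is therefore one way for a name to be valid UTF-8 yet rejected as "not valid Modified UTF-8".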

Gmail API not respecting UTF encoding in subject

二次信任 submitted on 2020-02-23 12:24:37
Question: In an app I'm helping develop, we've added the ability for a user to invite other users, personalize the invitation email, and then send it via Gmail's APIs. I'm encoding it using base64 as the docs state, and the emails we send are formatted properly, since they reach the recipients correctly. This works well for US users who type in English, but there were some reports from users who sent emails with non-ASCII characters (e.g. in Hebrew) having their emails garbled when sent. I
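The excerpt is cut off before the asker's code, so the cause cannot be confirmed from it. One common source of garbled subjects is a non-ASCII Subject header written as raw text, where mail headers expect an RFC 2047 encoded-word. A minimal sketch, assuming that is the problem here; SubjectEncoder and encodeSubject are made-up names, not part of the Gmail API:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the Gmail API or the original question.
final class SubjectEncoder {

    // Returns the subject unchanged if it is pure ASCII; otherwise wraps its
    // UTF-8 bytes in an RFC 2047 "B" (base64) encoded-word.
    // Sketch only: RFC 2047 limits each encoded-word to 75 characters, so a
    // long subject would need to be split into several words; this does not.
    static String encodeSubject(String subject) {
        if (StandardCharsets.US_ASCII.newEncoder().canEncode(subject)) {
            return subject;
        }
        byte[] utf8 = subject.getBytes(StandardCharsets.UTF_8);
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(utf8) + "?=";
    }

    public static void main(String[] args) {
        // "\u05E9\u05DC\u05D5\u05DD" is the Hebrew word "shalom".
        System.out.println("Subject: " + encodeSubject("\u05E9\u05DC\u05D5\u05DD"));
        System.out.println("Subject: " + encodeSubject("Plain ASCII invite"));
    }
}
```

The encoded header line would then go into the MIME message before the whole message is base64-encoded for sending.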

keep only alphanumeric characters and space in a string using gsub

烈酒焚心 submitted on 2020-02-23 09:15:42
Question: I have a string which has alphanumeric characters, special characters and non-UTF-8 characters. I want to strip the special and non-UTF-8 characters. Here's what I've tried: gsub('[^0-9a-z\\s]','',"�+ Sample string here =�{�>E�BH�P<]�{�>") However, this removes the special characters (punctuation and non-UTF-8), but the output has no spaces. gsub('/[^0-9a-z\\s]/i','',"�+ Sample string here =�{�>E�BH�P<]�{�>") The result has spaces, but there are still non-UTF-8

read.csv() with UTF-8 encoding [duplicate]

我们两清 submitted on 2020-02-20 08:01:05
Question: This question already has answers here: Cannot read unicode .csv into R (3 answers). Closed 2 years ago. I am trying to read in data from a csv file and specify the encoding of the characters to be UTF-8. From reading through the ?read.csv() instructions, it seems that setting fileEncoding to UTF-8 should accomplish this; however, I am not seeing that when checking. Is there a better way to specify the encoding of character strings to be UTF-8 when importing the data? Sample Data: Download

Unicode character Visual C++

≯℡__Kan透↙ submitted on 2020-02-15 06:40:47
Question: I'm trying to make my program work with Unicode characters. I'm using Visual Studio 2010 on a 32-bit Windows 7 machine. What I want to print is the queen symbol ("\u2655"), and it just doesn't work. I've set my solution to use Unicode. This is my sample code: #include <iostream> using namespace std; int main() { SetConsoleOutputCP(CP_UTF8); wcout << L"\u2655"; return 0; } Also, I've tried many other suggestions, but nothing worked (e.g. change the cmd font, apply chcp 65001, which is the same as

Java equivalent to JavaScript's encodeURIComponent that produces identical output?

北城余情 submitted on 2020-02-08 21:42:48
Question: I've been experimenting with various bits of Java code, trying to come up with something that will encode a string containing quotes, spaces and "exotic" Unicode characters and produce output that's identical to JavaScript's encodeURIComponent function. My torture test string is: "A" B ± " If I enter the following JavaScript statement in Firebug: encodeURIComponent('"A" B ± "'); then I get: "%22A%22%20B%20%C2%B1%20%22" Here's my little test Java program: import java.io
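One commonly used approach, sketched here under the assumption that the input contains no unpaired surrogates (JavaScript throws a URIError for those, while Java substitutes a replacement byte), is to start from URLEncoder and undo the places where its output differs from encodeURIComponent. The class name UriComponentEncoder is made up for illustration, and the excerpt does not show which approach the asker settled on.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not taken from the original question.
final class UriComponentEncoder {

    // URLEncoder produces application/x-www-form-urlencoded output, which
    // differs from encodeURIComponent in two ways: a space becomes "+"
    // instead of "%20", and the characters ! ' ( ) ~ are percent-encoded
    // although encodeURIComponent leaves them as-is.
    static String encodeURIComponent(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8")
                    .replace("+", "%20")   // a literal "+" is already "%2B" here
                    .replace("%21", "!")
                    .replace("%27", "'")
                    .replace("%28", "(")
                    .replace("%29", ")")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        // The torture test string from the question; \u00B1 is the ± sign.
        System.out.println(encodeURIComponent("\"A\" B \u00B1 \""));
    }
}
```

For the torture test string this yields %22A%22%20B%20%C2%B1%20%22, which matches the Firebug output quoted in the question.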
