encoding

Hex characters in varchar() are actually ASCII. Need to decode them

Deadly submitted on 2020-01-03 11:46:50
Question: This is such an edge case of a question that I'd be surprised if there is an easy way to do this. I have an MS SQL DB with a field of type varchar(255). It contains a hex string which is actually a GUID when you decode it using an ASCII decoder. I know that sounds REALLY weird, but here's an example: The contents of the field: "38353334373838622D393030302D343732392D383436622D383161336634396339663931" What it actually represents: "8534788b-9000-4729-846b-81a3f49c9f91" I need a way to decode this,
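Outside the database, the decode is a straightforward hex-to-ASCII conversion. A minimal Python sketch, using the field value quoted in the question:

```python
# The varchar field holds the hex encoding of the GUID's ASCII characters,
# so every two hex digits decode to one character of the GUID string.
def hex_to_guid(s: str) -> str:
    return bytes.fromhex(s).decode("ascii")

value = "38353334373838622D393030302D343732392D383436622D383161336634396339663931"
print(hex_to_guid(value))  # 8534788b-9000-4729-846b-81a3f49c9f91
```

`bytes.fromhex` accepts both upper- and lowercase hex digits, so the uppercase `2D` separators in the stored value need no special handling.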

How to encode Java files in UTF-8 using Apache Ant?

不羁岁月 submitted on 2020-01-03 10:56:12
Question: In my build.xml file I fetch some Java files via CXF. Some of these Java files need to be encoded in UTF-8. How can I use Ant to change their encoding to UTF-8? PS: I found instructions for setting the encoding for javac to UTF-8, but prior to javac I need the Java files themselves to be in UTF-8; otherwise I get an error: warning: unmappable character for encoding utf-8 Here is my code: <macrodef name="lpwservice"> <attribute name="name"/> <attribute name="package"/> <sequential> <property name="wsdlfile"
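Ant's `<copy>` task has `encoding` and `outputencoding` attributes that can transcode files during the copy. As a language-agnostic sketch, the same re-encoding step in Python; the source encoding (`cp1252`) and the file used in the demo are assumptions, so adjust them to whatever CXF actually produced:

```python
import tempfile
from pathlib import Path

SOURCE_ENCODING = "cp1252"  # assumption: the encoding the generator wrote

def reencode(path: Path, source_encoding: str = SOURCE_ENCODING) -> None:
    # Read with the old encoding, rewrite the same file as UTF-8.
    text = path.read_text(encoding=source_encoding)
    path.write_text(text, encoding="utf-8")

# Demo on a throwaway file containing a non-ASCII comment.
with tempfile.TemporaryDirectory() as d:
    f = Path(d) / "Example.java"
    f.write_bytes("// grüße".encode("cp1252"))
    reencode(f)
    print(f.read_bytes().decode("utf-8"))  # // grüße
```

In a real build you would run `reencode` over every generated file, e.g. `for p in Path("generated-src").rglob("*.java"): reencode(p)` (directory name is made up).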

DBF - encoding cp1250

扶醉桌前 submitted on 2020-01-03 10:23:12
Question: I have a dbf database encoded in cp1250 and I am reading it using the following code: import csv from dbfpy import dbf import os import sys filename = sys.argv[1] if filename.endswith('.dbf'): print "Converting %s to csv" % filename csv_fn = filename[:-4]+ ".csv" with open(csv_fn,'wb') as csvfile: in_db = dbf.Dbf(filename) out_csv = csv.writer(csvfile) names = [] for field in in_db.header.fields: names.append(field.name) #out_csv.writerow(names) for rec in in_db: out_csv.writerow(rec
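The step the code above is missing is decoding the raw cp1250 bytes before writing them out. A small Python 3 sketch with a made-up record value:

```python
# dbfpy hands back raw byte strings; decode each cp1250 field to text
# before handing the row to the CSV writer.  The record below is made up.
def decode_record(fields, encoding="cp1250"):
    return [f.decode(encoding) if isinstance(f, bytes) else f for f in fields]

record = [b"Dvo\xf8\xe1k", 42]   # raw cp1250 bytes as stored in the DBF
row = decode_record(record)
print(row)                        # ['Dvořák', 42]
```

Opening the output file with `open(csv_fn, 'w', newline='', encoding='utf-8')` (Python 3) then keeps the resulting CSV in UTF-8.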

UTF-8 encode URLs

帅比萌擦擦* submitted on 2020-01-03 09:20:31
Question: Info: I have a program which generates XML sitemaps for Google Webmaster Tools (among other things). GWT is giving me errors for some sitemaps because the URLs contain character sequences like ã¾, ã‹, ã€, etc. GWT says: We require your Sitemap file to be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters: & , ' , " , < , > . The special characters are escaped in the XML
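One way to satisfy both requirements, sketched in Python: percent-encode the URL's non-ASCII characters as UTF-8 bytes, then entity-escape the XML-special characters before writing the `<loc>` value. The URL is a made-up example:

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

url = "http://example.com/tags/日本語?q=a&b=c"   # hypothetical sitemap URL

# Percent-encode non-ASCII as UTF-8, keeping URL syntax characters intact...
encoded = quote(url, safe=":/?&=")
# ...then entity-escape &, <, > for the XML document.
loc = escape(encoded)
print(loc)  # http://example.com/tags/%E6%97%A5%E6%9C%AC%E8%AA%9E?q=a&amp;b=c
```

`quote` defaults to UTF-8, which matches the sitemap requirement; `escape` handles the entity-escaping GWT asks for.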

Hash keys encoding: Why do I get here with Devel::Peek::Dump two different results?

别说谁变了你拦得住时间么 submitted on 2020-01-03 07:08:11
Question: Why do I get two different results here with Devel::Peek::Dump? #!/usr/bin/env perl use warnings; use 5.014; use utf8; binmode STDOUT, ':encoding(utf-8)'; use Devel::Peek; my %hash1 = ( 'müller' => 1 ); say Dump $_ for keys %hash1; my %hash2; $hash2{'müller'} = 1; say Dump $_ for keys %hash2; Output: SV = PV(0x753270) at 0x76d230 REFCNT = 2 FLAGS = (POK,pPOK,UTF8) PV = 0x759750 "m\303\274ller"\0 [UTF8 "m\x{fc}ller"] CUR = 7 LEN = 8 SV = PV(0x753270) at 0x7d75a8 REFCNT = 2 FLAGS = (POK,FAKE

Check if UTF-8 character requires maximum three bytes

拜拜、爱过 submitted on 2020-01-03 06:32:28
Question: I need to save user input to a database column with utf8_general_ci encoding, which allows at most three bytes per code point. But if the input contains characters that use four bytes (for example emoji), it is not saved into the column. What I need is to check that the input only contains characters that use at most three bytes. I know I could just change the column encoding to utf8mb4, but I don't want to. So how can I do something like this: if (maxThreeBytes("😄")) { //return
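A character needs four UTF-8 bytes exactly when its code point lies above U+FFFF, so the check reduces to a code-point comparison. A Python sketch of the hypothetical maxThreeBytes from the question:

```python
# MySQL's legacy utf8 charset stores at most 3 bytes per code point, i.e.
# only the Basic Multilingual Plane (U+0000..U+FFFF).  Anything above that,
# such as most emoji, needs 4 UTF-8 bytes and would be rejected.
def max_three_bytes(s: str) -> bool:
    return all(ord(ch) <= 0xFFFF for ch in s)

print(max_three_bytes("héllo"))  # True
print(max_three_bytes("😄"))     # False: U+1F604 encodes to 4 bytes
```

An equivalent formulation is `all(len(ch.encode("utf-8")) <= 3 for ch in s)`; the code-point comparison just avoids the intermediate encode.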

Efficient small byte-arrays in C#

只愿长相守 submitted on 2020-01-03 05:46:04
Question: I have a huge collection of very small objects. To ensure the data is stored compactly, I rewrote the class to store all information in a byte array with variable-byte encoding. Most instances of these millions of objects need only 3 to 7 bytes to store all their data. After memory profiling I found that these byte arrays always take at least 32 bytes. Is there a way to store the information more compactly than bit-fiddled into a byte[]? Would it be better to point to an unmanaged
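For reference, the variable-byte encoding mentioned above can be sketched as a little-endian base-128 varint (7 payload bits per byte, high bit as continuation flag); this illustrates the encoding itself, not the .NET array-overhead issue:

```python
# Little-endian base-128 varint: low 7 bits first, high bit set on every
# byte except the last.  Small values take 1-2 bytes instead of 4-8.
def encode_varint(n: int) -> bytes:
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)

def decode_varint(data: bytes) -> int:
    n, shift = 0, 0
    for b in data:
        n |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return n

print(encode_varint(300).hex())    # ac02
print(decode_varint(b"\xac\x02"))  # 300
```

The 32-byte floor the question describes comes from per-object and per-array overhead in the runtime, not from the encoding, which is why packing many records into one large shared buffer is the usual workaround.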

How to correctly get Unicode text input from QPlainTextEdit? [duplicate]

半城伤御伤魂 submitted on 2020-01-03 04:54:10
Question: This question already has answers here: UnicodeEncodeError: 'charmap' codec can't encode characters (6 answers) Python 'ascii' codec can't encode character with request.get (1 answer) Closed last year. Just running the application, I get the correct results in the QPlainTextEdit area on the screen. But when clicking the Start Simulation button and retrieving the input from it with QPlainTextEdit.toPlainText(), the output becomes invalid: def handle_first_input_text(self): textEdit = self
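The linked duplicates point at the usual cause: the Unicode text retrieved from the widget is later pushed through a codec ('ascii' or 'charmap') that cannot represent it. A Python sketch of the failure and the fix, with a stand-in string instead of a real Qt widget:

```python
text = "Müller 日本語"   # stand-in for QPlainTextEdit.toPlainText() output

# What a narrow default codec does with non-ASCII text:
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii codec fails:", exc.reason)

# Encoding explicitly as UTF-8 round-trips without loss.
print(text.encode("utf-8").decode("utf-8"))  # Müller 日本語
```

The practical rule is to keep the text as a Unicode string inside the program and pass an explicit `encoding="utf-8"` wherever it crosses an I/O boundary (files, sockets, subprocesses).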

XML encoding issue

爱⌒轻易说出口 submitted on 2020-01-03 04:16:08
Question: I want to know whether there is a quick way to find whether an XML document is correctly encoded in UTF-8 and does not contain any characters that are not allowed in XML with UTF-8 encoding. <?xml version="1.0" encoding="utf-8"?> Thanks in advance, George EDIT1: Here is the content of my XML file, in both text form and binary form: http://tinypic.com/view.php?pic=2r2akvr&s=5 I have tried tools like xmlstarlet to check; the result is correct (invalid because of out-of-range UTF-8), but
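A quick self-contained check, sketched in Python: the bytes must decode as UTF-8, and every decoded code point must fall inside the XML 1.0 Char production (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]):

```python
def xml_utf8_ok(data: bytes) -> bool:
    # Step 1: the bytes must be well-formed UTF-8.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return False

    # Step 2: every code point must be in the XML 1.0 Char production.
    def allowed(cp: int) -> bool:
        return (cp in (0x9, 0xA, 0xD)
                or 0x20 <= cp <= 0xD7FF
                or 0xE000 <= cp <= 0xFFFD
                or 0x10000 <= cp <= 0x10FFFF)

    return all(allowed(ord(ch)) for ch in text)

print(xml_utf8_ok("<a>ok</a>".encode("utf-8")))  # True
print(xml_utf8_ok(b"<a>\x00</a>"))               # False: NUL not allowed
print(xml_utf8_ok(b"\xff\xfe<a/>"))              # False: not valid UTF-8
```

This only checks the encoding and character set, not well-formedness of the markup itself; a parser like xmlstarlet still covers the latter.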

What's the browser's behavior when encoding URLs?

和自甴很熟 submitted on 2020-01-03 03:30:33
Question: I'm running a test of how Firefox encodes characters, but the result confused me. HTML code: <html lang="zh_CN"> <head> <title>some Chinese character</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> </head> <body> <img src="http://localhost/xxx" /> </body> The xxx is some Chinese characters. These characters must be encoded into a format like %xx to be transported over HTTP. First, I save the source file in UTF-8 and use Firefox to open the HTML file. The img tag will send a
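What the browser does for a UTF-8 page can be reproduced in Python: each non-ASCII character in the URL is percent-encoded as its UTF-8 byte sequence:

```python
from urllib.parse import quote, unquote

char = "中"
encoded = quote(char)     # percent-encodes the UTF-8 bytes E4 B8 AD
print(encoded)            # %E4%B8%AD
print(unquote(encoded))   # 中
```

If the page were saved in a legacy encoding such as GBK instead, the browser would typically percent-encode the GBK bytes for the path, which is why the page's declared charset changes what the server receives.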