How does “cut and paste” affect character encoding and what can go wrong?

旧巷老猫 submitted on 2019-12-17 15:49:09

Question


I have a document A in encoding A displayed in tool A and a document B in encoding B displayed in tool B. If I cut and paste (part of) B into A what might be the resultant character encoding? I realise this depends on tool A and tool B and the information held in the paste buffer (which presumably can contain an encoding?) and the operating system.

What should high-quality tools do, and in practice how many of the common tools (e.g. Word, TextPad, various IDEs) do a good job?


Answer 1:


First of all, a text editor's internal representation of text has no bearing on how the text is encoded (serialized) when you save the file. So a document is not "in" an encoding; it's a sequence of abstract characters. When the document is saved to a file (or transmitted over the network) then it gets encoded.
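To make that distinction concrete, here is a minimal Python sketch (the sample string and variable names are just for illustration). The same in-memory string is serialized three ways, and decoding the bytes with the wrong codec shows one way things can go wrong:

```python
# One abstract sequence of characters held in memory.
text = "café €5"

# Encoding only happens when the text is serialized.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
cp1252 = text.encode("cp1252")

# Same characters, three different byte sequences of different lengths.
print(len(text), len(utf8), len(utf16), len(cp1252))

# Decoding with the matching codec always recovers the original text.
round_trips = [
    utf8.decode("utf-8"),
    utf16.decode("utf-16-le"),
    cp1252.decode("cp1252"),
]

# Decoding UTF-8 bytes as Windows-1252 does not fail here, it silently
# produces different characters (mojibake).
mojibake = utf8.decode("cp1252")
print(mojibake)
```

The bytes on disk only mean something once you know which encoding produced them; the string in the editor has no such ambiguity.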

It's up to each application to decide what it puts on the clipboard. Typically, a Windows app that knows what it's doing will put several different representations of the same data on the clipboard. When you paste in the other app, that app will look for the representation that best suits its needs.

In your case, a text editor (that knows what it's doing) will put a Unicode representation of a selected string onto the clipboard (where Unicode, in Windows, is typically moved around as UTF-16, but that's not important). When you paste in the other app, it will insert that sequence of Unicode characters into the document at the selection point.
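The negotiation can be modelled roughly like this. It is a simulation in plain Python, not a call into the real clipboard API: the dictionary stands in for the clipboard, `copy_to_clipboard` and `paste_from_clipboard` are hypothetical helper names, and Windows-1252 is assumed as the legacy "ANSI" code page. `CF_UNICODETEXT` and `CF_TEXT` are the names of the actual Windows text formats.

```python
# Assumed mapping from clipboard format name to the codec used for it.
CODECS = {
    "CF_UNICODETEXT": "utf-16-le",  # Unicode text, UTF-16 on Windows
    "CF_TEXT": "cp1252",            # legacy text, code page assumed here
}


def copy_to_clipboard(text):
    """Source app: offer the selection in several representations."""
    return {
        "CF_UNICODETEXT": text.encode(CODECS["CF_UNICODETEXT"]),
        # Characters the code page cannot represent become '?'.
        "CF_TEXT": text.encode(CODECS["CF_TEXT"], errors="replace"),
    }


def paste_from_clipboard(clipboard, preferred):
    """Target app: take the first format it understands, in its own order."""
    for fmt in preferred:
        if fmt in clipboard:
            return clipboard[fmt].decode(CODECS[fmt])
    raise LookupError("no usable text format on the clipboard")


clipboard = copy_to_clipboard("naïve 日本")

# A Unicode-aware editor gets the characters back unchanged.
unicode_paste = paste_from_clipboard(clipboard, ["CF_UNICODETEXT", "CF_TEXT"])

# A legacy app that only asks for CF_TEXT loses what the code page lacks.
legacy_paste = paste_from_clipboard(clipboard, ["CF_TEXT"])

print(unicode_paste)
print(legacy_paste)
```

In this model the encoding of document A or document B never enters the picture. Loss only occurs when one side restricts itself to a representation that cannot hold all the characters.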

There's an app floating around called "ClipSpy" that will help you see what I'm talking about, interactively.



Source: https://stackoverflow.com/questions/1929812/how-does-cut-and-paste-affect-character-encoding-and-what-can-go-wrong
