ASP.NET special character problem

Submitted by 痴心易碎 on 2019-12-14 04:22:26

Question


I'm building an automated RSS feed in ASP.NET and occurrences of apostrophes and hyphens are rendering very strangely:

"Here's a test" is rendering as "Here’s a test"

I have managed to circumvent a similar problem with the pound sign (£) by escaping the ampersand and building the HTML escape for £ manually, as shown in the extract below:

sArticleSummary = sArticleSummary.Replace("£", "&amp;pound;")

But the following attempt fails to resolve the apostrophe issue; we still get ’ on the screen.

sArticleSummary = sArticleSummary.Replace("’", "&amp;rsquo;")

The string in the database (SQL Server 2005) appears, for all intents and purposes, to be plain text. Can anyone advise why what seem to be plain-text strings keep coming out in this manner? Any ideas on how to resolve the apostrophe issue would be appreciated.

Thanks for your help.

[EDIT]

Further to Vladimir's help, it now looks as though the problem is that, somewhere between the database and the point where it is loaded into the string variable, the data is being converted from an apostrophe to ’. Has anyone seen this happen before, or does anyone have any pointers?

Thanks


Answer 1:


I would guess that the column in your SQL Server 2005 database is defined as varchar(N), char(N) or text. If so, the conversion is due to the database driver using a different code page from the one set in the database.

I would recommend changing this column (and any others that may contain non-ASCII data) to nvarchar(N), nchar(N) or nvarchar(max) respectively, which can then contain any Unicode code point, not just those defined by the code page.

All of my databases now use nvarchar/nchar exclusively to avoid these types of encoding issues. The Unicode fields use twice as much storage space, but there'll be very little performance difference if you use this technique (the SQL engine uses Unicode internally).
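
To illustrate the general mechanism behind a code page mismatch, here is a minimal sketch. It does not reproduce what the SQL Server driver does internally; it only shows what happens when text is encoded with one encoding and decoded with another. The class and method names are hypothetical, and ISO-8859-1 stands in for whichever single-byte code page is actually involved.

```csharp
using System;
using System.Text;

static class CodePageDemo
{
    // Hypothetical helper: encodes text as UTF-8, then decodes the bytes
    // with a single-byte code page (ISO-8859-1 is used as a stand-in).
    public static string DecodeUtf8AsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    static void Main()
    {
        // U+2019 is the curly apostrophe that word processors insert.
        string original = "Here\u2019s a test";
        string misdecoded = DecodeUtf8AsLatin1(original);

        // U+2019 takes three bytes in UTF-8, so the single-byte decoder
        // turns one character into three unrelated ones.
        Console.WriteLine(original.Length);   // 13
        Console.WriteLine(misdecoded.Length); // 15

        // Decoding with the encoding that was actually used round-trips cleanly.
        byte[] bytes = Encoding.UTF8.GetBytes(original);
        Console.WriteLine(Encoding.UTF8.GetString(bytes) == original); // True
    }
}
```

The point of the sketch is that the stored text is not damaged by itself; the damage appears when the writer and the reader disagree about the encoding, which is why keeping the data in Unicode end to end avoids the problem.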




Answer 2:


It transpires that the data (whilst appearing plain in SQL Server) actually contains some MS Word special characters.
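
If the feed only needs plain ASCII punctuation, one option is to normalize the Word characters before writing the summary. The following is a sketch under that assumption; `SmartPunctuation.ToPlainAscii` is a hypothetical helper, and the list of characters it maps is a common subset rather than everything Word can produce.

```csharp
using System;
using System.Text;

static class SmartPunctuation
{
    // Hypothetical helper: maps common Word "smart" punctuation to ASCII.
    public static string ToPlainAscii(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            switch (c)
            {
                case '\u2018': // left single quotation mark
                case '\u2019': // right single quotation mark (curly apostrophe)
                    sb.Append('\'');
                    break;
                case '\u201C': // left double quotation mark
                case '\u201D': // right double quotation mark
                    sb.Append('"');
                    break;
                case '\u2013': // en dash
                case '\u2014': // em dash
                    sb.Append('-');
                    break;
                case '\u2026': // horizontal ellipsis
                    sb.Append("...");
                    break;
                default:
                    sb.Append(c);
                    break;
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Prints: Here's a test - done
        Console.WriteLine(ToPlainAscii("Here\u2019s a test \u2013 done"));
    }
}
```

This changes the text rather than fixing the encoding, so it is only appropriate when losing the typographic characters is acceptable.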




Answer 3:


Assuming you get Unicode characters from the database, the easiest way is to let System.Xml.dll take care of the conversion for you by building the RSS feed with an XmlDocument object. (I'm not sure about the elements found in an RSS feed.)

        // Requires: using System.Xml;
        XmlDocument rss = new XmlDocument();
        rss.LoadXml("<?xml version='1.0'?><rss />");
        // InnerText escapes markup characters and keeps Unicode text intact.
        XmlElement element = (XmlElement)rss.DocumentElement.AppendChild(rss.CreateElement("item"));
        element.InnerText = sArticleSummary;

or with LINQ to XML (System.Xml.Linq):

        XDocument rss = new XDocument(
            new XElement("rss",
                new XElement("item", sArticleSummary)
            )
        );
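
Building the document is only half of the job; the bytes sent to the client also need a known encoding. Below is a sketch of serializing the LINQ to XML document as UTF-8 so that the XML declaration and the actual bytes agree. `FeedWriter.BuildFeed` is a hypothetical helper, and the `channel` and `description` elements are assumptions about the feed layout rather than something taken from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

static class FeedWriter
{
    // Hypothetical helper: returns the feed as UTF-8 bytes (no byte order mark).
    public static byte[] BuildFeed(string sArticleSummary)
    {
        var rss = new XDocument(
            new XElement("rss", new XAttribute("version", "2.0"),
                new XElement("channel",
                    new XElement("item",
                        new XElement("description", sArticleSummary)))));

        var settings = new XmlWriterSettings
        {
            Encoding = new UTF8Encoding(false), // UTF-8 without a BOM
            Indent = true
        };

        using (var stream = new MemoryStream())
        {
            // The writer emits an XML declaration that matches its encoding.
            using (var writer = XmlWriter.Create(stream, settings))
            {
                rss.Save(writer);
            }
            return stream.ToArray();
        }
    }

    static void Main()
    {
        byte[] feed = BuildFeed("Here\u2019s a test");
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Encoding.UTF8.GetString(feed));
    }
}
```

In an ASP.NET page the same bytes would be written to the response, and the response's declared character set would need to be UTF-8 as well; otherwise the reader may still decode the feed with the wrong code page.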



Answer 4:


I would just put "Here's a test" into a CDATA section. Easy and it works.

<![CDATA[Here's a test]]>
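
If the feed is generated with LINQ to XML rather than by string concatenation, a CDATA section can be produced with `XCData`. This is a small sketch; the `description` element name and the `CDataDemo` class are assumptions made for illustration.

```csharp
using System;
using System.Xml.Linq;

static class CDataDemo
{
    // Hypothetical helper: wraps the text in a CDATA section inside <description>.
    public static XElement Describe(string text)
    {
        return new XElement("description", new XCData(text));
    }

    static void Main()
    {
        // Markup characters inside CDATA are written as-is, not escaped.
        Console.WriteLine(Describe("Here's a <b>test</b>").ToString());
    }
}
```

Note that CDATA only stops markup characters such as `<` and `&` from being escaped. It does not change how the characters are encoded as bytes, so the feed still has to be written and declared in a consistent encoding.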


Source: https://stackoverflow.com/questions/1686391/asp-net-special-character-problem
