html-entities

Jsoup having problems with special HTML symbols, ‘ — etc

喜夏-厌秋 submitted on 2019-12-04 07:53:56
I have some HTML (String) that I am putting through Jsoup just so I can add something to all href and src attributes, and that works fine. However, I'm noticing that for some special HTML characters, Jsoup is converting them from, say, &ldquo; to the actual character “. I output the value before and after and I see that change. Before: THIS &mdash; IS A &ldquo;TEST&rdquo;. 5 > 4. trademark: ™ After: THIS — IS A “TEST”. 5 > 4. trademark: ? What the heck is going on? I was specifically converting those special characters to their HTML entities before any Jsoup stuff to avoid this. The quotes changed to the actual quote
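An HTML parser decodes character references into real characters while parsing, so getting entities back is a matter of how the result is serialized. The sketch below is not Jsoup; it only illustrates the re-encoding step with Python's standard library, and `to_entities` is a hypothetical helper name.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with a named entity,
    falling back to a decimal reference when no name is known."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                       # plain ASCII stays as-is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # e.g. 8220 -> &ldquo;
        else:
            out.append(f"&#{cp};")               # numeric fallback
    return "".join(out)


print(to_entities("THIS \u2014 IS A \u201cTEST\u201d. trademark: \u2122"))
```

ASCII characters such as `>` are left untouched here, so this only covers the "special symbol" half of the problem.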

Unicode characters or encoded entities [duplicate]

别等时光非礼了梦想. submitted on 2019-12-04 06:48:45
Question: This question already has answers here: HTML and character encoding vs HTML Entity (3 answers). Closed 5 years ago. I'm using some special characters like × (&times;) or … (&hellip;) in my HTML pages. In some places I use the Unicode character directly, and in others I use an encoded entity like &hellip . I want to tidy up my code and can't decide which notation is better. I could find just two pros and cons: using character directly I can set text in javascript using text method like $("#button").text(
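As a small illustration of why the two notations are interchangeable once the markup is parsed, the sketch below (Python standard library only, not the jQuery code from the question) decodes the named entities and compares them with the literal characters.

```python
import html
from html.entities import name2codepoint

# Each named entity decodes to exactly the same character as typing it directly.
for name in ("times", "hellip"):
    entity = f"&{name};"
    literal = chr(name2codepoint[name])
    print(entity, "->", literal, html.unescape(entity) == literal)
```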

Detect whether HTML element contains a specific character entity

走远了吗. submitted on 2019-12-04 06:41:08
问题 If I have markup like this: <div id="foo"></div> and I want to detect later whether div#foo still contains that same character entity, I'd like to be able to do so by comparing it to  rather than to  (which in my code base is rather obtuse for maintenance purposes). I've tried things like this (using jQuery): console.log($('<textarea />').html($('#foo').html()).val()); But that seems to still output the nice little square "what you talkin' 'bout" character rather than the desired  . I'm
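One way to avoid pasting an unprintable character into source code is to compare by code point instead. This is a rough sketch in Python rather than jQuery; the code point U+F067 is a made-up placeholder (the excerpt does not show the actual entity) and `contains_code_point` is a hypothetical helper.

```python
import html

PLACEHOLDER_CODE_POINT = 0xF067  # hypothetical value, stands in for the real entity


def contains_code_point(text: str, code_point: int) -> bool:
    """Return True if the decoded text contains the given Unicode code point."""
    return chr(code_point) in text


# Element content as a parser would hand it back: the reference is already decoded.
decoded = html.unescape("&#xF067;")
print(contains_code_point(decoded, PLACEHOLDER_CODE_POINT))
```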

Is it okay to use HTML entities in attributes?

不打扰是莪最后的温柔 submitted on 2019-12-04 06:33:08
I have been using slim, and suddenly noticed that it escapes everything by default. So the anchor tag looks something like this: <a href="/users/lyann/followers"> <img class="user-image" src="http://adasdasdasd.cloudfront.net/users&#47;2011/05/24/4asdasd/asdasd.jpg" /> Is it okay for the href and src attributes to be escaped like this? Are there any other implications? All browsers seem to render it without a problem, though. Yes, it's perfectly fine. Character references are valid inside attributes, too, and will be treated as character references just the same. For reference, see: A
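To see character references being decoded inside an attribute value, here is a minimal sketch using Python's `html.parser`. The `AttrCollector` class is a hypothetical name, and the markup is a shortened version of the anchor from the question with `&#47;` in place of each slash.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collects the attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # The parser passes attribute values with character references already decoded.
        for name, value in attrs:
            self.attrs[name] = value


collector = AttrCollector()
collector.feed('<a href="&#47;users&#47;lyann&#47;followers">')
print(collector.attrs["href"])
```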

Why HTML decimal and HTML hex?

倖福魔咒の submitted on 2019-12-04 05:39:23
I have tried to Google quite a while now for an answer why HTML entities can be compiled either in HTML decimal or HTML hex. So my questions are: What is the difference between HTML decimal and HTML hex? Why are there two systems to do the same thing? Originally, HTML was nominally based on SGML, which has decimal character references only. Later, the hexadecimal alternative was added in HTML 4.01 (and soon implemented in browsers), then retrofitted into SGML in the Web Adaptations Annex . The apparent main reason for adding the hexadecimal alternative was that all modern character code and
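A quick check, using Python's standard `html` module, that a decimal and a hexadecimal reference name the same character; the em dash (U+2014) is just an arbitrary example.

```python
import html

decimal_ref = "&#8212;"   # code point written in base 10
hex_ref = "&#x2014;"      # the same code point written in base 16

# Both notations decode to the identical character.
print(html.unescape(decimal_ref) == html.unescape(hex_ref))
print(int("2014", 16))    # the hex digits converted to decimal
```

The hexadecimal form lines up with the usual U+XXXX notation in character code charts, which makes references easier to look up.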

Convert special chars to HTML entities, without changing tags and parameters

隐身守侯 submitted on 2019-12-04 02:33:19
Question: I'm using the FreeTextBox editor to get some HTML created by users. The problem is that this editor does not convert special characters to HTML entities, with the exception of "<>". I cannot use theHTML = Server.HtmlEncode(theHTML) , because it converts all the HTML including tags and parameters, and I don't want to create an endless list of theHTML.Replace lines. Is there any other function or method available to convert to HTML entities but only outside tags? Answer 1: If you've got a mixture of <
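One possible approach, sketched here in Python rather than .NET, relies on the assumption that tag and attribute names are pure ASCII: encoding only the non-ASCII characters as numeric references then leaves the markup itself untouched. `encode_non_ascii` is a hypothetical helper name.

```python
def encode_non_ascii(markup: str) -> str:
    """Replace every non-ASCII character with a decimal character reference,
    leaving ASCII markup such as <, >, quotes and tag names unchanged."""
    return markup.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(encode_non_ascii('<p class="x">caf\u00e9 \u00a9 2011</p>'))
```

Non-ASCII characters inside attribute values are encoded too, which is still valid since character references are allowed there.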

Sanitizing PHP/SQL $_POST, $_GET, etc…?

半腔热情 submitted on 2019-12-04 01:52:40
Question: OK, this subject is a hotbed, I understand that. I also understand that the answer depends on the code you are using. I have three situations that need to be resolved. I have a form where we need to allow people to make comments and statements that use commas, tildes, etc. but still remain safe from attacks. I have people entering dates like this: 10/13/11 (mm/dd/yy in English); can this be sanitized? How do I understand how to use htmlspecialchars() , htmlentities() and

Rspec testing for html entities in page content

人走茶凉 submitted on 2019-12-04 00:59:42
I'm writing a request spec and would like to test for the presence of the string "Reports » Aging Reports". I get an error (invalid multibyte char) if I put the character in my matcher expression directly, so I tried this: page.should have_content("Reports &raquo; Aging Reports") This fails the test with the message: expected there to be content "Reports &raquo; Aging Reports" in "\n Reports » Aging Reports\n I've tried things like .html_safe with no success. Is there a way to test for text containing html entities? Edit: Here's the relevant area of the html source: <a href="/reports">Reports</a> » <a

Encoding issue, converting & to &amp; for html using php

那年仲夏 submitted on 2019-12-03 23:52:59
I have a URL in HTML: <a href="index.php?q=event&id=56&date=128"> I need to turn it into a string exactly as: <a href="index.php?q=event&amp;id=56&amp;date=128"> I know how to do this with preg_replace etc, but is there a function in PHP that deals directly with encoding that I can use for other encoding issues such as &nsbp (or whatever it is, etc)? Ideally I would send my string into the function and it would output '&' instead of &amp. Is there a universal function for converting &TEXT; into an actual character? Edit: sorry, posted this before I finished typing the question. QUESTION is now
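The two directions in this question (escaping `&` for use in HTML, and turning an entity back into a character) can be sketched with Python's standard `html` module. This is an illustration of the idea, not PHP code; PHP offers `htmlspecialchars()` and `html_entity_decode()` for the same two jobs.

```python
import html

url = "index.php?q=event&id=56&date=128"

# Escape for embedding in an HTML attribute: & becomes &amp;
escaped = html.escape(url, quote=True)
print(escaped)

# Decode any character reference back into the actual character.
print(html.unescape(escaped) == url)
print(repr(html.unescape("&nbsp;")))  # the non-breaking space, U+00A0
```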

Why doesn't Twitter and Google API documentation encode ampersands in URLs?

最后都变了- submitted on 2019-12-03 23:36:33
I have read I should encode my ampersands as &amp; in HTML. However, numerous code samples from respected companies somehow forget to do this. Just a few examples off the top of my head: Google Web Fonts sample code: <link href='http://fonts.googleapis.com/css?family=PT+Sans&subset=latin,cyrillic' rel='stylesheet' type='text/css'> Google Maps documentation: <script type="text/javascript" src="http://maps.googleapis.com/maps/api/js?sensor=false&language=ja"> Twitter Anywhere official tutorial: <script src="http://platform.twitter.com/anywhere.js?id=YOUR_API_KEY&v=1" type="text/javascript"></script>