html-entities

Parsing XML With Single Quotes?

Submitted by 元气小坏坏 on 2019-12-25 00:27:49
Question: I am currently running into a problem where an element is coming back from my XML file with a single quote in it. This causes xml_parse to break it up into multiple chunks. For example, "Get Wired, You're Hired!" is then interpreted as 'Get Wired, You' being one object, the single quote being a second, and 're Hired!' being a third. What I want to do is:

    while ($data = fread($fp, 4096)) {
        if (!xml_parse($xml_parser, htmlentities($data, ENT_QUOTES), feof($fp))) {
            break;
        }
    }

But that keeps breaking. I
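
The excerpt stops before any answer, so the following is only a hedged sketch of one likely cause, written in Python rather than PHP. Streaming parsers built on expat may deliver the text of a single element in several callbacks, typically splitting around a character reference such as &apos;. Passing the whole chunk through htmlentities() would also escape the < and > of the markup itself, so the parser would no longer see any tags. The helper name element_text and the sample document are illustrative assumptions, not part of the original question.

```python
import xml.parsers.expat


def element_text(xml_source, tag):
    """Return the text of the first <tag> element, joining every chunk."""
    chunks = []   # character-data pieces collected for the current element
    texts = []    # finished element texts
    depth = 0     # > 0 while we are inside a matching element

    def start(name, attrs):
        nonlocal depth
        if name == tag:
            depth += 1

    def char_data(data):
        # The parser may call this several times for one run of text,
        # so append instead of treating each call as a complete value.
        if depth:
            chunks.append(data)

    def end(name):
        nonlocal depth
        if name == tag:
            depth -= 1
            if depth == 0:
                texts.append("".join(chunks))
                chunks.clear()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = char_data
    parser.EndElementHandler = end
    parser.Parse(xml_source, True)
    return texts[0] if texts else None


# Hypothetical sample: the quote is written as a character reference.
sample = "<jobs><title>Get Wired, You&apos;re Hired!</title></jobs>"
title = element_text(sample, "title")
print(title)
```

The design point is to buffer character data until the closing tag arrives, rather than to escape the input before parsing.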

How do I find the string index of a tag (an element) without counting expanded entities?

Submitted by 时间秒杀一切 on 2019-12-24 13:42:17
Question: I've got a large piece of text which I want to be able to select, storing the selected part by its startindex and endindex. (For example, selecting "or" in "word" would give me startindex 1 and endindex 2.) This all works properly, but I've got a problem with HTML entities such as &amp; (the ampersand). I've created a little case that reproduces the issue. You can see in the fiddle below that the startIndex inflates if you select anything beyond the &, because it doesn't count the &amp; as a single
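
The mismatch described here comes from counting positions in the markup source, where &amp; occupies five characters, instead of in the rendered text, where it occupies one. The sketch below, in Python rather than the JavaScript of the original fiddle, converts a source offset into a rendered-text offset. The function name rendered_index and the sample string are assumptions for illustration, and the regular expression only recognises well-formed references that end in a semicolon.

```python
import html
import re

# Matches decimal, hexadecimal and named references that end with ";".
ENTITY = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def rendered_index(source, source_index):
    """Map an offset in the markup source to an offset in the rendered text."""
    rendered = 0
    pos = 0
    while pos < source_index:
        match = ENTITY.match(source, pos)
        if match:
            # Count the reference by the length of what it expands to.
            rendered += len(html.unescape(match.group()))
            pos = match.end()
        else:
            rendered += 1
            pos += 1
    return rendered


source = "Tom &amp; Jerry"
start = rendered_index(source, source.index("Jerry"))
print(start)
```

Here "Jerry" starts at offset 10 in the source but at offset 6 in the rendered text "Tom & Jerry".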

ImportError: No module named html.entities

Submitted by 戏子无情 on 2019-12-24 12:46:32
Question: I am new to Python. I am using Python 2.7.5. I want to write a web crawler. For that I have installed BeautifulSoup 4.3.2. I installed it using this command (I haven't used pip):

    python setup.py install

I am using Eclipse 4.2 with PyDev installed. When I try to import this library in my script with

    from bs4 import BeautifulSoup

I get this error:

    ImportError: No module named html.entities

Please explain what I should do to fix it.

Answer 1: Is there any reason why you are not using pip
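
For background on the error message itself: html.entities is the Python 3 name of the module that Python 2 ships as htmlentitydefs, so this ImportError usually means that code written for Python 3 is being run by a Python 2 interpreter. The snippet below is a minimal sketch of an import that works on either version. It illustrates the naming difference only and is not a fix for the BeautifulSoup installation.

```python
import sys

try:
    # Python 3 location of the entity tables
    from html.entities import name2codepoint
except ImportError:
    # Python 2 location of the same tables
    from htmlentitydefs import name2codepoint

# Show which interpreter is running and look up one entity.
print(sys.version_info[0])
print(name2codepoint["amp"])
```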

PHP user input data security

Submitted by 和自甴很熟 on 2019-12-24 12:34:01
Question: I am trying to figure out which functions are best to use in different cases when inputting data, as well as when outputting data. When I allow a user to input data into MySQL, what is the best way to secure the data to prevent SQL injection or any other type of injection or hack someone could attempt? When I output the data from the database as regular HTML, what is the best way to do this so that scripts and the like cannot be run? At the moment I basically only use mysql_real_escape_string();
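
The two concerns in this question are usually handled separately: parameterised queries on the way into the database, and HTML escaping on the way out to the page. The sketch below shows that split in Python with the standard-library sqlite3 and html modules, as a stand-in for the PHP and MySQL setup in the question. The table name, column names and sample values are made up for illustration.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, body TEXT)")

# Hostile-looking input, stored exactly as the user typed it.
author = "Robert'); DROP TABLE comments;--"
body = "<script>alert('hi')</script>"

# Input side: placeholders keep the data out of the SQL text entirely.
conn.execute("INSERT INTO comments (author, body) VALUES (?, ?)", (author, body))

stored_author, stored_body = conn.execute(
    "SELECT author, body FROM comments"
).fetchone()

# Output side: escape at the moment the value is written into HTML.
safe_html = "<p>{}</p>".format(html.escape(stored_body, quote=True))
print(safe_html)
```

Storing the raw value and escaping only at output time keeps the database usable for non-HTML consumers as well.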

Scrape using Beautiful Soup preserving &nbsp; entities

Submitted by 人盡茶涼 on 2019-12-23 07:48:53
Question: I would like to scrape a table from the web and keep the &nbsp; entities intact so that I can republish it as HTML later. BeautifulSoup seems to be converting these to spaces, though. Example:

    from bs4 import BeautifulSoup

    html = "<html><body><table><tr>"
    html += "<td>&nbsp;hello&nbsp;</td>"
    html += "</tr></table></body></html>"
    soup = BeautifulSoup(html)
    table = soup.find_all('table')[0]
    row = table.find_all('tr')[0]
    cell = row.find_all('td')[0]
    print cell

Observed result: <td> hello </td>

Required result: <td
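
What looks like a plain space in the observed result is normally the character U+00A0 (no-break space), which is what an HTML parser produces when it decodes &nbsp;. The standard-library sketch below shows the decoding step and one way to turn the character back into the entity when writing HTML out again. It deliberately avoids Beautiful Soup so that it runs without extra packages, and the helper name restore_nbsp is an assumption.

```python
import html


def restore_nbsp(markup):
    """Write every no-break space character back out as the &nbsp; entity."""
    return markup.replace("\u00a0", "&nbsp;")


# Decoding the entity yields U+00A0, not an ordinary space.
decoded_cell = "<td>" + html.unescape("&nbsp;hello&nbsp;") + "</td>"
restored = restore_nbsp(decoded_cell)

print(repr(decoded_cell))
print(restored)
```

Beautiful Soup's documentation also describes output formatters, for example passing formatter="html" to prettify() or decode(), which are intended to convert such characters back to named entities on output. That behaviour should be checked against the installed version.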

Bullet “•” in XML

Submitted by 爱⌒轻易说出口 on 2019-12-23 03:09:07
Question: Similar to this question, I am consuming an XML product that has some illegal characters in it. I seriously doubt I can get them to fix the problem, but I will try. In the meantime I'd like a workaround. The problem is that it contains a bullet. It renders as "•" in my source. I've tried a few encoding conversions but have not found a combination that works. (I'm not accustomed to even thinking about my encoding type, so I'm out of my element here.) So, I tried the below and it seems that str

How to convert & characters to HTML characters?

Submitted by 江枫思渺然 on 2019-12-22 10:25:10
Question:

    &lt;?php echo &quot;Hello World!&quot;; ?&gt;

should be:

    <?php echo "Hello World!"; ?>

How do I do that in PHP?

Answer 1: You need one of these: html_entity_decode() or htmlspecialchars_decode() (both are documented in the PHP Manual). The main difference is that html_entity_decode() will translate all the HTML entities in your string (&lt; becomes <, &aacute; becomes á, etc.), while htmlspecialchars_decode() only translates some special HTML entities. The converted entities are: &amp;, &quot;
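
The distinction drawn in the answer, decoding every entity versus decoding only the markup-significant ones, can be sketched in Python as well. html.unescape is the standard-library function that decodes all named and numeric references. decode_special_only is a hypothetical helper written for this illustration and is not a port of PHP's htmlspecialchars_decode().

```python
import html

encoded = "&lt;?php echo &quot;Hello World!&quot;; ?&gt;"

# Decode every entity, named or numeric.
decoded_all = html.unescape(encoded)


def decode_special_only(text):
    """Decode only &lt; &gt; &quot; and &amp;, leaving other entities alone."""
    # &amp; goes last so that "&amp;lt;" becomes "&lt;" and not "<".
    replacements = [("&lt;", "<"), ("&gt;", ">"), ("&quot;", '"'), ("&amp;", "&")]
    for entity, char in replacements:
        text = text.replace(entity, char)
    return text


print(decoded_all)
print(decode_special_only("caf&eacute; &amp; bar"))
print(html.unescape("caf&eacute; &amp; bar"))
```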

Why does the PHP function htmlentities(…) return wrong results?

Submitted by 青春壹個敷衍的年華 on 2019-12-22 08:19:13
Question: I have the following code:

    function testAccents() {
        $str = "àéè";
        $html = htmlentities($str);
        echo $html;
    }

When I run it, instead of getting &agrave;&eacute;&egrave; I get &Atilde;&nbsp;&Atilde;&copy;&Atilde;&uml;. I thought that it could be an encoding problem, but the file is UTF-8:

    > file -bi PublicationTest.php
    text/x-c++; charset=utf-8

Why do I get this strange result? EDIT: I use PHP 5.3.

Answer 1: Before PHP 5.4.0, htmlentities() expects ISO-8859-1 data by default. It's interpreting your UTF-8 input as single-byte characters, which results
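
The effect described in the answer can be reproduced outside PHP. The Python sketch below encodes the string as UTF-8, misreads the bytes as ISO-8859-1, and then converts each character to a named entity. The helper to_named_entities is an assumption written for this illustration and only approximates what htmlentities() does.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace every character that has a named HTML entity with that entity."""
    return "".join(
        "&{};".format(codepoint2name[ord(ch)]) if ord(ch) in codepoint2name else ch
        for ch in text
    )


original = "àéè"

# Each accented letter is two bytes in UTF-8. Reading those bytes as
# ISO-8859-1 turns every letter into two unrelated characters.
misread = original.encode("utf-8").decode("iso-8859-1")

expected = to_named_entities(original)
garbled = to_named_entities(misread)

print(expected)
print(garbled)
```

In PHP the usual remedy is to pass the encoding explicitly, for example htmlentities($str, ENT_QUOTES, 'UTF-8').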

Why HTML decimal and HTML hex?

Submitted by 泪湿孤枕 on 2019-12-21 12:18:16
Question: I have been searching Google for quite a while now for an answer to why HTML entities can be written either in HTML decimal or HTML hex. So my questions are: What is the difference between HTML decimal and HTML hex? Why are there two systems to do the same thing?

Answer 1: Originally, HTML was nominally based on SGML, which has decimal character references only. Later, the hexadecimal alternative was added in HTML 4.01 (and soon implemented in browsers), then retrofitted into SGML in the Web Adaptations Annex
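
As a concrete illustration of the two notations, both forms name the same code point and differ only in the base used to write the number. The short Python sketch below builds both references for one character and decodes them again. The helper name char_refs is an assumption.

```python
import html


def char_refs(ch):
    """Return the decimal and hexadecimal character references for ch."""
    code_point = ord(ch)
    return "&#{};".format(code_point), "&#x{:X};".format(code_point)


decimal_ref, hex_ref = char_refs("\u2022")  # the bullet character

print(decimal_ref, hex_ref)
print(html.unescape(decimal_ref) == html.unescape(hex_ref))
```

The hexadecimal form lines up directly with the U+2022 notation used in Unicode charts, which is one practical reason to prefer it.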
