I need to detect whether a string contains HTML tags.
if(!preg_match('(?<=<)\w+(?=[^<]*?>)', $string)){
return $string;
}
Parsing HTML in general is a hard problem; there is some good material here:
But regarding your question (a 'better' solution): can you be more specific about what you are trying to achieve and what tools are available to you?
I would use strlen() to compare the lengths of the original and the stripped string,
because if you don't, a character-by-character comparison is done, and that can be slow, though I would expect the comparison to quit as soon as it finds a difference.
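A minimal sketch of the length-based check suggested here, assuming only a yes/no answer is needed. The function name `has_tags_by_length` is made up for illustration. It relies on the fact that strip_tags() only ever removes characters, so equal lengths mean nothing was stripped.

```php
<?php
// Hypothetical helper: true when strip_tags() removed at least one character.
function has_tags_by_length(string $string): bool
{
    // strip_tags() never adds characters, so comparing lengths is
    // equivalent to comparing the two strings themselves.
    return strlen($string) !== strlen(strip_tags($string));
}

var_dump(has_tags_by_length('<p>Hello</p>')); // bool(true)
var_dump(has_tags_by_length('Hello'));        // bool(false)
```

Whether this is measurably faster than a plain string comparison is untested here; it mainly avoids comparing the contents at all.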
If you just want to detect/replace certain tags: this function searches for certain HTML tags and wraps them in square brackets, which is pretty pointless by itself; just modify it to do whatever you want with the tags.
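$html = preg_replace_callback(
    '|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|',
    function ($found) {
        // Tags to wrap in brackets; the comparison is case-sensitive.
        $tags = array('div','p','span','b','a','strong','center','br','h1','h2','h3','h4','h5','h6','hr');
        if (in_array($found[1], $tags)) {
            return '[' . $found[0] . ']';
        }
        // Return every other tag unchanged; returning nothing would delete it.
        return $found[0];
    },
    $html
);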
Explanation of the regex:
\< ... \> //starts and ends with tag brackets
\</? //can start with a slash for closing tags
([a-zA-Z]+[1-6]?) //the tag itself (for example "h1")
(\s[^>]*)? //anything such as class=... style=... etc.
(\s?/)? //allow self-closing tags such as <br />
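To see what the first capture group picks up, here is a small sketch that runs the same pattern through preg_match_all() and lists the tag names it finds. The sample markup and variable names are made up for illustration.

```php
<?php
// Same pattern as above; group 1 holds the tag name.
$pattern = '|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|';

// Hypothetical sample input.
$html = '<h1 class="t">Hi</h1><br/>';

// $matches[1] collects the group-1 capture of every match, in order.
preg_match_all($pattern, $html, $matches);

print_r($matches[1]); // h1, h1, br
```

Opening and closing tags both report the same name, because the optional leading slash sits outside the capture group.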
You need to delimit the regex with some character or another. Try this:
if(!preg_match('#(?<=<)\w+(?=[^<]*?>)#', $string)){
return $string;
}
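As a rough illustration of the delimiter point: PHP's preg_* functions expect the pattern to be wrapped in a matching pair of delimiter characters, and any character that does not clash with the pattern will do. The sketch below wraps the check in a helper; the name `contains_tag` is hypothetical.

```php
<?php
// Hypothetical helper around the delimited pattern.
function contains_tag(string $string): bool
{
    // '#' is the delimiter here; '~' or '/' would work as well,
    // as long as the delimiter is escaped wherever it appears in the pattern.
    return preg_match('#(?<=<)\w+(?=[^<]*?>)#', $string) === 1;
}

var_dump(contains_tag('<b>hi</b>'));  // bool(true)
var_dump(contains_tag('plain text')); // bool(false)
```

Comparing against 1 keeps the result a strict boolean, since preg_match() returns 1, 0, or false on error.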
A simple solution is:
if($string != strip_tags($string)) {
// contains HTML
}
The benefit of this over a regex is that it is easier to understand; however, I cannot comment on the speed of execution of either solution.
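For anyone curious about the speed question, a rough timing sketch follows. The sample input and iteration count are arbitrary assumptions, and the numbers will vary with input size and PHP version, so treat it as a starting point and not as a benchmark result.

```php
<?php
// Hypothetical sample input; adjust to resemble your real data.
$sample = str_repeat('Some <b>bold</b> text. ', 1000);
$iterations = 1000;

// Time the strip_tags() comparison.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $viaStrip = ($sample != strip_tags($sample));
}
$stripSeconds = microtime(true) - $start;

// Time the regex check.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $viaRegex = (preg_match('#(?<=<)\w+(?=[^<]*?>)#', $sample) === 1);
}
$regexSeconds = microtime(true) - $start;

printf("strip_tags: %.4f s, preg_match: %.4f s\n", $stripSeconds, $regexSeconds);
```

Keep in mind that preg_match() stops at the first match, while strip_tags() always processes the whole string, so the comparison depends heavily on where the first tag appears.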
If the purpose is just to check whether a string contains an HTML tag, regardless of whether the tags are valid, you can try this.
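function is_html($string) {
    // Check if the string contains anything shaped like a tag: "<", any text, ">".
    // preg_match() returns 1, 0 or false, so compare against 1 to get a boolean.
    return preg_match('/<\s?[^\>]*\/?\s?>/i', $string) === 1;
}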
This matches anything that looks like a tag, whether or not it is valid HTML. You can confirm it here: https://regex101.com/r/2g7Fx4/3
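A few sample inputs make the trade-off of such a permissive pattern visible. The sketch below uses a separately named helper, `looks_like_html` (hypothetical), with the same pattern, so it can run on its own.

```php
<?php
// Hypothetical helper using the same permissive pattern.
function looks_like_html(string $string): bool
{
    return preg_match('/<\s?[^\>]*\/?\s?>/i', $string) === 1;
}

var_dump(looks_like_html('<b>bold</b>'));     // bool(true)
var_dump(looks_like_html('plain text'));      // bool(false)
// Not HTML, but still matches: "<", some text, then ">".
var_dump(looks_like_html('1 < 2 and 3 > 2')); // bool(true)
```

If false positives like the last case matter, a stricter pattern that requires a tag name directly after the opening bracket may be a better fit.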