I've got a string with HTML attributes:
$attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "';
You could use a regular expression to extract that information:
$attribs = ' id= "header " class = "foo bar" style ="background-color:#fff; color: red; "';
// attribute name, "=", then a double-quoted, single-quoted, or unquoted value
$pattern = '/(\w+)\s*=\s*("[^"]*"|\'[^\']*\'|[^"\'\s>]*)/';
preg_match_all($pattern, $attribs, $matches, PREG_SET_ORDER);
$attrs = array();
foreach ($matches as $match) {
    // strip the surrounding quotes, if any; the length check avoids an
    // "uninitialized string offset" warning on empty values such as id=
    if (strlen($match[2]) >= 2 && ($match[2][0] == '"' || $match[2][0] == "'") && $match[2][0] == $match[2][strlen($match[2])-1]) {
        $match[2] = substr($match[2], 1, -1);
    }
    $name = strtolower($match[1]);
    $value = html_entity_decode($match[2]);
    switch ($name) {
        case 'class':
            // class holds a whitespace-separated list of class names
            $attrs[$name] = preg_split('/\s+/', trim($value));
            break;
        case 'style':
            // parse CSS property declarations
            break;
        default:
            $attrs[$name] = $value;
    }
}
var_dump($attrs);
The code above already splits the value of class at whitespace. What remains is parsing the property declarations of style, which is a little harder because the value can contain comments and URLs with ; in them.
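The style case could be handled roughly like this. It is only a minimal sketch, not a full CSS parser: the function name parse_style_declarations is made up for this example, and it assumes that comments never appear inside quoted strings or url(...) and that url(...) values contain no closing parenthesis.

```php
<?php
// Hypothetical helper (not part of the original answer): turns an inline
// style value into an array of property => value pairs.
function parse_style_declarations(string $style): array
{
    // Drop CSS comments first. Assumption: "/*" never occurs inside a
    // quoted string or a URL.
    $style = preg_replace('~/\*.*?\*/~s', '', $style);

    // Collect runs of text up to the next ";" while treating quoted strings
    // and url(...) as single units, so a ";" inside them does not split.
    $pattern = '/(?:"[^"]*"|\'[^\']*\'|url\([^)]*\)|[^;])+/';
    preg_match_all($pattern, $style, $chunks);

    $declarations = array();
    foreach ($chunks[0] as $chunk) {
        // Split at the first colon only; values such as URLs may contain more.
        $parts = explode(':', $chunk, 2);
        if (count($parts) < 2) {
            continue; // whitespace only, or a malformed declaration
        }
        $property = strtolower(trim($parts[0]));
        if ($property !== '') {
            $declarations[$property] = trim($parts[1]);
        }
    }
    return $declarations;
}

$style = 'background-color:#fff; color: red; background: url("a;b.png") no-repeat; /* note; */';
var_dump(parse_style_declarations($style));
```

Inside the switch, the style case could then become $attrs[$name] = parse_style_declarations($value);. For anything beyond simple inline styles, a real CSS parser is likely the safer choice.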