I need to dynamically construct an XPath query for an element attribute, where the attribute value is provided by the user. I'm unsure how to go about cleaning or sanitizi
XPath does include a way of doing this safely: it permits variable references of the form $varname in expressions. The library on which PHP's SimpleXML is based provides an interface for supplying variables; however, this is not exposed by the xpath function in your example.
To demonstrate how simple this can be:
>>> from lxml import etree
>>> n = etree.fromstring('<n a=\'He said "I&#39;m here"\'/>')
>>> n.xpath("@a=$maybeunsafe", maybeunsafe='He said "I\'m here"')
True
That example uses lxml, a Python wrapper for libxml2, the same underlying library as SimpleXML, with a similar xpath function. Booleans, numbers, and node-sets can also be passed directly as variables.
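As a slightly fuller sketch of the same idea, the following passes a string and a number as XPath variables inside a predicate. The document structure and the helper name find_users are made up for illustration; only the keyword-argument form of lxml's xpath method is taken from the example above.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring(
    '<users>'
    '<user name="alice" age="30"/>'
    '<user name="O\'Brien" age="41"/>'
    '</users>'
)


def find_users(tree, name, min_age):
    # $name and $min_age are bound by keyword argument, so the user's
    # input is never spliced into the expression text itself.
    return tree.xpath(
        "//user[@name=$name and @age >= $min_age]",
        name=name,
        min_age=min_age,
    )


matches = find_users(root, "O'Brien", 40)
hostile = find_users(root, "' or '1'='1", 0)
```

Because the value travels as a variable rather than as part of the expression, an input that looks like an injection attempt is simply compared as an ordinary string.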
If switching to a more capable XPath interface is not an option, a workaround when given an external string would be something along these lines (feel free to adapt it to PHP):
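def safe_xpath_string(strvar):
    if "'" in strvar:
        # An apostrophe cannot appear inside a single-quoted XPath literal,
        # so split on apostrophes and rejoin the pieces with concat().
        return "concat('" + "',\"'\",'".join(strvar.split("'")) + "')"
    # No apostrophe: a single-quoted literal is enough.
    return "'" + strvar + "'"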
The return value can be directly inserted in your expression string. As that's not actually very readable, here is how it behaves:
>>> print(safe_xpath_string("basic"))
'basic'
>>> print(safe_xpath_string('He said "I\'m here"'))
concat('He said "I',"'",'m here"')
Note that you can't use escaping of the form &apos; outside of an XML document, nor are generic XML serialisation routines applicable. However, the XPath concat function can be used to create a string containing both types of quotes in any context.
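A variant of the same approach, shown here only as a sketch, picks the simplest quoting that works and falls back to concat only when the value contains both kinds of quote. The names xpath_literal and attribute_query are hypothetical, and the attribute name is assumed to come from trusted code rather than from the user.

```python
def xpath_literal(value):
    """Return an XPath 1.0 expression that evaluates to the string `value`."""
    if "'" not in value:
        # No apostrophe: a single-quoted literal is enough.
        return "'" + value + "'"
    if '"' not in value:
        # Apostrophes but no double quotes: a double-quoted literal works.
        return '"' + value + '"'
    # Both quote types present: wrap each apostrophe-free piece in
    # apostrophes and put a double-quoted apostrophe between the pieces.
    pieces = value.split("'")
    return "concat(" + ",\"'\",".join("'" + p + "'" for p in pieces) + ")"


def attribute_query(attr_name, user_value):
    # attr_name is assumed to be trusted; only user_value is quoted.
    return "//*[@" + attr_name + "=" + xpath_literal(user_value) + "]"


query = attribute_query("a", "x' or '1'='1")
```

Here query is a single comparison against the whole user-supplied string, so the embedded "or" never becomes part of the expression's logic.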
PHP variant:
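function safe_xpath_string($value)
{
    $quote = "'";
    if (FALSE === strpos($value, $quote))
        // No apostrophe: a single-quoted XPath literal is enough.
        return $quote.$value.$quote;
    else
        // Split on apostrophes and rejoin the pieces inside concat().
        return sprintf("concat('%s')", implode("', \"'\", '", explode($quote, $value)));
}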