Question
I am trying to build a regular expression to validate (with preg_match) a path string against the following two rules:
- the path must consist only of symbols from the given range
[a-zA-Z0-9-_\///\.]
- the path must not contain the parent-directory sequence ".."
This is an example of a correct path: /user/temp
and of an incorrect one: /../user
UPD:
/user/temp.../foo
is also a correct path (thanks to Laurence Gonsalves)
Answer 1:
Consider this:
$right_path = '/user/temp';
$wrong_path = '/../user';
$almost_wrong_path = 'foo/abc../bar';
$almost_right_path = 'foo/../bar';
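// Inside a single-quoted PHP string, \\\\ yields the regex escape \\ (one literal backslash).
$pattern = '#^(?!.*[\\\\/]\.{2}[\\\\/])(?!\.{2}[\\\\/])[-\w.\\\\/]+$#';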
var_dump(preg_match($pattern, $right_path)); // 1
var_dump(preg_match($pattern, $wrong_path)); // 0
var_dump(preg_match($pattern, $almost_wrong_path)); // 1
var_dump(preg_match($pattern, $almost_right_path)); // 0
I've actually built this pattern in three steps:
1) The first rule says that the only symbols allowed in the string are 0-9, a-zA-Z, _ (underscore), - (hyphen), . (dot) and both slashes (/ and \). The first three can be expressed with a shortcut (\w); the others require a character class:
[-\w.\\/]
Note two things here: 1) the hyphen should be either the first or the last symbol in the character class (otherwise it is treated as a metacharacter that defines a range); 2) neither the dot nor the forward slash is escaped: a dot is literal inside a character class, and the forward slash is not special as long as it is not used as the pattern delimiter. The backslash is escaped, though; it is too powerful to be left alone, even within a [...] subexpression.
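The point about hyphen placement can be checked with a small sketch. The two classes below are illustrative only and are not part of the pattern being built; they differ solely in where the hyphen sits.

```php
<?php
// '_-a' in the middle of a class is a range from '_' (0x5F) to 'a' (0x61),
// which includes the backtick (0x60) but not the hyphen itself.
$range_class = '#^[_-a]$#';
// With the hyphen first, the class is just three literal characters.
$literal_class = '#^[-_a]$#';

var_dump(preg_match($range_class, '-'));   // int(0)
var_dump(preg_match($range_class, '`'));   // int(1)
var_dump(preg_match($literal_class, '-')); // int(1)
var_dump(preg_match($literal_class, '`')); // int(0)
```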
2) Now we have to make sure that the pattern covers the whole string. We do this with so-called anchors: ^ for the beginning of the string and $ for the end. And let's not forget that our string may consist of one or more allowed symbols (this is expressed with the + quantifier). So the pattern becomes:
^[-\w.\\/]+$
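As a rough illustration of why the anchors matter, the sketch below compares the class with and without them. It assumes the pattern is stored in a single-quoted PHP string, where \\\\ is needed to produce the regex escape \\ for a literal backslash; the variable names are made up for this example.

```php
<?php
// Same character class, without and with the ^...$ anchors.
$unanchored = '#[-\w.\\\\/]+#';
$anchored   = '#^[-\w.\\\\/]+$#';

// The space is not an allowed symbol.
$input = '/user/te mp';

var_dump(preg_match($unanchored, $input));     // int(1): '/user/te' alone satisfies it
var_dump(preg_match($anchored, $input));       // int(0): the whole string must fit
var_dump(preg_match($anchored, '/user/temp')); // int(1)
```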
3) One last thing: we also have to reject ../ and ..\ when they are preceded by / or \, and when the ..[/\\] sequence begins the string.
The easiest way to express this rule is a so-called 'negative lookahead' test. It is written as a (?!...) subexpression and, in this case, describes the following idea: 'make sure that no sequence of zero or more symbols is followed by a "slash, two dots, slash" sequence'. A second lookahead, (?!\.{2}[\\/]), covers the case where the string itself begins with two dots and a slash:
^(?!.*[\\/]\.{2}[\\/])(?!\.{2}[\\/])[-\w.\\/]+$
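To see what each lookahead contributes, here is a minimal sketch that tests the pattern with only the first lookahead and then with both. The variable names are hypothetical, and the backslashes are doubled once more (\\\\) on the assumption that the patterns live in single-quoted PHP strings.

```php
<?php
// Shared tail of the pattern: one or more allowed symbols up to the end.
$body = '[-\w.\\\\/]+$#';

// Only the "slash, two dots, slash" lookahead.
$first_only = '#^(?!.*[\\\\/]\.{2}[\\\\/])' . $body;
// Both lookaheads, as in the full pattern.
$both = '#^(?!.*[\\\\/]\.{2}[\\\\/])(?!\.{2}[\\\\/])' . $body;

var_dump(preg_match($first_only, 'foo/../bar')); // int(0)
var_dump(preg_match($first_only, '../user'));    // int(1): a leading ".." slips through
var_dump(preg_match($both, '../user'));          // int(0)
var_dump(preg_match($both, 'foo\\..\\bar'));     // int(0): backslash separators count too
var_dump(preg_match($both, 'foo/abc../bar'));    // int(1): "abc.." is an ordinary name
```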
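One last thing is actually placing the pattern into the preg_match function: as we use the / symbol within the regex, we can just choose another set of delimiters. In my example, I chose '#'. Note also that inside a single-quoted PHP string \\ collapses to a single backslash, so the regex escape \\ has to be written as \\\\:
$pattern = '#^(?!.*[\\\\/]\.{2}[\\\\/])(?!\.{2}[\\\\/])[-\w.\\\\/]+$#';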
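For reuse, the check could be wrapped in a small function. This is only a sketch: the name is_allowed_path() is hypothetical and not from the answer, and the pattern is written with \\\\ on the assumption that a literal backslash must survive PHP's single-quote escaping.

```php
<?php
// Hypothetical helper; the name is an assumption made for this example.
function is_allowed_path(string $path): bool
{
    $pattern = '#^(?!.*[\\\\/]\.{2}[\\\\/])(?!\.{2}[\\\\/])[-\w.\\\\/]+$#';
    // preg_match() returns 1 on a match, 0 on no match and false on error,
    // so compare strictly against 1.
    return preg_match($pattern, $path) === 1;
}

var_dump(is_allowed_path('/user/temp'));        // bool(true)
var_dump(is_allowed_path('/../user'));          // bool(false)
var_dump(is_allowed_path('/user/temp.../foo')); // bool(true)
var_dump(is_allowed_path('/user/te mp'));       // bool(false)
var_dump(is_allowed_path('/user/..'));          // bool(true)
```

As the last call shows, both lookaheads require a slash after the two dots, so a path that ends in /.. is still accepted. Whether that needs tightening depends on how the path is used afterwards.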
See? It's really easy. You just have to start with small things and gradually build on them.
Source: https://stackoverflow.com/questions/13260371/check-directory-path-for-symbols-range-and-up-directory-sign