is anybody aware of a list of exactly what triggers ASP.NET\'s HttpRequestValidationException? [This is behind the common error: \"A potentially dangerous Request.Form value
Warning: Some parts of the code in the original answer (below) were removed and marked as OBSOLETE.
http://referencesource.microsoft.com/#System.Web/CrossSiteScriptingValidation.cs
After checking the newest code you will probably agree that what Travis Illig explained are the only validations used now in 2018 (and seems to have no changes since 2014 when the source was released in GitHub). But the old code below may still be relevant if you use an older version of the framework.
Using Reflector, I did some browsing. Here's the raw code. When I have time I will translate this into some meaningful rules:
The HttpRequestValidationException
is thrown by only a single method in the System.Web
namespace, so it's rather isolated. Here is the method:
private void ValidateString(string s, string valueName, string collectionName)
{
int matchIndex = 0;
if (CrossSiteScriptingValidation.IsDangerousString(s, out matchIndex))
{
string str = valueName + "=\"";
int startIndex = matchIndex - 10;
if (startIndex <= 0)
{
startIndex = 0;
}
else
{
str = str + "...";
}
int length = matchIndex + 20;
if (length >= s.Length)
{
length = s.Length;
str = str + s.Substring(startIndex, length - startIndex) + "\"";
}
else
{
str = str + s.Substring(startIndex, length - startIndex) + "...\"";
}
throw new HttpRequestValidationException(HttpRuntime.FormatResourceString("Dangerous_input_detected", collectionName, str));
}
}
That method above makes a call to the IsDangerousString
method in the CrossSiteScriptingValidation
class, which validates the string against a series of rules. It looks like the following:
internal static bool IsDangerousString(string s, out int matchIndex)
{
matchIndex = 0;
int startIndex = 0;
while (true)
{
int index = s.IndexOfAny(startingChars, startIndex);
if (index < 0)
{
return false;
}
if (index == (s.Length - 1))
{
return false;
}
matchIndex = index;
switch (s[index])
{
case 'E':
case 'e':
if (IsDangerousExpressionString(s, index))
{
return true;
}
break;
case 'O':
case 'o':
if (!IsDangerousOnString(s, index))
{
break;
}
return true;
case '&':
if (s[index + 1] != '#')
{
break;
}
return true;
case '<':
if (!IsAtoZ(s[index + 1]) && (s[index + 1] != '!'))
{
break;
}
return true;
case 'S':
case 's':
if (!IsDangerousScriptString(s, index))
{
break;
}
return true;
}
startIndex = index + 1;
}
}
That IsDangerousString
method appears to be referencing a series of validation rules, which are outlined below:
private static bool IsDangerousExpressionString(string s, int index)
{
if ((index + 10) >= s.Length)
{
return false;
}
if ((s[index + 1] != 'x') && (s[index + 1] != 'X'))
{
return false;
}
return (string.Compare(s, index + 2, "pression(", 0, 9, true, CultureInfo.InvariantCulture) == 0);
}
-
private static bool IsDangerousOnString(string s, int index)
{
if ((s[index + 1] != 'n') && (s[index + 1] != 'N'))
{
return false;
}
if ((index > 0) && IsAtoZ(s[index - 1]))
{
return false;
}
int length = s.Length;
index += 2;
while ((index < length) && IsAtoZ(s[index]))
{
index++;
}
while ((index < length) && char.IsWhiteSpace(s[index]))
{
index++;
}
return ((index < length) && (s[index] == '='));
}
-
private static bool IsAtoZ(char c)
{
return (((c >= 'a') && (c <= 'z')) || ((c >= 'A') && (c <= 'Z')));
}
-
private static bool IsDangerousScriptString(string s, int index)
{
int length = s.Length;
if ((index + 6) >= length)
{
return false;
}
if ((((s[index + 1] != 'c') && (s[index + 1] != 'C')) || ((s[index + 2] != 'r') && (s[index + 2] != 'R'))) || ((((s[index + 3] != 'i') && (s[index + 3] != 'I')) || ((s[index + 4] != 'p') && (s[index + 4] != 'P'))) || ((s[index + 5] != 't') && (s[index + 5] != 'T'))))
{
return false;
}
index += 6;
while ((index < length) && char.IsWhiteSpace(s[index]))
{
index++;
}
return ((index < length) && (s[index] == ':'));
}
So there you have it. It's not pretty to decipher, but it's all there.
Try this regular expresson pattern.
You may need to ecape the \ for javascript ex \\
var regExpPattern = '[eE][xX][pP][rR][eE][sS][sS][iI][oO][nN]\\(|\\b[oO][nN][a-zA-Z]*\\b\\s*=|&#|<[!/a-zA-Z]|[sS][cC][rR][iI][pP][tT]\\s*:';
var re = new RegExp("","gi");
re.compile(regExpPattern,"gi");
var outString = null;
outString = re.exec(text);
I couldn't find a document outlining a conclusive list, but looking through Reflector and doing some analysis on use of HttpRequestValidationException, it looks like validation errors on the following can cause the request validation to fail:
The question, then, is "what qualifies one of these things as a dangerous input?" That seems to happen during an internal method System.Web.CrossSiteScriptingValidation.IsDangerousString(string, out int) which looks like it decides this way:
<
or &
in the value. If it's not there, or if it's the last character in the value, then the value is OK.&
character is in a &#
sequence (e.g.,  
for a non-breaking space), it's a "dangerous string."<
character is part of <x
(where "x" is any alphabetic character a-z), <!
, </
, or <?
, it's a "dangerous string."The System.Web.CrossSiteScriptingValidation type seems to have other methods in it for determining if things are dangerous URLs or valid JavaScript IDs, but those don't appear, at least through Reflector analysis, to result in throwing HttpRequestValidationExceptions.
How about this script? Your code can not detect this script, right?
";}alert(1);function%20a(){//
From MSDN:
'The exception that is thrown when a potentially malicious input string is received from the client as part of the request data. '
Many times this happens when JavaScript changes the values of a server side control in a way that causes the ViewState to not agree with the posted data.
Following on from Travis' answer, the list of 'dangerous' character sequences can be simplified as follows;
Based on this, in an ASP.Net MVC web app the following Regex validation attribute can be used on a model field to trigger client side validation before an HttpRequestValidationException is thrown when the form is submitted;
[RegularExpression(@"^(?![\s\S]*(&#|<[a-zA-Z!\/?]))[\s\S]*$", ErrorMessage = "This field does not support HTML or allow any of the following character sequences; "&#", "<A" through to "<Z" (upper and lower case), "<!", "</" or "<?".")]
Note that validation attribute error messages are HTML encoded when output by server side validation, but not when used in client side validation, so this one is already encoded as we only intend to see it with client side validation.