Cancelling a long-running regex match?

借酒劲吻你 2020-11-27 04:25

Say I'm running a service where users can submit a regex to search through lots of data. If the user submits a regex that is very slow (i.e. takes minutes for Matcher.find()

7 answers
  •  失恋的感觉
    2020-11-27 04:57

    With a little variation it is possible to avoid using additional threads for this:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegularExpressionUtils {
    
        // Demonstrates the timeout on a regular expression that is prone to catastrophic backtracking on the given input.
        public static void main(String[] args) {
            Matcher matcher = createMatcherWithTimeout(
                    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", "(x+x+)+y", 2000);
            System.out.println(matcher.matches());
        }
    
        public static Matcher createMatcherWithTimeout(String stringToMatch, String regularExpression, int timeoutMillis) {
            Pattern pattern = Pattern.compile(regularExpression);
            return createMatcherWithTimeout(stringToMatch, pattern, timeoutMillis);
        }
    
        public static Matcher createMatcherWithTimeout(String stringToMatch, Pattern regularExpressionPattern, int timeoutMillis) {
            CharSequence charSequence = new TimeoutRegexCharSequence(stringToMatch, timeoutMillis, stringToMatch,
                    regularExpressionPattern.pattern());
            return regularExpressionPattern.matcher(charSequence);
        }
    
        private static class TimeoutRegexCharSequence implements CharSequence {
    
            private final CharSequence inner;
    
            private final int timeoutMillis;
    
            private final long timeoutTime;
    
            private final String stringToMatch;
    
            private final String regularExpression;
    
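            public TimeoutRegexCharSequence(CharSequence inner, int timeoutMillis, String stringToMatch, String regularExpression) {
                this(inner, timeoutMillis, System.currentTimeMillis() + timeoutMillis, stringToMatch, regularExpression);
            }
    
            // Used by subSequence() so that sub-sequences share the original deadline instead of restarting the timeout.
            private TimeoutRegexCharSequence(CharSequence inner, int timeoutMillis, long timeoutTime, String stringToMatch,
                    String regularExpression) {
                this.inner = inner;
                this.timeoutMillis = timeoutMillis;
                this.timeoutTime = timeoutTime;
                this.stringToMatch = stringToMatch;
                this.regularExpression = regularExpression;
            }
    
            // The matcher reads its input through charAt(), so checking the clock here aborts a runaway match.
            @Override
            public char charAt(int index) {
                if (System.currentTimeMillis() > timeoutTime) {
                    throw new RuntimeException("Timeout occurred after " + timeoutMillis + "ms while processing regular expression '"
                                    + regularExpression + "' on input '" + stringToMatch + "'!");
                }
                return inner.charAt(index);
            }
    
            @Override
            public int length() {
                return inner.length();
            }
    
            @Override
            public CharSequence subSequence(int start, int end) {
                return new TimeoutRegexCharSequence(inner.subSequence(start, end), timeoutMillis, timeoutTime, stringToMatch,
                        regularExpression);
            }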
    
            @Override
            public String toString() {
                return inner.toString();
            }
        }
    
    }
    

    Thanks a lot to dawce for pointing me to this solution in an answer to an unnecessarily complicated question!
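
    For the use case in the question (a user-supplied pattern run with Matcher.find()), the same idea can be wrapped in a small helper that reports a timeout instead of hanging. What follows is only a minimal sketch and not part of the original answer: the names TimeoutFindDemo, DeadlineCharSequence, RegexTimeoutException and findWithTimeout are made up for illustration. It assumes the matcher reads its input only through charAt(), and it passes one absolute deadline (based on System.nanoTime()) on to sub-sequences so that they cannot restart the clock.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

class TimeoutFindDemo {

    /** Hypothetical exception type so callers can tell a timeout apart from other failures. */
    static final class RegexTimeoutException extends RuntimeException {
        RegexTimeoutException(String message) {
            super(message);
        }
    }

    /** Wraps the input and checks an absolute deadline on every character read. */
    static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public char charAt(int index) {
            // Subtraction keeps the comparison valid even if nanoTime() wraps around.
            if (System.nanoTime() - deadlineNanos >= 0) {
                throw new RegexTimeoutException("Regex match exceeded its time budget");
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            // Share the same deadline instead of starting a new timeout.
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    /** Runs find() on the input and throws RegexTimeoutException once timeoutMillis has elapsed. */
    static boolean findWithTimeout(Pattern pattern, CharSequence input, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        return pattern.matcher(new DeadlineCharSequence(input, deadline)).find();
    }

    public static void main(String[] args) {
        Pattern slow = Pattern.compile("(x+x+)+y");
        String input = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
        try {
            System.out.println("found: " + findWithTimeout(slow, input, 500));
        } catch (RegexTimeoutException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

    Whether the sample pattern actually runs into the timeout depends on the regex implementation of the JDK in use, so main() handles both outcomes. The check only fires when the matcher reads a character, so the timeout is approximate rather than exact.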
