preg_match breaks with more than 5100 characters

对着背影说爱祢 submitted on 2019-12-12 19:20:10

Question


I'm trying to match plain text. Here is the regex:

   $variable = "newteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewt
eststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststory
newteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststorynewteststory";
   $result = preg_match('/^([a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}])+$/',$variable);

It fails, and I can't figure out why.


Answer 1:


A few suggestions to try:

  1. Move the + inside of the capture group so it needs to do less internal shifting.

    $result = preg_match('/^([a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]+)$/',$variable);
    
  2. Why capture at all?

    $result = preg_match('/^[a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]+$/',$variable);
    
  3. Why not just negate so it doesn't need to build up the internal state:

    $result = !preg_match('/[^a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]/',$variable);
    

What I think is happening is that you're overrunning the internal buffer that PCRE uses to keep track of its state. Try a negated preg_match: since you're not using the captured group anyway, all you care about is whether the string contains an invalid character.
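To check whether PCRE itself gave up (rather than the text simply not matching), one option is to look at preg_last_error() after the call. The following is only a sketch: describePregFailure is a hypothetical helper name, and it assumes PHP 7 or later. A crash caused by a stack overflow inside PCRE would not be reported this way, since the process dies before it can return.

```php
<?php
// Hypothetical helper: run a pattern and say whether it matched, did not
// match, or failed inside the PCRE engine.
function describePregFailure(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    if ($result !== false) {
        return $result === 1 ? 'matched' : 'no match';
    }

    // preg_match() returned false: an engine error, not a failed match.
    $names = [
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
        PREG_JIT_STACKLIMIT_ERROR  => 'PREG_JIT_STACKLIMIT_ERROR',
    ];
    $code = preg_last_error();

    return $names[$code] ?? "unknown PCRE error ($code)";
}

// A 6000-character subject, similar in size to the one in the question.
echo describePregFailure('/^[a-z]+$/', str_repeat('newteststory', 500)), "\n";
```

If the helper reports one of the limit errors, the pcre.backtrack_limit and pcre.recursion_limit ini settings are the usual places to look, though rewriting the pattern as suggested above is generally the safer fix.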

And a nitpick: $.+^()?<>*{} don't need escaping at all inside a character class. Of the characters in your class, the only ones that do are your delimiter (/), - and a leading ^ character (which doesn't apply to you).
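The escaping point can be checked empirically. The sketch below compares the class exactly as written in the question against a version that keeps only the escapes for -, the / delimiter and the single quote (that last one is required by the PHP string literal, not by PCRE). The variable names are made up for illustration, and the check only covers the 128 ASCII characters.

```php
<?php
// The class as written in the question, with every escape left in place.
$escaped = '/[a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]/';

// The same class with only the escapes that are still needed.
$minimal = '/[a-zA-Z0-9 \n\r,!@#$.+\-%^&()~`\'":;_=?\/|<>*{}]/';

// Collect every ASCII code on which the two patterns disagree.
$differences = [];
for ($code = 0; $code < 128; $code++) {
    $char = chr($code);
    if (preg_match($escaped, $char) !== preg_match($minimal, $char)) {
        $differences[] = $code;
    }
}

// An empty list means both classes accept exactly the same ASCII characters.
var_dump($differences);
```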




Answer 2:


You are matching text that is too long. It often happens to me too when a subpattern has to match very long text. Here are some things that you can try:

Remove the capturing parentheses:

$result = preg_match('/^[a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]+$/',$variable);

It might not work, as it still has to match very long text. Another way to do this is:

Do a negative match that matches only one character not in your set, and make sure that it does not match.

$result = preg_match('/[^a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]/',$variable);

$valid = !$result && strlen($variable) > 0;
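The negative match can be wrapped in a small function. This is a minimal sketch, assuming PHP 7 or later; isPlainText is a hypothetical name, and the character class is copied unchanged from the question. One detail worth handling explicitly: preg_match returns false on an engine error, and a bare !$result would treat that as "no invalid character found".

```php
<?php
// Hypothetical helper: true when $text is non-empty and contains only
// characters from the question's allowed set.
function isPlainText(string $text): bool
{
    // Look for a single character that is NOT in the allowed set.
    $invalid = preg_match(
        '/[^a-zA-Z0-9 \n\r,!@#\$\.\+\-%\^&\(\)~`\'":;_=\?\\/\|\<\>\*\{\}]/',
        $text
    );

    // false means PCRE failed, which must not be mistaken for "valid".
    if ($invalid === false) {
        throw new RuntimeException(
            'preg_match failed with error code ' . preg_last_error()
        );
    }

    return $invalid === 0 && $text !== '';
}

var_dump(isPlainText(str_repeat('newteststory', 500))); // 6000 allowed characters
var_dump(isPlainText("tab\there"));                     // tab is not in the set
var_dump(isPlainText(''));                              // empty input is rejected
```

Because the pattern can stop at the first offending character and never has to consume the whole string in one repeated group, it should be far less sensitive to input length than the original anchored pattern.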

P.S. Your code runs fine on my computer.



Source: https://stackoverflow.com/questions/5263555/preg-match-breaks-with-more-then-5100-characters
