Regexes work in PHP and don't in Erlang. Why?

这一生的挚爱 提交于 2019-12-25 03:22:35

问题


I tried to rewrite url parsing function written in PHP to Erlang. And I found that these regex don't work in Erlang but work fine in PHP code. Can you tell why and how to make it work with Erlang.

Loose = "^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)".

re:compile( Loose ). 
{error,{"nothing to repeat",166}}



Strict = "^(?:([^:\/?#]+):)?(?:\/\/\/?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?))?(((?:\/(\w:))?((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)".

re:compile( Strict ).                  
{error,{"nothing to repeat",114}}

But this code works fine:

$url = "http://gazeta.ru/";

$loose = '/^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\/?#]*)(?::(\d*))?)(((?:\/(\w:))?(\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/';

preg_match($loose, $url, $match);

var_dump( $match );

回答1:


The character "\" is special in strings in Erlang. There are other special characters which must be preceded by a backslash, these include doublequote and backslash. The technique of marking special characters is called escaping and backslash itself is called an escape character. So "\" must be followed with another character. For example if you want to include character '\' (one backslash) into a string you should write "\\":

CorrectString = "C:\\windows" %% Correct

WrongString = "C:\windows" %% Wrong

Hence you have to change all single backslashes in your regexp to double backslashes. Here is an example in erlang shell:

3> Loose = "^(?:(?![^:@]+:[^:@\\/]*@)([^:\\/?#.]+):)?(?:\\/\\/\\/?)?((?:(([^:@]*):?([^:@]*))?@)?([^:\\/?#]*)(?::(\\d*))?)(((?:\\/(\\w:))?(\\/(?:[^?#](?![^?#\\/]*\\.[^?#\\/.]+(?:[?#]|$)))*\\/?)?([^?#\\/]*))(?:\\?([^#]*))?(?:#(.*))?)".                                                                               
4> re:compile(Loose).
{ok,{re_pattern,14,0,                                                        
                <<69,82,67,80,147,2,0,0,16,0,0,0,1,0,0,0,14,0,0,0,0,0,0,
                  ...>>}}


来源:https://stackoverflow.com/questions/25352617/regexes-work-in-php-and-dont-in-erlang-why

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!