Convert Regular Expression pattern from Javascript to PCRE (perl)

Submitted by 荒凉一梦 on 2020-01-05 04:01:10

Question


This is my JavaScript regex pattern:

    url = "http://www.amazon.com/gp";
    hostname = /^((\w+):\/\/\/?)?((\w+):?(\w+)?@)?([^\/\?:]+):?(\d+)?(\/?[^\?#;\|]+)?([;\|])?([^\?#]+)?\??([^#]+)?#?(\w*)/.exec(url) || [];
    // hostname[6] would be "www.amazon.com"
The regex above extracts the hostname from a given URL. I need this line to work using PCRE (C++). As you can see, I already added another '\' to each '\', but it still doesn't work.

What additional changes do I need to make for it to work with PCRE instead of JavaScript? Or is that not possible, so that I need to build an entirely new pattern for PCRE?

This is a simplified version of my code:

#include <iostream>
#include <string>
#include <pcrecpp.h>

using std::string;

int main(void)
{
    string text = "http://www.amazon.com";
    string hostname;
    pcrecpp::RE re("^((\\w+):\\/\\/\\/?)?((\\w+):?(\\w+)?@)?([^\\/\\?:]+):?(\\d+)?(\\/?[^\\?#;\\|]+)?([;\\|])?([^\\?#]+)?\\??([^#]+)?#?(\\w*)");
    if (re.PartialMatch(text, &hostname))
    {
        std::cout << "match: " << hostname << "\n";
    } else {
        std::cout << "no match.\n";
    }
    return 0;
}

Thanks.


Answer 1:


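There's no need to convert the pattern. The only things you have to take care of are the backslash escaping required by C++ string literals and the / delimiter, which is part of the JavaScript literal syntax and not of the pattern itself.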
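To illustrate the escaping point, here is a minimal sketch using only the standard library. The names `kEscaped`, `kRaw` and `strip_js_delimiters` are made up for this example, and the helper assumes a literal without trailing flags. It shows that a C++11 raw string literal lets you write each backslash once, and that the enclosing slashes are simply dropped.

```cpp
#include <string>

// The opening part of the pattern from the question, written two ways.
// Ordinary literal: every backslash meant for the regex engine is doubled.
const std::string kEscaped = "^((\\w+):\\/\\/\\/?)?";
// Raw literal (C++11): backslashes are written exactly as the engine sees them.
const std::string kRaw = R"(^((\w+):\/\/\/?)?)";

// Hypothetical helper: drop the enclosing slashes of a JavaScript regex
// literal such as /a\/b/. Trailing flags (e.g. /.../gi) are not handled.
std::string strip_js_delimiters(const std::string& literal) {
    if (literal.size() >= 2 && literal.front() == '/' && literal.back() == '/') {
        return literal.substr(1, literal.size() - 2);
    }
    return literal;  // not delimited: return unchanged
}
```

Both constants hold the same characters, so either form can be passed to a regex constructor. The `\/` sequences may stay as they are, since escaping a slash is harmless outside a JavaScript literal.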

Do note that a regular expression might not be what you want to use here, or at least not directly like this. There are lots of URL parsing libraries that are much better suited to this task, HTParse for example.
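As a rough idea of what a non-regex approach looks like, here is a hand-rolled sketch. It is not HTParse and not a substitute for a real URL library: the function name `host_of` is hypothetical, and it assumes URLs of the shape `[scheme://][user[:password]@]host[:port][/path]`, with no IPv6 literals.

```cpp
#include <string>

// Hypothetical helper: return the host part of a URL.
// Assumes [scheme://][user[:password]@]host[:port][/path...]; IPv6 is not handled.
std::string host_of(const std::string& url) {
    std::size_t start = 0;

    // Treat "://" as a scheme separator only if it comes before any '/', '?' or '#'.
    const std::size_t first_delim = url.find_first_of("/?#");
    const std::size_t scheme = url.find("://");
    if (scheme != std::string::npos && scheme < first_delim) {
        start = scheme + 3;
    }

    // The authority section ends at the first '/', '?' or '#' after the scheme.
    std::size_t end = url.find_first_of("/?#", start);
    if (end == std::string::npos) end = url.size();

    // Skip "user:password@" if present inside the authority.
    const std::size_t at = url.find('@', start);
    if (at != std::string::npos && at < end) start = at + 1;

    // Cut off ":port" if present.
    const std::size_t colon = url.find(':', start);
    if (colon != std::string::npos && colon < end) end = colon;

    return url.substr(start, end - start);
}
```

For example, `host_of("http://www.amazon.com/gp")` yields `www.amazon.com`. A dedicated library covers many cases this sketch ignores.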

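Your C++ code should work, but your regex has a lot of optional groups, so it's hard to be sure which group the hostname will end up in. PartialMatch fills its arguments with the capture groups in order, so a single argument only receives group 1.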

As hacky as it may be, my edit works for this input:

string text = "http://www.amazon.com";
string tmp;       // scratch target for the capture groups we don't need
string hostname;
pcrecpp::RE re("^((\\w+):\\/\\/\\/?)?((\\w+):?(\\w+)?@)?([^\\/\\?:]+):?(\\d+)?(\\/?[^\\?#;\\|]+)?([;\\|])?([^\\?#]+)?\\??([^#]+)?#?(\\w*)");
// Arguments map to capture groups in order: groups 1-5 go to tmp, group 6 is the hostname.
if (re.PartialMatch(text, &tmp, &tmp, &tmp, &tmp, &tmp, &hostname))
{
    std::cout << "match: " << hostname << "\n";
} else {
    std::cout << "no match.\n";
}
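The group numbering can be checked without PCRE installed. The sketch below swaps in `std::regex` (whose default ECMAScript grammar accepts the same pattern) in place of pcrecpp, purely for illustration. `kUrlPattern` and `capture_group` are names invented for this example. Groups are numbered by their opening parenthesis from left to right, which is why the hostname is group 6 and why a single output argument would only see the scheme.

```cpp
#include <regex>
#include <string>

// The pattern from the question as a raw string literal (single backslashes).
static const char* const kUrlPattern =
    R"re(^((\w+):\/\/\/?)?((\w+):?(\w+)?@)?([^\/\?:]+):?(\d+)?(\/?[^\?#;\|]+)?([;\|])?([^\?#]+)?\??([^#]+)?#?(\w*))re";

// Hypothetical helper: return the text of capture group n for the given URL,
// or an empty string if the pattern does not match or the group is unset.
std::string capture_group(const std::string& url, std::size_t n) {
    static const std::regex re(kUrlPattern);  // ECMAScript grammar by default
    std::smatch m;
    if (!std::regex_search(url, m, re)) return "";
    return m[n].str();
}
```

With the input from the question, `capture_group("http://www.amazon.com", 1)` gives `http://` and `capture_group("http://www.amazon.com", 6)` gives `www.amazon.com`.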



Answer 2:


"^((\\w+):\\/\\/\\/?)?((\\w+):?(\\w+)?@)?([^\\/\\?:]+):?(\\d+)?(\\/?[^\\?#;\\|]+)?([;\\|])?([^\\?#]+)?\\??([^#]+)?#?(\\w*)"


Source: https://stackoverflow.com/questions/2359721/convert-regular-expression-pattern-from-javascript-to-pcre-perl
