Regular expression to detect the search engine and search words

Posted by 折月煮酒 on 2019-12-21 06:26:26

Question


I need to detect which search engines refer visitors to my website. Since every search engine uses a different query-string parameter for the search terms (e.g. Google uses 'q=', Yahoo uses 'p='), I created a database of search engines with their URL regex patterns.

As an example: http://www.google.com/search?q=blabla&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-GB:official&client=firefox-a

The regex I created for Google is:

(http:)(\\/)(\\/)(www)(\\.)(google)(\\.).*(\\/)(search).*(&q=|\\?q=).*

(I am a newbie at regex, but so far it works.)

This detects that the URL belongs to Google. My problem is that I also need to extract the search words from the URL above, and from the URLs of other search engines, but I don't know how to capture them with the regular expression. I have tried extracting the query string from the URL with PHP functions and matching it against the pattern, but that returned nothing.

I hope I have explained this clearly enough.

Any suggestions?


Answer 1:


This blog entry about extracting keywords from the referrer seems like a good match for your problem.

I found it using this search for 'extract query string from google referer url'. The search seems to return a number of helpful hits; I only skimmed the first few.




Answer 2:


I would use parse_url to parse the URL and parse_str to parse the URL query.

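// Example referrer URL (the colons in the rls value are percent-encoded)
$url = 'http://www.google.com/search?q=blabla&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu%3Aen-GB%3Aofficial&client=firefox-a';
// Split the URL into its components (scheme, host, path, query, ...)
$parts = parse_url($url);
if (isset($parts['query'])) {
    // Replace the raw query string with an associative array of its parameters
    parse_str($parts['query'], $parts['query']);
}
var_dump($parts); // $parts['query']['q'] now holds the search words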
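
To connect this back to the question, here is a minimal sketch of how parse_url and parse_str could be combined with a per-engine lookup table, so that one function reports both the search engine and the search words. The `$engines` table, its host patterns, and the `detect_search` function name are assumptions made for illustration; only the parameter names 'q' (Google) and 'p' (Yahoo) come from the question, and a real table would presumably be loaded from the asker's database.

```php
<?php
// Hypothetical lookup table: engine name => host pattern and the query
// parameter that carries the search words. Only 'q' and 'p' are taken
// from the question; the host patterns are illustrative assumptions.
$engines = [
    'google' => ['host' => '/(^|\.)google\.[a-z.]+$/i', 'param' => 'q'],
    'yahoo'  => ['host' => '/(^|\.)yahoo\.[a-z.]+$/i',  'param' => 'p'],
];

/**
 * Returns ['engine' => name, 'words' => search string], or null when the
 * referrer does not match a known engine or carries no search words.
 */
function detect_search(string $referrer, array $engines): ?array
{
    $parts = parse_url($referrer);
    // parse_url returns false for seriously malformed URLs
    if (!is_array($parts) || !isset($parts['host'], $parts['query'])) {
        return null;
    }

    // parse_str splits the query string and URL-decodes the values
    parse_str($parts['query'], $query);

    foreach ($engines as $name => $engine) {
        $param = $engine['param'];
        if (preg_match($engine['host'], $parts['host'])
            && isset($query[$param])
            && is_string($query[$param])
            && $query[$param] !== '') {
            return ['engine' => $name, 'words' => $query[$param]];
        }
    }

    return null;
}
```

Called with the Google URL above, `detect_search($url, $engines)` should return the engine 'google' and the words 'blabla'. Matching only the host with a regex and leaving the query string to parse_str keeps the patterns short and avoids having to handle both '?q=' and '&q=' by hand.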


Source: https://stackoverflow.com/questions/1963883/regular-expression-to-detect-the-search-engine-and-search-words
