Match pattern for all Google search pages

倖福魔咒の posted on 2019-11-28 09:18:55

Question


I'm developing an extension which will perform a certain action on all Google search URLs - but not on other websites or Google pages. In natural language the match pattern is:

  • Any protocol ('*://')
  • Any subdomain or none ('www' or '')
  • The domain string must equal 'google'
  • Any TLD including three-letter TLDs (e.g. '.com') and multi-part country TLDs (e.g. '.co.uk')
  • The first 8 characters after the host must equal '/search?'

Many people say 'to match all google search pages use "*://*.google.com/search?*"', but this is patently untrue as it will not match national TLDs like google.co.uk.

Thus the following code does not work at all:

chrome.webRequest.onBeforeRequest.addListener(
  function(details) {
    alert('This never happens');
  }, {
    urls: [
        "*://*.google.*/search?*",
        "*://google.*/search?*",
    ],
    types: ["main_frame"]
  },
  ["blocking"]
);

Using "*://*.google.com/search?*" as the match pattern does work, but I fear I would need a list of every single Google localisation for that to be an effective strategy.


Answer 1:


Unfortunately, match patterns do not allow wildcards for TLDs for security reasons.

You cannot use wildcard match patterns like http://google.*/* to match TLDs (like http://google.es and http://google.fr) due to the complexity of actually restricting such a match to only the desired domains.

For the example of http://google.*/*, the Google domains would be matched, but so would http://google.someotherdomain.com. Additionally, many sites do not own all of the TLDs for their domain. For an example, assume you want to use http://example.*/* to match http://example.com and http://example.es, but http://example.net is a hostile site. If your extension has a bug, the hostile site could potentially attack your extension in order to get access to your extension's increased privileges.

You should explicitly enumerate the TLDs that you wish to run your extension on.

A slightly unrealistic option would be to list all variants with all national TLDs.

Edit: thanks to an incredibly helpful comment by rsanchez, here's an up-to-date list of all Google domain variants, which makes this approach viable.

A realistic option is to inject into a larger set of pages (for instance, all pages), then analyze the URL (with a regexp, for example) and only execute if it matches the pattern you are looking for. Yes, it will be a scarier permissions warning, and you will have to explain it to your users.
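To make the "inject broadly, then check the URL" idea concrete, here is a minimal sketch. The helper name isGoogleSearchUrl and the regular expression are my own assumptions, not something from the Chrome docs. The host pattern accepts optional subdomains, then "google", then either a plain TLD or "co"/"com" plus a two-letter country code. It is still permissive (any two- or three-letter TLD passes), so an explicit allowlist of domains remains the stricter choice.

```javascript
// Hypothetical helper, not part of any Chrome API.
// Assumed host shape: optional subdomains, then "google", then either a
// plain TLD (com, de, cat) or "co"/"com" plus a two-letter country code.
const GOOGLE_HOST = /^(?:[a-z0-9-]+\.)*google\.(?:[a-z]{2,3}|com?\.[a-z]{2})$/i;

function isGoogleSearchUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch (e) {
    return false; // not a parseable absolute URL
  }
  const schemeOk = url.protocol === "http:" || url.protocol === "https:";
  // The query string is not part of pathname, so "/search?q=x" has
  // pathname "/search".
  return schemeOk && GOOGLE_HOST.test(url.hostname) && url.pathname === "/search";
}

// In a content script injected into all pages, the check would gate the work:
// if (isGoogleSearchUrl(location.href)) { /* run the extension's action */ }
```

Parsing with URL first and testing only the hostname avoids the google.someotherdomain.com problem described above, because the pattern is anchored at both ends of the host rather than matched against the whole URL string.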




Answer 2:


Source: https://stackoverflow.com/a/16187588/6250024

I was wondering the same and found the same question with a better solution, which introduces the "include_globs" parameter.

"matches":        ["http://*/*", "https://*/*"],
"include_globs":  ["http://www.google.*/*", "https://www.google.*/*"],

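Two caveats worth spelling out. First, include_globs is a content_scripts key in the manifest, so it helps when injecting a content script but not with the webRequest filter from the question. Second, a glob is looser than a match pattern: "*" matches any run of characters, including dots and slashes. The sketch below approximates that behaviour with a hypothetical globToRegExp helper (my own illustration, not Chrome's implementation) to show that the glob above also accepts a hostile host.

```javascript
// Rough approximation of glob matching: "*" matches any run of characters,
// "?" matches exactly one. Illustrative helper only, not Chrome's code.
function globToRegExp(glob) {
  // Escape regex metacharacters except "*" and "?", which are translated next.
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + pattern + "$");
}

const googleGlob = globToRegExp("https://www.google.*/*");

console.log(googleGlob.test("https://www.google.co.uk/search?q=x")); // true (intended)
console.log(googleGlob.test("https://www.google.evil.example/"));    // true (not intended)
```

So if you go this route, it is probably still worth validating location.hostname inside the content script before doing anything privileged.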


Answer 3:


You can use match-pattern arrays of arbitrary length (though it slows down the browser when using more than 1000 or so). For your convenience, here is an updated list:

  "matches": [
    "*://*.google.com/*",
    "*://*.google.ad/*",
    "*://*.google.ae/*",
    "*://*.google.com.af/*",
    "*://*.google.com.ag/*",
    "*://*.google.com.ai/*",
    "*://*.google.al/*",
    "*://*.google.am/*",
    "*://*.google.co.ao/*",
    "*://*.google.com.ar/*",
    "*://*.google.as/*",
    "*://*.google.at/*",
    "*://*.google.com.au/*",
    "*://*.google.az/*",
    "*://*.google.ba/*",
    "*://*.google.com.bd/*",
    "*://*.google.be/*",
    "*://*.google.bf/*",
    "*://*.google.bg/*",
    "*://*.google.com.bh/*",
    "*://*.google.bi/*",
    "*://*.google.bj/*",
    "*://*.google.com.bn/*",
    "*://*.google.com.bo/*",
    "*://*.google.com.br/*",
    "*://*.google.bs/*",
    "*://*.google.bt/*",
    "*://*.google.co.bw/*",
    "*://*.google.by/*",
    "*://*.google.com.bz/*",
    "*://*.google.ca/*",
    "*://*.google.cd/*",
    "*://*.google.cf/*",
    "*://*.google.cg/*",
    "*://*.google.ch/*",
    "*://*.google.ci/*",
    "*://*.google.co.ck/*",
    "*://*.google.cl/*",
    "*://*.google.cm/*",
    "*://*.google.cn/*",
    "*://*.google.com.co/*",
    "*://*.google.co.cr/*",
    "*://*.google.com.cu/*",
    "*://*.google.cv/*",
    "*://*.google.com.cy/*",
    "*://*.google.cz/*",
    "*://*.google.de/*",
    "*://*.google.dj/*",
    "*://*.google.dk/*",
    "*://*.google.dm/*",
    "*://*.google.com.do/*",
    "*://*.google.dz/*",
    "*://*.google.com.ec/*",
    "*://*.google.ee/*",
    "*://*.google.com.eg/*",
    "*://*.google.es/*",
    "*://*.google.com.et/*",
    "*://*.google.fi/*",
    "*://*.google.com.fj/*",
    "*://*.google.fm/*",
    "*://*.google.fr/*",
    "*://*.google.ga/*",
    "*://*.google.ge/*",
    "*://*.google.gg/*",
    "*://*.google.com.gh/*",
    "*://*.google.com.gi/*",
    "*://*.google.gl/*",
    "*://*.google.gm/*",
    "*://*.google.gp/*",
    "*://*.google.gr/*",
    "*://*.google.com.gt/*",
    "*://*.google.gy/*",
    "*://*.google.com.hk/*",
    "*://*.google.hn/*",
    "*://*.google.hr/*",
    "*://*.google.ht/*",
    "*://*.google.hu/*",
    "*://*.google.co.id/*",
    "*://*.google.ie/*",
    "*://*.google.co.il/*",
    "*://*.google.im/*",
    "*://*.google.co.in/*",
    "*://*.google.iq/*",
    "*://*.google.is/*",
    "*://*.google.it/*",
    "*://*.google.je/*",
    "*://*.google.com.jm/*",
    "*://*.google.jo/*",
    "*://*.google.co.jp/*",
    "*://*.google.co.ke/*",
    "*://*.google.com.kh/*",
    "*://*.google.ki/*",
    "*://*.google.kg/*",
    "*://*.google.co.kr/*",
    "*://*.google.com.kw/*",
    "*://*.google.kz/*",
    "*://*.google.la/*",
    "*://*.google.com.lb/*",
    "*://*.google.li/*",
    "*://*.google.lk/*",
    "*://*.google.co.ls/*",
    "*://*.google.lt/*",
    "*://*.google.lu/*",
    "*://*.google.lv/*",
    "*://*.google.com.ly/*",
    "*://*.google.co.ma/*",
    "*://*.google.md/*",
    "*://*.google.me/*",
    "*://*.google.mg/*",
    "*://*.google.mk/*",
    "*://*.google.ml/*",
    "*://*.google.com.mm/*",
    "*://*.google.mn/*",
    "*://*.google.ms/*",
    "*://*.google.com.mt/*",
    "*://*.google.mu/*",
    "*://*.google.mv/*",
    "*://*.google.mw/*",
    "*://*.google.com.mx/*",
    "*://*.google.com.my/*",
    "*://*.google.co.mz/*",
    "*://*.google.com.na/*",
    "*://*.google.com.nf/*",
    "*://*.google.com.ng/*",
    "*://*.google.com.ni/*",
    "*://*.google.ne/*",
    "*://*.google.nl/*",
    "*://*.google.no/*",
    "*://*.google.com.np/*",
    "*://*.google.nr/*",
    "*://*.google.nu/*",
    "*://*.google.co.nz/*",
    "*://*.google.com.om/*",
    "*://*.google.com.pa/*",
    "*://*.google.com.pe/*",
    "*://*.google.com.pg/*",
    "*://*.google.com.ph/*",
    "*://*.google.com.pk/*",
    "*://*.google.pl/*",
    "*://*.google.pn/*",
    "*://*.google.com.pr/*",
    "*://*.google.ps/*",
    "*://*.google.pt/*",
    "*://*.google.com.py/*",
    "*://*.google.com.qa/*",
    "*://*.google.ro/*",
    "*://*.google.ru/*",
    "*://*.google.rw/*",
    "*://*.google.com.sa/*",
    "*://*.google.com.sb/*",
    "*://*.google.sc/*",
    "*://*.google.se/*",
    "*://*.google.com.sg/*",
    "*://*.google.sh/*",
    "*://*.google.si/*",
    "*://*.google.sk/*",
    "*://*.google.com.sl/*",
    "*://*.google.sn/*",
    "*://*.google.so/*",
    "*://*.google.sm/*",
    "*://*.google.sr/*",
    "*://*.google.st/*",
    "*://*.google.com.sv/*",
    "*://*.google.td/*",
    "*://*.google.tg/*",
    "*://*.google.co.th/*",
    "*://*.google.com.tj/*",
    "*://*.google.tk/*",
    "*://*.google.tl/*",
    "*://*.google.tm/*",
    "*://*.google.tn/*",
    "*://*.google.to/*",
    "*://*.google.com.tr/*",
    "*://*.google.tt/*",
    "*://*.google.com.tw/*",
    "*://*.google.co.tz/*",
    "*://*.google.com.ua/*",
    "*://*.google.co.ug/*",
    "*://*.google.co.uk/*",
    "*://*.google.com.uy/*",
    "*://*.google.co.uz/*",
    "*://*.google.com.vc/*",
    "*://*.google.co.ve/*",
    "*://*.google.vg/*",
    "*://*.google.co.vi/*",
    "*://*.google.com.vn/*",
    "*://*.google.vu/*",
    "*://*.google.ws/*",
    "*://*.google.rs/*",
    "*://*.google.co.za/*",
    "*://*.google.co.zm/*",
    "*://*.google.co.zw/*",
    "*://*.google.cat/*"
  ],

To recreate, you can use the command

curl https://www.google.com/supported_domains | sed 's!\(.*\)!"*://*\1/*",!g'
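
If you would rather build the array in JavaScript (for example in a build script), the same transformation can be sketched as below. buildMatchPatterns is a hypothetical helper of mine, and it assumes the list has one entry per line, each starting with a dot (e.g. ".google.co.uk"), which is what the sed command above relies on. Fetching the file is left out so the snippet stays self-contained.

```javascript
// Hypothetical helper: turns the text of the supported_domains list
// (assumed format: one entry per line, each starting with a dot)
// into match patterns. The optional path argument is for illustration.
function buildMatchPatterns(domainListText, path = "/*") {
  return domainListText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.startsWith(".")) // skip blank or unexpected lines
    .map((suffix) => `*://*${suffix}${path}`);
}

const sample = ".google.com\n.google.co.uk\n.google.de\n";

console.log(buildMatchPatterns(sample));
// [ '*://*.google.com/*', '*://*.google.co.uk/*', '*://*.google.de/*' ]

console.log(buildMatchPatterns(sample, "/search?*"));
// [ '*://*.google.com/search?*', '*://*.google.co.uk/search?*', '*://*.google.de/search?*' ]
```

The second call produces patterns in the shape the question uses for the urls filter of chrome.webRequest.onBeforeRequest, restricted to search pages.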


Source: https://stackoverflow.com/questions/23747781/match-pattern-for-all-google-search-pages
