How does Google Custom Search search the web, and why does it favor .edu domains so much?

Submitted by 回眸只為那壹抹淺笑 on 2021-02-10 18:32:44

Question


I'm trying to use the Google Custom Search API to search for a keyword, but the links in the returned JSON that supposedly "match" my keyword are completely irrelevant to what I searched for. Whatever I search for, about 80% of the returned domains end in .edu, even when my keyword is, for example, gum guard.

I don't mind .edu domains, but I expected the API to return the same top results I get when I type gum guard into Google in Chrome. Searching for gum guard in a browser returns several relevant websites (Amazon, etc.). The JSON returned by the API contains neither Amazon nor anything else from the browser's first page of results, so the API clearly does not return what a plain Google search in the browser does.

Do I have to tell the API to return what the browser returns? If that isn't possible, what other API could I use to achieve what I'm looking for?

Here is the irrelevant JSON response from Google:

{
 "kind": "customsearch#search",
 "url": {
  "type": "application/json",
  "template": "https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&cref={cref?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json"
 },
 "queries": {
  "nextPage": [
   {
    "title": "Google Custom Search - gum guards",
    "totalResults": "2710000",
    "searchTerms": "gum guards",
    "count": 10,
    "startIndex": 11,
    "inputEncoding": "utf8",
    "outputEncoding": "utf8",
    "safe": "off",
    "cx": "017576662512468239146:omuauf_lfve"
   }
  ],
  "request": [
   {
    "title": "Google Custom Search - gum guards",
    "totalResults": "2710000",
    "searchTerms": "gum guards",
    "count": 10,
    "startIndex": 1,
    "inputEncoding": "utf8",
    "outputEncoding": "utf8",
    "safe": "off",
    "cx": "017576662512468239146:omuauf_lfve"
   }
  ]
 },
 "context": {
  "title": "CS Curriculum",
  "facets": [
   [
    {
     "label": "lectures",
     "anchor": "Lectures",
     "label_with_op": "more:lectures"
    }
   ],
   [
    {
     "label": "assignments",
     "anchor": "Assignments",
     "label_with_op": "more:assignments"
    }
   ],
   [
    {
     "label": "reference",
     "anchor": "Reference",
     "label_with_op": "more:reference"
    }
   ]
  ]
 },
 "searchInformation": {
  "searchTime": 0.406893,
  "formattedSearchTime": "0.41",
  "totalResults": "2710000",
  "formattedTotalResults": "2,710,000"
 },
 "items": [
  {
   "kind": "customsearch#result",
   "title": "Decomposing an integer as sum of two squares",
   "htmlTitle": "Decomposing an integer as sum of two squares",
   "link": "https://www.cs.utexas.edu/users/EWD/ewd10xx/EWD1032.PDF",
   "displayLink": "www.cs.utexas.edu",
   "snippet": "DacompOSingL on 'ln}€q¢f' as gum OP 'I-wo squares. I \\J v. I (I had no\\' I'flcmned \n+0 wr'H-e .... which Sujjes} “we guard xg and “HG Fifi-her arm} so 'mhreshn _ ...",
   "htmlSnippet": "DacompOSingL on 'ln}€q¢f' as \u003cb\u003egum\u003c/b\u003e OP 'I-wo squares. I \\J v. I (I had no\\' I'flcmned \u003cbr\u003e\n+0 wr'H-e .... which Sujjes} “we \u003cb\u003eguard\u003c/b\u003e xg and “HG Fifi-her arm} so 'mhreshn _ ...",
   "cacheId": "6kLsXUvB8OcJ",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "https://www.cs.utexas.edu/users/EWD/ewd10xx/EWD1032.PDF",
   "htmlFormattedUrl": "https://www.cs.utexas.edu/users/EWD/ewd10xx/EWD1032.PDF",
   "pagemap": {
    "metatags": [
     {
      "moddate": "Fri Mar  3 06:46:07 2000",
      "creator": "Adobe PageMaker 6.52",
      "author": "Administrator",
      "subject": "ewd1032",
      "producer": "Acrobat Distiller 4.0 for Windows",
      "creationdate": "Fri Mar  3 12:46:06 2000"
     }
    ]
   }
  },
  {
   "kind": "customsearch#result",
   "title": "Full Course Reader - Stanford University",
   "htmlTitle": "Full Course Reader - Stanford University",
   "link": "http://www.stanford.edu/class/cs106l/course-reader/full_course_reader.pdf",
   "displayLink": "www.stanford.edu",
   "snippet": "There are many ways to write include guards, but one ...... map, an atom, a gum, \na kit, a baleen, a gala, a ten, a don, a mural, a pan, a faun, a ducat, a pagoda ...",
   "htmlSnippet": "There are many ways to write include \u003cb\u003eguards\u003c/b\u003e, but one ...... map, an atom, a \u003cb\u003egum\u003c/b\u003e, \u003cbr\u003e\na kit, a baleen, a gala, a ten, a don, a mural, a pan, a faun, a ducat, a pagoda ...",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "www.stanford.edu/class/cs106l/course-reader/full_course_reader.pdf",
   "htmlFormattedUrl": "www.stanford.edu/class/cs106l/course-reader/full_course_reader.pdf",
   "pagemap": {
    "cse_image": [
     {
      "src": "x-raw-image:///9d911bb58ab6b4c5a65ca944f233ed3f9a2190dfa2b61f975ad68a713143a787"
     }
    ],
    "cse_thumbnail": [
     {
      "width": "256",
      "height": "197",
      "src": "https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcRciEJyvz3Cd2n9Pp3I0lhNxjVTKvKHuiT2npoRD9MXl2mvxCL9m2ZoSi4"
     }
    ],
    "metatags": [
     {
      "title": "CS106L Course Reader",
      "author": "Keith Schwarz",
      "creator": "Writer",
      "producer": "OpenOffice.org 3.4",
      "creationdate": "D:20130424224219-07'00'"
     }
    ]
   }
  },
  {
   "kind": "customsearch#result",
   "title": "The First World War Diary of Lupton Kaylor",
   "htmlTitle": "The First World War Diary of Lupton Kaylor",
   "link": "https://www.cs.utexas.edu/users/cline/LLK_Diary/LLK_Diary_The_War_with_Germanyold.pdf",
   "displayLink": "www.cs.utexas.edu",
   "snippet": "of work by not doing any fatigue1 or guard duty. The eats were much ...... \nReceived three gift[s] of goodies & tobacco, cigarettes & chewing gum from the “Y\n”,.",
   "htmlSnippet": "of work by not doing any fatigue1 or \u003cb\u003eguard\u003c/b\u003e duty. The eats were much ...... \u003cbr\u003e\nReceived three gift[s] of goodies & tobacco, cigarettes & chewing \u003cb\u003egum\u003c/b\u003e from the “Y\u003cbr\u003e\n”,.",
   "cacheId": "utxIYtTGiYcJ",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "https://www.cs.utexas.edu/.../LLK_Diary_The_War_with_Germanyold.pdf",
   "htmlFormattedUrl": "https://www.cs.utexas.edu/.../LLK_Diary_The_War_with_Germanyold.pdf",
   "pagemap": {
    "cse_image": [
     {
      "src": "x-raw-image:///53d8322fe2c384f82878a5fe26ac9abb9bd5537787e014be353f105168304940"
     }
    ],
    "cse_thumbnail": [
     {
      "width": "193",
      "height": "261",
      "src": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTR5evzfHcjXcQIyMwYFGMY8s30g02arz36cIXyn1ghRIu6QQ2NZ4-Umck"
     }
    ],
    "metatags": [
     {
      "creator": "diary.PUB - Microsoft Publisher",
      "creationdate": "D:20041208121209",
      "title": "C:\\Documents and Settings\\Alan Kaylor Cline\\My Documents\\Diary_LLK_The_War_with_Germany.pdf",
      "author": "Alan Kaylor Cline",
      "producer": "Acrobat PDFWriter 5.0 for Windows NT"
     }
    ]
   }
  },
  {
   "kind": "customsearch#result",
   "title": "Testing and Grading - EECS Instruction - University of California ...",
   "htmlTitle": "Testing and Grading - EECS Instruction - University of California \u003cb\u003e...\u003c/b\u003e",
   "link": "https://www-inst.eecs.berkeley.edu/~cs375/sp14/book/Tools_for_Teaching_2nd_Edition_PART_VIII_Testing_And_Grading.pdf",
   "displayLink": "www-inst.eecs.berkeley.edu",
   "snippet": "coded M & Ms for signaling answers and the use of a gum wrapper as a crib \nsheet. YouTube ...... dents ' names to increase objectivity and guard against bias.",
   "htmlSnippet": "coded M & Ms for signaling answers and the use of a \u003cb\u003egum\u003c/b\u003e wrapper as a crib \u003cbr\u003e\nsheet. YouTube ...... dents ' names to increase objectivity and \u003cb\u003eguard\u003c/b\u003e against bias.",
   "cacheId": "DbT9nrjcCOsJ",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "https://www-inst.eecs.berkeley.edu/.../Tools_for_Teaching_2nd_Edition_ PART_VIII_Testing_And_Grading.pdf",
   "htmlFormattedUrl": "https://www-inst.eecs.berkeley.edu/.../Tools_for_Teaching_2nd_Edition_ PART_VIII_Testing_And_Grading.pdf",
   "pagemap": {
    "metatags": [
     {
      "creationdate": "D:20140212112830",
      "creator": "Google",
      "producer": "Google"
     }
    ]
   }
  },
  {
   "kind": "customsearch#result",
   "title": "When such claims and litigation extend beyond the period , the ...",
   "htmlTitle": "When such claims and litigation extend beyond the period , the \u003cb\u003e...\u003c/b\u003e",
   "link": "http://cs.jhu.edu/~jason/465/hw-ofst/tagger/datahelp/entrain.withSA.txt",
   "displayLink": "cs.jhu.edu",
   "snippet": "A spot honoring Bill White , the inventor of chewing gum , shows a woman trying \n..... its pharmaceuticals subsidiary agreed to supply collagen corneal shields for ...",
   "htmlSnippet": "A spot honoring Bill White , the inventor of chewing \u003cb\u003egum\u003c/b\u003e , shows a woman trying \u003cbr\u003e\n..... its pharmaceuticals subsidiary agreed to supply collagen corneal \u003cb\u003eshields\u003c/b\u003e for ...",
   "cacheId": "FxEdWvhjFswJ",
   "mime": "text/plain",
   "formattedUrl": "cs.jhu.edu/~jason/465/hw-ofst/tagger/datahelp/entrain.withSA.txt",
   "htmlFormattedUrl": "cs.jhu.edu/~jason/465/hw-ofst/tagger/datahelp/entrain.withSA.txt"
  },
  {
   "kind": "customsearch#result",
   "title": "greimlin 671 grianghrafad6ir greimlin, m. (gs. ~, pL ~i). Med ...",
   "htmlTitle": "greimlin 671 grianghrafad6ir greimlin, m. (gs. ~, pL ~i). Med \u003cb\u003e...\u003c/b\u003e",
   "link": "https://www.cs.tcd.ie/disciplines/intelligent_systems/clg/clg_web/L/LexSystem/NewIrishDictionaryDataandParser/dict68.txt",
   "displayLink": "www.cs.tcd.ie",
   "snippet": "... m = GUAJRE? guardal(l), ~ach GUAIRDEAI~L. -ACH guard~n = GUA!RNEAN \n.... Arb: Gum. Crann ~,gum-tree. ~arabach, gum arabic. '--'peirce, gutta-percha.",
   "htmlSnippet": "... m = GUAJRE? guardal(l), ~ach GUAIRDEAI~L. -ACH \u003cb\u003eguard\u003c/b\u003e~n = GUA!RNEAN \u003cbr\u003e\n.... Arb: \u003cb\u003eGum\u003c/b\u003e. Crann ~,\u003cb\u003egum\u003c/b\u003e-tree. ~arabach, \u003cb\u003egum\u003c/b\u003e arabic. '--'peirce, gutta-percha.",
   "cacheId": "62bdHC_BpNUJ",
   "mime": "text/plain",
   "formattedUrl": "https://www.cs.tcd.ie/disciplines/intelligent_systems/.../dict68.txt",
   "htmlFormattedUrl": "https://www.cs.tcd.ie/disciplines/intelligent_systems/.../dict68.txt"
  },
  {
   "kind": "customsearch#result",
   "title": "a cappella,abbandono,accrescendo,affettuoso,agilmente,agitato ...",
   "htmlTitle": "a cappella,abbandono,accrescendo,affettuoso,agilmente,agitato \u003cb\u003e...\u003c/b\u003e",
   "link": "http://www.cse.ohio-state.edu/~wallacch/thesaurus",
   "displayLink": "www.cse.ohio-state.edu",
   "snippet": "... gestae,rose,sable,saltire,scutcheon,shield,spread eagle,step,stroke,stunt ...... ,\ngluten,glutenous,glutinose,glutinous,gooey,grumous,gum,gumbo,gumbolike ...",
   "htmlSnippet": "... gestae,rose,sable,saltire,scutcheon,\u003cb\u003eshield\u003c/b\u003e,spread eagle,step,stroke,stunt ...... ,\u003cbr\u003e\ngluten,glutenous,glutinose,glutinous,gooey,grumous,\u003cb\u003egum\u003c/b\u003e,gumbo,gumbolike ...",
   "mime": "text/plain",
   "formattedUrl": "www.cse.ohio-state.edu/~wallacch/thesaurus",
   "htmlFormattedUrl": "www.cse.ohio-state.edu/~wallacch/thesaurus"
  },
  {
   "kind": "customsearch#result",
   "title": "A Visual Modality for the Augmentation of Paper",
   "htmlTitle": "A Visual Modality for the Augmentation of Paper",
   "link": "http://www.acm.org/icmi/2001/PUI-2001/a2.pdf",
   "displayLink": "www.acm.org",
   "snippet": "name AG for “advanced guard.” The other feature structure ... odds of touching \nthe board (either by rubbing along the gum line or dabbing a point thereon) does\n ...",
   "htmlSnippet": "name AG for “advanced \u003cb\u003eguard\u003c/b\u003e.” The other feature structure ... odds of touching \u003cbr\u003e\nthe board (either by rubbing along the \u003cb\u003egum\u003c/b\u003e line or dabbing a point thereon) does\u003cbr\u003e\n ...",
   "cacheId": "N7tXMImzoFEJ",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "www.acm.org/icmi/2001/PUI-2001/a2.pdf",
   "htmlFormattedUrl": "www.acm.org/icmi/2001/PUI-2001/a2.pdf",
   "pagemap": {
    "cse_image": [
     {
      "src": "x-raw-image:///58b3bd5cddc8af4a254e5ff907bfcb937b19e788ad7b0d87dc939eeeba50e10d"
     }
    ],
    "cse_thumbnail": [
     {
      "width": "270",
      "height": "186",
      "src": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcQv_rPdjW43AST6Pifq0F8l9EO76tn3k24etRsD9pN1Cj0QfboWIOku5DA"
     }
    ],
    "metatags": [
     {
      "creationdate": "D:20011001165911",
      "producer": "Acrobat Distiller 4.0 for Windows",
      "creator": "Windows NT 4.0",
      "title": "A Visual Modality for the Augmentation of Paper",
      "moddate": "D:20011010203501-07'00'",
      "author": "David R. McGee  ,   Misha Pavel,  Adriana Adami, Guoping   Wang, and Philip R. Cohen"
     }
    ]
   }
  },
  {
   "kind": "customsearch#result",
   "title": "Contents Preface #i 1 Introduction 1 I PreliHinRries 7 2 Overview 9 ...",
   "htmlTitle": "Contents Preface #i 1 Introduction 1 I PreliHinRries 7 2 Overview 9 \u003cb\u003e...\u003c/b\u003e",
   "link": "https://www.cs.utexas.edu/users/moore/publications/acl2-books/acs/excerpts.pdf",
   "displayLink": "www.cs.utexas.edu",
   "snippet": "W e omit the guard below , which allows the use of e q l below. ( defu R ...... th is \nnaiv e a r gum e n t assum e s th a t the n e w i te m is a n o d e in .W h a t if i t.",
   "htmlSnippet": "W e omit the \u003cb\u003eguard\u003c/b\u003e below , which allows the use of e q l below. ( defu R ...... th is \u003cbr\u003e\nnaiv e a r \u003cb\u003egum\u003c/b\u003e e n t assum e s th a t the n e w i te m is a n o d e in .W h a t if i t.",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "https://www.cs.utexas.edu/users/moore/publications/acl2.../excerpts.pdf",
   "htmlFormattedUrl": "https://www.cs.utexas.edu/users/moore/publications/acl2.../excerpts.pdf",
   "pagemap": {
    "metatags": [
     {
      "producer": "Aladdin Ghostscript 6.01"
     }
    ]
   }
  },
  {
   "kind": "customsearch#result",
   "title": "end",
   "htmlTitle": "end",
   "link": "http://www.cs.columbia.edu/~sedwards/presentations/ccu2004.pdf",
   "displayLink": "www.cs.columbia.edu",
   "snippet": "Design a vending machine controller that dispenses gum once. ... dime have \nbeen inserted, and a single output, GUM, ...... to add guard variable or copy. ⇒.",
   "htmlSnippet": "Design a vending machine controller that dispenses \u003cb\u003egum\u003c/b\u003e once. ... dime have \u003cbr\u003e\nbeen inserted, and a single output, \u003cb\u003eGUM\u003c/b\u003e, ...... to add \u003cb\u003eguard\u003c/b\u003e variable or copy. ⇒.",
   "cacheId": "7oFO304KMz0J",
   "mime": "application/pdf",
   "fileFormat": "PDF/Adobe Acrobat",
   "formattedUrl": "www.cs.columbia.edu/~sedwards/presentations/ccu2004.pdf",
   "htmlFormattedUrl": "www.cs.columbia.edu/~sedwards/presentations/ccu2004.pdf",
   "pagemap": {
    "cse_image": [
     {
      "src": "x-raw-image:///54b64bebe16665867bf6c2965b03fa7116ef6b0137317ac8e3e64e314a0b600a"
     }
    ],
    "cse_thumbnail": [
     {
      "width": "211",
      "height": "239",
      "src": "https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcSFwWtVT_MvYTqyJXpdjBC3M7oVVJXdHmyT9VE0WcaKqfC0wlnwuYLSB7Pj"
     }
    ],
    "metatags": [
     {
      "producer": "GNU Ghostscript 7.05"
     }
    ]
   }
  }
 ]
}

As you can see, according to this API, gum guards have something to do with sums of squares.
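One way to see where these results come from is to look at `context.title` and tally the hosts of the returned links. In the response above, `context.title` is "CS Curriculum" and the facets are lectures, assignments, and reference, which suggests that the `cx` ID belongs to an engine scoped to computer-science course sites rather than the whole web. Below is a minimal sketch, assuming the response has already been parsed into a Python dict; `summarize_results` is a hypothetical helper, and `sample` is a trimmed-down copy of three of the ten items above.

```python
from collections import Counter
from urllib.parse import urlparse


def summarize_results(response: dict) -> dict:
    """Report the engine title and how many results fall under each TLD."""
    items = response.get("items", [])
    # Extract the host of each result link, e.g. "www.cs.utexas.edu".
    hosts = [urlparse(item["link"]).hostname or "" for item in items]
    # Keep only the last label of each host, e.g. "edu", "org", "ie".
    tlds = Counter(host.rsplit(".", 1)[-1] for host in hosts if host)
    return {
        "engine_title": response.get("context", {}).get("title"),
        "tld_counts": dict(tlds),
    }


# Trimmed-down sample shaped like the response above (hypothetical subset).
sample = {
    "context": {"title": "CS Curriculum"},
    "items": [
        {"link": "https://www.cs.utexas.edu/users/EWD/ewd10xx/EWD1032.PDF"},
        {"link": "http://www.acm.org/icmi/2001/PUI-2001/a2.pdf"},
        {"link": "https://www.cs.tcd.ie/disciplines/intelligent_systems/clg/clg_web/L/LexSystem/NewIrishDictionaryDataandParser/dict68.txt"},
    ],
}

summary = summarize_results(sample)
print(summary["engine_title"])
print(summary["tld_counts"])
```

If the engine title names a topic-specific engine and one kind of domain dominates the tally, the odd results most likely come from the engine's configured site list, not from the query itself.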


Answer 1:


From the Google Custom Search help:

Search the entire web.

This article applies only to free basic custom search engines. You can't set Google Site Search to search the entire web.

If you have a basic custom search engine, you can set it to search the entire web. Note that results may not match the results you'd get by searching on Google Web Search. If you do set your search engine to search the entire web, you won't be able to use on-demand indexing.

Convert a search engine to search the entire web:

  1. On the Custom Search home page, click the search engine you want.
  2. Click Setup, and then click the Basics tab.
  3. Select Search the entire web but emphasize included sites.
  4. In the Sites to search section, delete the site you entered during the initial setup process.
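After changing the engine settings, the same query can be sent again with the engine's own `cx` ID. The following is a minimal sketch of building the request URL, assuming the endpoint and the `q`, `cx`, `start`, and `num` parameters shown in the `url.template` field of the response in the question. The `key` parameter (the API key) does not appear in that template and is an assumption here, as are the placeholder values `YOUR_ENGINE_ID` and `YOUR_API_KEY` and the helper name `build_search_url`.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Endpoint taken from the "url.template" field of the response in the question.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query: str, cx: str, api_key: str,
                     start: int = 1, num: int = 10) -> str:
    """Build a Custom Search request URL for one page of results."""
    params = {
        "q": query,      # search terms
        "cx": cx,        # ID of the custom search engine to query
        "key": api_key,  # API key (assumed; not part of the template above)
        "start": start,  # 1-based index of the first result on the page
        "num": num,      # number of results per page
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Placeholder credentials; replace with real values before sending a request.
url = build_search_url("gum guards", cx="YOUR_ENGINE_ID", api_key="YOUR_API_KEY")
print(url)

# Second page: the "nextPage" entry in the response reports startIndex 11.
next_url = build_search_url("gum guards", cx="YOUR_ENGINE_ID",
                            api_key="YOUR_API_KEY", start=11)
print(next_url)
```

The sketch only builds the URL and sends nothing. Even with the whole web enabled, the quoted article notes that results may not match Google Web Search.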


Source: https://stackoverflow.com/questions/22732817/how-does-google-custom-search-search-the-web-and-why-does-it-like-so-much-domai
