google-crawlers

Small preview when sharing a link on social media (Ruby on Rails)

Submitted by 本小妞迷上赌 on 2019-12-22 10:32:12
Question: I'm working on a site whose front end is in AngularJS and back end in Ruby on Rails; the same Rails API is also used in an Android app. I need to share the site's posts on social media such as Facebook, Twitter, and Google Plus, and along with the link to a single post there should be a small preview (the preview the platform crawls before posting, e.g. on Facebook). I did this using Angular plugins, but on the Android side, what they share and what displays on
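Social platforms build that preview by fetching the shared URL and reading Open Graph meta tags from the page head, so the tags must be present in the server-rendered HTML the crawler sees (not injected client-side by Angular). A minimal sketch of the extraction a crawler performs; the tag values here are illustrative, not from the original post:

```python
from html.parser import HTMLParser

class OGTagParser(HTMLParser):
    """Collects Open Graph <meta property="og:*" content="..."> tags."""
    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        d = dict(attrs)
        prop = d.get("property", "")
        if prop.startswith("og:") and "content" in d:
            self.og[prop] = d["content"]

# Hypothetical head of a single-post page; real values would come from the post record.
html = """
<head>
  <meta property="og:title" content="My Post Title">
  <meta property="og:description" content="A short summary shown in the preview">
  <meta property="og:image" content="https://example.com/post-thumb.jpg">
</head>
"""
parser = OGTagParser()
parser.feed(html)
print(parser.og["og:title"])  # → My Post Title
```

If the Rails side emits these tags for each post URL, both the web share and the Android share get the same preview, since the platforms fetch the URL themselves.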

Adding a hash prefix at the config phase if it's missing

Submitted by 泄露秘密 on 2019-12-22 08:07:44
Question: I am now integrating Phantom into my AngularJS-based web application. This fine article says that I should call the $locationProvider.hashPrefix() method to set the prefix to '!' for SEO reasons (to allow crawlers to intercept the _escaped_fragment_ component of the URL). The problem is that I hadn't thought of this earlier, and some of my URLs look like this: #/home . I thought perhaps there is a way to insert this '!' character at the beginning of the URL programmatically (in case it is not
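The rewrite itself is a one-line string fix: turn the first `#/` into `#!/` only when the bang is missing. The logic, sketched here in Python independent of Angular (function name is mine):

```python
def add_hashbang(url: str) -> str:
    """Rewrite a plain '#/'-style fragment to the '#!/' form, leaving
    URLs that already carry the '!' prefix (or have no fragment) alone."""
    if "#!/" in url or "#/" not in url:
        return url  # already prefixed, or nothing to fix
    return url.replace("#/", "#!/", 1)

print(add_hashbang("https://example.com/#/home"))  # → https://example.com/#!/home
```

In the Angular app itself the equivalent fix would live in a config-phase check on `$location`, but the string transformation is the same.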

Any possibility to crawl open web browser data using Aperture?

Submitted by 佐手、 on 2019-12-19 12:25:08
Question: I know how to crawl a website using Aperture. If I open http://demo.crawljax.com/ in the Mozilla web browser, how can I crawl the open browser's content using Aperture? Steps: 1. Open http://demo.crawljax.com/ in Mozilla Firefox. 2. Execute a Java program to crawl the open Firefox tab. Answer 1: It seems you need to crawl a JavaScript/Ajax page, so you actually need a crawler like Googlebot. See this: Googlebot can crawl JavaScript pages. You can also do it using some other drivers/crawlers. Here similar

What is the shebang/hashbang for?

Submitted by China☆狼群 on 2019-12-19 10:18:01
Question: Is there any other use for shebangs/hashbangs besides making AJAX content crawlable for Google? Or is that it? Answer 1: The hash in a URL has existed since long before Ajax was invented. It was originally intended as a reference to a sub-section within a page. In this context you would, for example, have a table of contents at the top of a page, each entry of which would be a hash link to a section of the same page. When you click one of these links, the page scrolls down (or up) to the
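For the crawlability use specifically: under Google's (now long-deprecated) AJAX crawling scheme, a crawler seeing `#!` would re-request the page with the fragment moved into an `_escaped_fragment_` query parameter, which the server could answer with a pre-rendered snapshot. A sketch of that mapping (function name is mine):

```python
from urllib.parse import quote

def to_escaped_fragment(url: str) -> str:
    """Map a '#!' URL to the '_escaped_fragment_' form a crawler would
    request under Google's deprecated AJAX crawling scheme."""
    base, sep, frag = url.partition("#!")
    if not sep:
        return url  # no hashbang, nothing to map
    joiner = "&" if "?" in base else "?"
    return base + joiner + "_escaped_fragment_=" + quote(frag, safe="")

print(to_escaped_fragment("https://example.com/#!/home"))
# → https://example.com/?_escaped_fragment_=%2Fhome
```

This only ever applied to the `#!` convention; a plain `#section` anchor was never rewritten, which is why the bang mattered.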

How to tell if a web request is coming from Google's crawler?

Submitted by 断了今生、忘了曾经 on 2019-12-18 08:32:03
Question: From the HTTP server's perspective. Answer 1: I have captured a Google crawler request in my ASP.NET application, and here is what its signature looks like: Requesting IP: 66.249.71.113 Client: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) My logs show many different IPs for the Google crawler in the 66.249.71.* range, all geolocated in Mountain View, CA, USA. A nice solution to check whether the request is coming from the Google crawler would be to verify
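The answer cuts off at "verify", but the check Google documents is a reverse-then-forward DNS lookup rather than trusting an IP range or the user-agent string (both are spoofable). A sketch (function names are mine; the DNS steps require network access, the hostname check itself is pure):

```python
import socket

GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def looks_like_googlebot_host(hostname: str) -> bool:
    """True if a reverse-DNS name falls under Google's crawler domains."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip: str) -> bool:
    """Reverse-resolve the IP, check the domain, then forward-resolve the
    hostname and confirm it maps back to the same IP (needs network)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False
    if not looks_like_googlebot_host(hostname):
        return False
    try:
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False

print(looks_like_googlebot_host("crawl-66-249-71-113.googlebot.com"))  # → True
```

The forward-confirmation step matters: anyone controlling reverse DNS for their own IP block can make it resolve to a googlebot.com name, but they cannot make Google's forward DNS point back at their IP.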

Avoid crawling part of a page with “googleoff” and “googleon”

Submitted by 我们两清 on 2019-12-18 05:43:42
Question: I am trying to tell Google and other search engines not to crawl some parts of my web page. What I do is: <!--googleoff: all--> <select name="ddlCountry" id="ddlCountry"> <option value="All">All</option> <option value="bahrain">Bahrain</option> <option value="china">China</option> </select> <!--googleon: all--> After I uploaded the page, I noticed that search engines are still indexing elements within the googleoff markup. Am I doing something wrong? Answer 1: "googleon" and "googleoff" are
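The answer is truncated, but the likely point is that googleon/googleoff are directives for the Google Search Appliance, not for the public Googlebot, which is why the elements still get indexed in web search. To see what a tag-aware indexer would keep from a page, here is a sketch (regex and function name are mine):

```python
import re

# Matches everything from a "googleoff: all" comment to the next "googleon: all".
GOOGLEOFF = re.compile(
    r"<!--\s*googleoff:\s*all\s*-->.*?<!--\s*googleon:\s*all\s*-->",
    re.DOTALL,
)

def strip_googleoff(html: str) -> str:
    """Remove content between googleoff/googleon comments, mimicking how
    an indexer that honors the tags would treat the page."""
    return GOOGLEOFF.sub("", html)

page = ('<p>keep</p>'
        '<!--googleoff: all--><select>...</select><!--googleon: all-->'
        '<p>also keep</p>')
print(strip_googleoff(page))  # → <p>keep</p><p>also keep</p>
```

For public search engines, keeping content out of the index generally means serving it behind robots.txt rules, `noindex`, or not serving it in the HTML at all.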

Display an article rating in Google search results

Submitted by [亡魂溺海] on 2019-12-18 03:10:31
Question: I'm writing a review site where the community rates posts. I have noticed that Google can pick up these ratings and display them in its search results. Does anyone know how this is achieved? An example is a review site like IGN, where in the screenshot below they indicate their review has a rating of 9.3/10. How can I indicate my own review rating to Google? Maybe some sort of custom meta tag or something. Answer 1: You can do that with a span class. Check Google's Structured Data guide
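The span-based approach the answer mentions is microdata markup; Google's structured-data guide also accepts a JSON-LD block embedded in the page, which is often easier to generate server-side. A sketch using schema.org's Review vocabulary, with illustrative values (the item and author names are hypothetical):

```python
import json

# A Review snippet in schema.org vocabulary; Google reads this from a
# <script type="application/ld+json"> block in the page.
review = {
    "@context": "https://schema.org",
    "@type": "Review",
    "itemReviewed": {"@type": "Product", "name": "Example Game"},
    "reviewRating": {"@type": "Rating", "ratingValue": "9.3", "bestRating": "10"},
    "author": {"@type": "Person", "name": "Example Reviewer"},
}

snippet = json.dumps(review, indent=2)
print(snippet)
```

Whether the rating actually shows in results is at Google's discretion; the markup only makes it eligible.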

Should I list PDFs in my sitemap file?

Submitted by 做~自己de王妃 on 2019-12-12 10:29:39
Question: Should I add PDFs to my XML sitemap? I want to know whether Google will crawl the PDFs. Answer 1: Yes, Google crawls PDFs. See the Search Console Help article for the list of indexed file types. Answer 2: Absolutely! If you have "separate pages" that make up the whole of your website, you would be better off including those pages in your XML sitemap. Remember that the purpose of the XML sitemap is to help search engines understand what content is available on your website, PDFs included! Don't forget to
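A PDF is listed in the sitemap exactly like an HTML page: one `<url>`/`<loc>` entry under the sitemaps.org namespace. A minimal generator sketch (URLs are hypothetical):

```python
from xml.etree.ElementTree import Element, SubElement, tostring

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Build a minimal sitemap.xml string; PDF URLs need no special markup."""
    urlset = Element("urlset", xmlns=NS)
    for loc in urls:
        url = SubElement(urlset, "url")
        SubElement(url, "loc").text = loc
    return tostring(urlset, encoding="unicode")

# Hypothetical site mixing HTML pages and a PDF.
xml = build_sitemap([
    "https://example.com/",
    "https://example.com/whitepaper.pdf",
])
print(xml)
```

Since crawlers discover the PDF from its `<loc>` entry like any other URL, the only real requirement is that the PDF itself is publicly fetchable.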