search-engine | 易学教程

Do search engines process Javascript?

Question: According to this page, it would seem that they don't, in the sense that they don't actually run it, but that page is two years old (judging from the copyright info). I'm asking because we use JavaScript to replace text on our site with more typographically sound content. We're worried that this may affect the crawlability and SEO of our sites, since what we're replacing is generally headers, i.e. <h1>, <h2>, etc. Will search engine bots see our original code, or

How to prevent search engines from indexing a single page of my website?

Question: I don't want search engines to index my imprint page. How can I do that?

Answer: You need a simple robots.txt file. It is a plain text file that asks search engine crawlers not to crawl particular pages. You don't reference it from the header of your page; as long as it sits in the root directory of your website, crawlers will pick it up. Create it in the root folder of your website and put the following text in it:

User-Agent: *
Disallow: /imprint-page.html

Note that you'd replace imprint-page.html in the example with the actual name of the page (or the directory) that you wish to keep from
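The effect of those two directives can be checked locally before deploying the file. The following is a minimal sketch using Python's standard urllib.robotparser module; the host www.example.com and the helper name is_allowed are assumptions made for illustration, and imprint-page.html is the placeholder file name from the example. It only shows how a rule-abiding crawler would read the file; it does not prove that a given search engine will drop the page from its index.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the answer; the file name is the example's placeholder.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /imprint-page.html
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # www.example.com is a hypothetical host used only for this check.
    for path in ("/imprint-page.html", "/index.html"):
        url = "http://www.example.com" + path
        print(path, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Disallow rules are matched as path prefixes, so a rule for a directory such as /private/ would cover every page below it.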

Search-Engine Friendly URLs

Question: I am building my first search-engine-friendly CMS. I know that perhaps one of the biggest keys to having an SEO-friendly site is to have search-engine-friendly URLs. So a link like this:

http://www.mysite.com/product/details/page1

will result in much better rankings than one like this:

http://www.mysite.com/index.php?pageID=37

I know that to create URLs like the first one, I have one of two options: (1) use a web technology, in this case PHP, to create a directory structure; (2) leverage

Google Custom Search: 403 error in iOS

Question: Google Custom Search is returning this 403 error from my iPhone 7.1 app. This is the response when run in the simulator:

{
  "error": {
    "errors": [
      {
        "domain": "usageLimits",
        "reason": "accessNotConfigured",
        "message": "Access Not Configured. Please use Google Developers Console to activate the API for your project."
      }
    ],
    "code": 403,
    "message": "Access Not Configured. Please use Google Developers Console to activate the API for your project."
  }
}

Is there a flaw in the steps below? I’d like to

SEO and 301 redirects - Can they have relative paths or must they be absolute?

Question: When doing a 301 redirect for a page, will bots/spiders treat a 301 that points to a relative path (redirect="../") the same as one that points to an absolute path (redirect="http://www.somewebsite.com/apage/")? For example, I have a parent page with content on it (http://www.somewebsite.com/apage/) and a subpage with further content (http://www.somewebsite.com/apage/more-details). I plan to move
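How a client turns a relative redirect target into a full URL can be sketched with Python's standard library. This is only an illustration under stated assumptions: it assumes the redirect value ends up as the target of the 301 and that the client resolves it against the URL it originally requested, which is how HTTP clients generally handle relative references. It says nothing about how a particular search engine weighs the two forms. The helper name resolve_redirect is hypothetical; the URLs are the ones from the question.

```python
from urllib.parse import urljoin


def resolve_redirect(requested_url, target):
    """Resolve a redirect target (relative or absolute) against the requested URL."""
    # urljoin leaves absolute targets untouched and resolves relative ones
    # against the directory of requested_url.
    return urljoin(requested_url, target)


if __name__ == "__main__":
    subpage = "http://www.somewebsite.com/apage/more-details"
    for target in ("../", "./", "http://www.somewebsite.com/apage/"):
        print(repr(target), "->", resolve_redirect(subpage, target))
```

Under this resolution rule, "../" requested from /apage/more-details (no trailing slash) resolves to the site root, while the same target requested from /apage/more-details/ resolves to /apage/. An absolute target avoids that dependence on the trailing slash.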

How does a website highlight search terms you used in the search engine?

Question: I've seen some websites highlight the search engine keywords you used to reach the page (such as the keywords you typed into a Google search). How does the site know what keywords you typed into the search engine? Does it examine the referrer HTTP header or something? Are there any scripts available that can do this? It might be server-side or JavaScript, I'm not sure.

Answer 1: This can be done either server-side or client-side. The search keywords are determined by looking at the HTTP Referer (sic)
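A server-side version of that idea might look roughly like the sketch below. It is a minimal illustration, not the answerer's script: it assumes the referring URL still carries the query in a q parameter (which may not hold for a given search engine today), it works on plain text rather than existing HTML, and the function names search_terms_from_referrer and highlight are hypothetical.

```python
import html
import re
from urllib.parse import parse_qs, urlparse


def search_terms_from_referrer(referrer, param="q"):
    """Extract individual search terms from the query string of a referring URL."""
    query = parse_qs(urlparse(referrer).query)
    # parse_qs decodes '+' and %XX escapes; split each phrase into words.
    return [term for phrase in query.get(param, []) for term in phrase.split()]


def highlight(text, terms):
    """Return text as HTML with every occurrence of a term wrapped in <mark>."""
    if not terms:
        return html.escape(text)
    pattern = re.compile("(" + "|".join(re.escape(t) for t in terms) + ")", re.IGNORECASE)
    # With a single capturing group, re.split puts the matches at odd indexes.
    parts = pattern.split(text)
    return "".join(
        "<mark>" + html.escape(part) + "</mark>" if index % 2 else html.escape(part)
        for index, part in enumerate(parts)
    )


if __name__ == "__main__":
    referrer = "https://www.google.com/search?q=web+crawler"  # example value
    terms = search_terms_from_referrer(referrer)
    print(highlight("A web crawler fetches pages.", terms))
```

Escaping each piece after splitting keeps user-supplied terms from injecting markup into the page.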

What is the difference between web-crawling and web-scraping? [duplicate]

This question already has an answer here: crawler vs scraper (4 answers)

Question: Is there a difference between crawling and web scraping? If there is a difference, what is the best method for collecting web data to populate a database for later use in a customised search engine?

Ben

Answer: Crawling is essentially what Google, Yahoo, MSN, etc. do: looking for ANY information. Scraping is generally targeted at certain websites and at specific data, e.g. for price comparison, so the two are coded quite differently. Usually a scraper will be bespoke to the websites it is supposed to be scraping, and would

What's the best Django search app? [closed]

Question: I'm building a Django project that needs search functionality, and until there's a django.contrib.search, I have to choose a search app. So, which is the best? By "best" I mean:

- easy to install / set up
- has a Django- or at least Python-friendly API
- can perform reasonably complex searches

Here are some apps I've heard of; please suggest others if you know of any:

- djangosearch
- django-sphinx

I'd also like to avoid using a third-party search engine (like Google SiteSearch), because some of the data I'd like to index is for site members only and should not be public.

kpw

Answer: Check out Haystack

Azure Search - Find matches within a word like “contains”

Question: I use Azure Search, which in turn uses Lucene. Is there any way to make the search less strict? What I need is that a search for "term" matches documents with terms that contain "term". Searching for term should match "Prefix Term", "Term Suffix", "Prefix Term Suffix". Searching for part2 should match "part1 part2", "part2 part3", "part1 part2 part3". I need to run a search query that has several terms, like "term part2", to match documents like: { someField:"... PrefixTermSuffix ...
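The matching behaviour being asked for can be pinned down with a small plain-Python sketch. It only illustrates the desired "contains" semantics on in-memory strings; it is not how Azure Search or Lucene implement matching and uses none of their APIs. The function name contains_match is hypothetical, and whether every query term must match (the default here) or any one term is enough is an assumption, since the excerpt cuts off before saying.

```python
def contains_match(query, document, require_all=True):
    """Return True if the query terms occur as case-insensitive substrings of document."""
    haystack = document.casefold()
    terms = query.casefold().split()
    if not terms:
        return False
    # Assumption: by default every term must occur somewhere in the document.
    check = all if require_all else any
    return check(term in haystack for term in terms)


if __name__ == "__main__":
    documents = ["Prefix Term", "Term Suffix", "PrefixTermSuffix", "part1 part2 part3"]
    for doc in documents:
        print(repr(doc), contains_match("term", doc), contains_match("part2", doc))
```

A plain substring scan like this reads every document on every query, which is why full-text engines rely on index-time tokenization instead.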

How do I save the origin html file with Apache Nutch

Question: I'm new to search engines and web crawlers. I want to store all the original pages of a particular website as HTML files, but with Apache Nutch I can only get the binary database files. How do I get the original HTML files with Nutch? Does Nutch support it? If not, what other tools can I use to achieve my goal? (Tools that support distributed crawling are preferred.)

Answer: Well, Nutch writes the crawled data in binary form, so if you want it saved in HTML format, you will have to modify the code (this will be painful if you are new to Nutch). If you want a quick and easy solution