search-engine | 易学教程

How can I find what search terms (if any) brought a user to my site?

Question: I want to create dynamic content based on this. I know it's somewhere, as web analytics engines can get this data to determine how people got to your site (referrer, search terms used, etc.), but I don't know how to get at it myself.

Answer 1: You can use the "Referer" header of the request that the user sent to figure out what they searched for. Example from Google: http://www.google.no/search?q=stack%20overflow So you must search the string (in ASP(.NET) this can be found by looking in Request
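
The idea in the answer above, pulling the search phrase out of the referrer URL, can be sketched in a few lines. This is a minimal illustration in Python rather than ASP.NET, and it assumes the search engine passes the phrase in a `q` query parameter, as in the Google example. The function name `search_terms_from_referer` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def search_terms_from_referer(referer):
    """Return the search phrase carried in a referrer URL, or None if absent."""
    if not referer:
        return None
    parsed = urlparse(referer)
    # Assumption: the engine uses a "q" parameter, as in the Google example.
    values = parse_qs(parsed.query).get("q")
    return values[0] if values else None


print(search_terms_from_referer("http://www.google.no/search?q=stack%20overflow"))
# prints: stack overflow
```

Search engines often omit the query from the referrer, so the function should be expected to return None for many visitors, and dynamic content needs a fallback for that case.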

How does Google use HTML tags to enhance the search engine?

Question: I know that Google's search algorithm is mainly based on PageRank. However, it also analyzes the structure of the document (H1, H2, title and other HTML tags) to enhance the search results. What is the name of this technique, "using the document structure to enhance the search results"? And are there any academic papers to help me study this area? The fact that Google takes the HTML structure into account is well covered in SEO articles; however, I could not find it in the

Transfer Search Box From Old Website to New Website

Question: I am working for a company that has hired me to turn their new home page design into HTML and CSS. In the design they gave me there is a search box in the header that they would like to be the same as the one on their current webpage (http://shop.manorfinewares.com/intro.html). I am unsure how to navigate their current page's source code in order to transfer the search box to the new page I am designing for them. Here is the header code that I have so far... CSS: #header{

How can I create a search form that searches files in a folder?

Question: I have a search feature on my website. I would like it to search through a certain folder for files on my server and display results from there. I'd rather not use databases. Is there a way to do this?

Answer 1: <?php $dir = "/your_folder_here/"; // Open a known directory, and proceed to read its contents if (is_dir($dir)) { if ($dh = opendir($dir)) { while (($file = readdir($dh)) !== false) { if($file == $_POST['SEARCHBOX_INPUT']){ echo('<a href="'.$dir . $file.'">'. $file .'</a>'."\n"); } }
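
The database-free approach in the answer above, listing a directory and comparing each file name against the search input, can be sketched as follows. This is an illustrative Python version, not the answer's PHP, and it swaps the exact-name comparison for a case-insensitive substring match, which is an assumption about what a search box usually needs. The function name `find_files` is hypothetical.

```python
from pathlib import Path


def find_files(folder, term):
    """Return sorted names of regular files in `folder` whose name contains `term`."""
    needle = term.strip().lower()
    if not needle:
        # An empty search would match every file, so return nothing instead.
        return []
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        # Skip subdirectories; compare case-insensitively.
        if entry.is_file() and needle in entry.name.lower()
    )
```

The search term is only compared against names that the server itself listed, so it is never used to build a path. When the results are rendered as links, the file names should still be HTML-escaped.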

Exact field search with solr/lucene

Question: I have a text field. For a given query, I want to find all documents whose indexed field value is contained in the query: query.contains(document.field_name) Examples: 1. field_name:"a b" 2. field_name:"a b c" For the query "a b d" I want to find only the first item. An inefficient way to do this is to generate all substrings of the query and index the field as a string. Is it possible to implement this requirement in Solr using existing functionality? If not, what is the most efficient algorithm/way to do this? PS.
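
The brute-force approach the question describes (generate every contiguous sub-phrase of the query and compare it with field values stored as whole strings) can be sketched in plain Python. This only illustrates the matching logic. It does not use any Solr API, and the names `query_phrases` and `matching_docs` are hypothetical.

```python
def query_phrases(query):
    """Return every contiguous word sequence of the query as a string."""
    tokens = query.split()
    return {
        " ".join(tokens[i:j])
        for i in range(len(tokens))
        for j in range(i + 1, len(tokens) + 1)
    }


def matching_docs(query, docs):
    """Return ids of documents whose whole field value occurs in the query."""
    phrases = query_phrases(query)
    # Normalize whitespace so "a  b" and "a b" compare equal.
    return [doc_id for doc_id, value in docs.items()
            if " ".join(value.split()) in phrases]


docs = {1: "a b", 2: "a b c"}
print(matching_docs("a b d", docs))  # prints: [1]
```

A query of n words yields n(n+1)/2 phrases, which is why the question calls this inefficient for long queries.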

How to modify search result page given by Solr?

Question: I intend to make a niche search engine. I am using apache-nutch-1.6 as the crawler and apache-solr-3.6.2 as the searcher. I must say there is very little up-to-date information on the web about these technologies. I followed this tutorial http://wiki.apache.org/nutch/NutchTutorial and have successfully installed apache and solr on my Ubuntu system. I was also successful in injecting the seed URL into the webdb and performing the crawl. Using the solr interface at http://localhost:8983/solr/admin , I can also query the

robots.txt htaccess block google

Question: In my .htaccess file I have: <Files ~ "\.(tpl|txt)$"> Order deny,allow Deny from all </Files> This denies any text file from being read, but the Google search engine gives me the following error: robots.txt Status http://mysite/robots.txt 18 minutes ago 302 (Moved temporarily) How can I modify .htaccess to permit Google to read robots.txt while prohibiting everyone else from accessing text files?

Answer 1: Use this: <Files ~ "\.(tpl|txt)$"> Order deny,allow Deny from all SetEnvIfNoCase User-Agent

ignore accents in elastic search with haystack

Question: I am using elasticsearch along with haystack in order to provide search. I want users to be able to search in languages other than English; currently I am trying Greek. How can I ignore accents while searching? For example, if I enter Ανδρέας (with accents), it returns the results that match it. But when I enter Ανδρεας, it does not return any results. The search engine should bring back any results that have "Ανδρέας" as well as "Ανδρεας" (the second one is not accented). Can
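
The accent-insensitive behaviour asked for above comes down to folding accents out of both the indexed text and the query before they are compared. The following is a minimal sketch of that normalization step in plain Python using the standard `unicodedata` module. It is not elasticsearch or haystack configuration, and the function name `strip_accents` is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accent marks while keeping the base letters."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left into the usual composed form.
    return unicodedata.normalize("NFC", stripped)


# Both spellings from the question fold to the same string.
assert strip_accents("Ανδρέας") == strip_accents("Ανδρεας") == "Ανδρεας"
```

For the two forms to match, the same folding has to be applied when documents are indexed and when queries are processed. In a search engine that is normally the job of the field's analyzer rather than application code.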
