search-engine | 易学教程

expressjs node.js serve different data to google/etc bot and human traffic

Question: I want to determine whether incoming requests come from a bot (e.g. Google, Bing) or from a human, and serve different data to each: for example, JSON data for client-side JavaScript to construct the site, or preprocessed HTML. Using Express.js, is there an easy way to do this? Thanks.
Answer 1: I recommend responding according to the requested MIME type (which is present in the "Accept" header). You can do this with Express this way: app.get('/route', function (req, res) { if (req.is('json')) res.json(data);
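A minimal sketch of the Accept-header idea from the answer above, written without Express so it runs on its own. The helper name preferredFormat is hypothetical, and wildcards such as */* are deliberately ignored. Note that in Express, req.is() inspects the request's Content-Type header, whereas req.accepts() and res.format() consult Accept, so one of the latter is probably what the answer intends.

```javascript
// Hypothetical helper (not part of Express): choose "json" or "html" from an Accept header.
function preferredFormat(acceptHeader) {
  const weights = new Map([["application/json", 0], ["text/html", 0]]);
  for (const part of String(acceptHeader || "").split(",")) {
    const [rawType, ...params] = part.trim().split(";").map((s) => s.trim());
    const type = rawType.toLowerCase();
    // A media type without an explicit q parameter has quality 1.
    let q = 1;
    for (const p of params) {
      if (p.startsWith("q=")) q = parseFloat(p.slice(2));
    }
    if (Number.isNaN(q)) q = 0;
    // Wildcards like */* are not in the map, so they never influence the choice.
    if (weights.has(type)) weights.set(type, Math.max(weights.get(type), q));
  }
  // Ties, including "neither type listed", fall back to HTML so crawlers get markup.
  return weights.get("application/json") > weights.get("text/html") ? "json" : "html";
}
```

A route handler could then call preferredFormat(req.headers.accept) and send either res.json(data) or rendered HTML. Falling back to HTML is a design choice made here, not something the answer specifies.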

How do websites like torrentz.eu collect their content?

Question: I would like to know how some search websites get their content. I used 'torrentz.eu' as the example in the title because it has content from several sources. I would like to know what is behind this system: do they 'simply' parse all the websites they support and then show the content? Do they use some web service? Or both?
Answer 1: You are looking for the crawling aspect of Information Retrieval. Basically crawling is: Given an initial set S of websites, try to expand it by exploring the links
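The crawling idea in the answer above (start from a seed set and expand it by following links) can be sketched as a breadth-first traversal. Everything named here is an assumption for illustration: LINKS is a made-up in-memory link graph, and fetchLinks stands in for downloading a page and extracting its links, which a real crawler would do over the network.

```javascript
// Hypothetical in-memory "web": each URL maps to the links found on that page.
const LINKS = {
  "a.example": ["b.example", "c.example"],
  "b.example": ["c.example", "d.example"],
  "c.example": ["a.example"],
  "d.example": [],
};

// Breadth-first expansion of a seed set, stopping after maxPages pages.
function crawl(seeds, fetchLinks, maxPages = 100) {
  const visited = new Set();
  const queue = [...seeds];
  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift();
    if (visited.has(url)) continue; // the same URL can be queued more than once
    visited.add(url);
    for (const next of fetchLinks(url)) {
      if (!visited.has(next)) queue.push(next);
    }
  }
  return [...visited];
}

const pages = crawl(["a.example"], (url) => LINKS[url] || []);
```

The visited set is what keeps the crawl from looping forever on cycles such as a.example and c.example linking to each other.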

Set input field focus on start typing

Question: I am looking for a way to let users start typing on a website without having selected anything, and have a specific input field receive focus. Google also employs this feature: in their search results you can click anywhere (defocusing the search field), and when you start typing it automatically focuses the search field again. I was thinking about a general jQuery onkeyup handler that focuses the field. Any suggestions? Much appreciated.
Answer 1: You should bind the keydown event, but unbind it
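A rough sketch of the keydown approach, using plain DOM calls rather than jQuery. The function name shouldRedirectFocus and the element id "#search" are hypothetical, and the rules for which keys count (single printable characters, no Ctrl/Alt/Meta) are assumptions rather than anything the answer states.

```javascript
// Decide whether a keystroke should move focus to the search field.
// `event` needs only the KeyboardEvent fields read below; `activeTag` is the
// tagName of the currently focused element.
function shouldRedirectFocus(event, activeTag) {
  const typingElsewhere = ["INPUT", "TEXTAREA", "SELECT"].includes(activeTag);
  const hasModifier = Boolean(event.ctrlKey || event.metaKey || event.altKey);
  // Printable keys report a one-character `key`; "Escape", "Tab", etc. are longer.
  const printable = typeof event.key === "string" && event.key.length === 1;
  return printable && !hasModifier && !typingElsewhere;
}

// Browser-only wiring; skipped when no DOM is available (e.g. under Node).
if (typeof document !== "undefined") {
  document.addEventListener("keydown", (event) => {
    const field = document.querySelector("#search");
    const activeTag = document.activeElement ? document.activeElement.tagName : "";
    if (field && shouldRedirectFocus(event, activeTag)) field.focus();
  });
}
```

Skipping the redirect when another form control already has focus avoids stealing keystrokes from fields the user is deliberately typing in.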

Why do search engines ignore symbols? [closed]

Question: Searching for symbols is common in programming, especially when you are new to a language. For example, I had a question about the :: operator in Python, and that is not searchable. People looking for things like this, or for Object [] (array of Objects), would not find what they want. Why do search engines seem to

High level explanation of Similarity Class for Lucene?

Question: Do you know where I can find a high-level explanation of the algorithm behind Lucene's Similarity class? I would like to understand it without having to decipher all the math and terms involved in searching and indexing.
Answer 1: Lucene's built-in Similarity is a fairly standard "Inverse Document Frequency" scoring algorithm. The Wikipedia article is brief, but covers the basics. The book Lucene in Action breaks down the Lucene formula in more detail; it doesn't mirror the current Lucene formula perfectly,
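To make "Inverse Document Frequency" scoring concrete, here is a textbook TF-IDF sketch. It is not Lucene's actual formula (which adds normalization and other factors); the function names, the tokenizer, and the log(1 + N/df) weighting are choices made for this illustration only.

```javascript
// Split text into lowercase alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Score each document against the query: sum over query terms of tf * idf.
function tfidfScores(query, docs) {
  const tokenized = docs.map(tokenize);
  const n = docs.length;
  return tokenized.map((tokens) => {
    let score = 0;
    for (const term of new Set(tokenize(query))) {
      const tf = tokens.filter((t) => t === term).length;           // term frequency in this document
      const df = tokenized.filter((d) => d.includes(term)).length;  // documents containing the term
      // Rare terms (small df) weigh more; terms absent from every document add nothing.
      if (df > 0) score += tf * Math.log(1 + n / df);
    }
    return score;
  });
}

const docs = ["lucene search library", "search engine basics", "cooking pasta"];
const scores = tfidfScores("lucene search", docs);
```

The first document ranks highest because it matches both query terms and "lucene" is rare across the collection, while "search" appears in two documents and so contributes less.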

solr vs xapian: which one gives you the most meaningful results?

Question: I am currently using Whoosh to develop a website, and I'll need to choose something more powerful once the website is in production. If any of you have used both of these engines, which one gave you the most meaningful results in the long run?
Answer 1: Solr is the best option. It's well documented and the community is huge. Almost a year ago I benchmarked Xapian vs Solr. My dataset had 8,000+ emails:
Solr: index time 3 s, index size 5.2 MB
Xapian: index time 30 s, index size 154 MB
Another great

Building SEO-friendly URLs for accented characters

Question: We are making our site SEO-friendly by following the pattern below: http://OurWebsite.com/MyArticle/Math/Spain/Glaño As you see, Glaño contains an accented character that search engines may not like. On the other hand, we cannot build up the last URL! Any suggestions for adapting our current URL-generation code to handle Spanish or French entries, or do we need to change our approach?
Answer 1: Try these functions: function Slug($string, $slug = '-', $extra = null) { return strtolower(trim(preg
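The PHP answer above is cut off, so the following is not a reconstruction of it. It is a separate sketch in JavaScript of one common way to build an ASCII slug: decompose accented letters with Unicode normalization and drop the combining marks. The function name slugify and its separator parameter are hypothetical.

```javascript
// Turn text such as "Glaño" into a lowercase ASCII slug.
function slugify(text, separator = "-") {
  return text
    .normalize("NFD")                   // "ñ" becomes "n" + U+0303 (combining tilde)
    .replace(/[\u0300-\u036f]/g, "")    // strip combining diacritical marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, separator)  // replace runs of other characters with the separator
    .split(separator)
    .filter(Boolean)                    // drop empty pieces from leading/trailing separators
    .join(separator);
}
```

For example, slugify("Glaño") yields "glano". Letters that have no canonical decomposition, such as "ø" or "ß", are not transliterated by this approach and would need an explicit replacement table.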

Search Combo Box like Google Search

Question: I am making a Windows Form with a combo box into which I have loaded some invoice numbers from SQL Server 2010. I want to display matching invoice numbers as the user types into the combo box. For example, if the user types '100', the invoice numbers starting with '100' should be displayed in the dropdown. Please help, thanks in advance.
Answer 1: DataTable temp; DataTable bank; private void Form1_Load(object sender, EventArgs e) { comboBox1.AutoCompleteMode = AutoCompleteMode.SuggestAppend;

How does Google serve results so fast? [duplicate]

Question (marked as a duplicate of "How can Google be so fast?"): Time and again when I search for a topic on Google, it returns the results and also prints out some stats like "Results 1 - 10 of about 8,850,000 for j2me. (0.24 seconds)". I notice that Google serves the results within a fraction of a second. How does Google serve pages so fast, and what kind of database optimization tricks does it use on its end?
Answer 1: I think

Building a fast semantic MySQL search engine for private articles from scratch

Question: I am working on a project that will involve full-text and semantic searches of articles within the site (if it's not possible to combine them, the user can select either option). These articles are subscription-based and can only be searched after logging in, so they are not accessible to external search engines or their APIs. I read about Sphinx for full-text keyword searches (and I intend to implement it for that aspect), but I am not sure how to go about building a semantic search engine out