search-engine

Elasticsearch - integrate with a Java web application

匆匆过客 submitted on 2019-12-03 14:20:26
I am developing a Java web application (an ERP system). I have completed the basic flows. Now, per my client's requirements, we need to implement a few search options (e.g. Employees, Users, Invoices, Inventory). I am planning to implement a search engine for this, and I think Elasticsearch is a good option (please suggest any other good options). Please point me to good documentation on how to integrate Elasticsearch with a Java (Spring + Hibernate) web application. (Point me to the right place if this is a repeated question.)
I don't think there is really any tutorial yet.

Multithreaded search operation

时光怂恿深爱的人放手 submitted on 2019-12-03 14:04:51
Question: I have a method that takes an array of queries, and I need to run them against different search engine Web APIs, such as Google's or Yahoo's. To parallelize the process, a thread is spawned for each query, and the threads are joined at the end, since my application can only continue after I have the results of every query. I currently have something along these lines: public abstract class Query extends Thread { private String query; public abstract Result[] querySearchEngine();
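The spawn-then-join pattern described in the question can be sketched with a thread pool. This is a minimal illustration in Python rather than the asker's Java; `query_search_engine` is a hypothetical stand-in for a real search-engine Web API call, and `run_all` is a name introduced here for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def query_search_engine(query):
    # Hypothetical stand-in for a real search-engine Web API call.
    return [f"result for {query}"]


def run_all(queries, search=query_search_engine):
    # One worker per query. Leaving the with-block waits for every
    # worker to finish, which plays the role of joining the threads.
    if not queries:
        return {}
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        # pool.map returns results in the same order as the input queries.
        results = list(pool.map(search, queries))
    return dict(zip(queries, results))
```

Calling `run_all(["solr", "lucene"])` returns only after both queries have completed, so the caller can continue with the full result set.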

Are HTML Meta Tags still important? [closed]

半腔热情 submitted on 2019-12-03 13:34:55
Question: Closed 8 years ago as off-topic; it is not currently accepting answers. I read some articles on the Internet; some said that search engines like Google and Bing don't care about HTML meta tags any more. Do I still need to maintain the HTML meta tags on my website properly? Thanks!
Answer 1: Are meta tags critical? Every search engine weighs meta tags differently. Google doesn't let

Sitemap for a site with a large number of dynamic subdomains

房东的猫 submitted on 2019-12-03 13:33:43
Question: I'm running a site that allows users to create subdomains. I'd like to submit these user subdomains to search engines via sitemaps. However, according to the sitemaps protocol (and Google Webmaster Tools), a single sitemap can include URLs from a single host only. What is the best approach? At the moment I have the following structure: a sitemap index located at example.com/sitemap-index.xml that lists sitemaps for each subdomain (but located at the same host). Each subdomain has its own sitemap
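For illustration, the sitemap index the asker describes could be generated roughly as follows. This is a sketch only: the `/sitemaps/<name>.xml` layout, the `http` scheme, and the function name are assumptions that do not appear in the question, and the code does not settle whether search engines accept an index that spans several hosts.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(subdomains, host="example.com"):
    # Hypothetical layout: one sitemap file per subdomain,
    # all stored on the main host.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in subdomains:
        entry = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(entry, "loc")
        loc.text = f"http://{host}/sitemaps/{name}.xml"
    return ET.tostring(root, encoding="unicode")
```

The returned string would be served at example.com/sitemap-index.xml, with one `<sitemap>` entry per user subdomain.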

Visual similarity search algorithm

徘徊边缘 submitted on 2019-12-03 13:12:35
I'm trying to build a utility like this http://labs.ideeinc.com/multicolr , but I don't know which algorithm they are using. Does anyone know?
Answer (johnnycrash): All they are doing is matching histograms. So build a histogram for your images. Normalize the histograms by the size of the image. A histogram is a vector with as many elements as colors. You don't need 32, 24, or maybe even 16 bits of accuracy; the extra precision will just slow you down. For performance reasons, I would map the histograms down to 4, 8, and 10-12 bits. Do a fuzzy least distance compare between all the 4-bit histograms and your sample
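The histogram matching described in the answer might look something like the sketch below. Several details are assumptions that the answer does not specify: colors are reduced by keeping the top bits of each 8-bit RGB channel, the "fuzzy least distance compare" is approximated with a plain L1 distance, and all function names are made up for illustration.

```python
def quantize(pixel, bits_per_channel):
    # Keep only the top bits of each 8-bit channel and pack them
    # into a single bin index.
    r, g, b = pixel
    shift = 8 - bits_per_channel
    return (
        ((r >> shift) << (2 * bits_per_channel))
        | ((g >> shift) << bits_per_channel)
        | (b >> shift)
    )


def histogram(pixels, bits_per_channel=2):
    # Normalized histogram: each bin holds the fraction of pixels
    # that fall into that reduced color, so image size cancels out.
    bins = [0.0] * (1 << (3 * bits_per_channel))
    for pixel in pixels:
        bins[quantize(pixel, bits_per_channel)] += 1
    total = len(pixels)
    if total == 0:
        return bins
    return [count / total for count in bins]


def distance(h1, h2):
    # L1 distance: 0.0 for identical distributions, 2.0 for disjoint ones.
    return sum(abs(a - b) for a, b in zip(h1, h2))


def closest(sample, candidates, bits_per_channel=2):
    # Return the name of the candidate image whose histogram is nearest.
    target = histogram(sample, bits_per_channel)
    return min(
        candidates,
        key=lambda name: distance(target, histogram(candidates[name], bits_per_channel)),
    )
```

Here each image is a list of `(r, g, b)` tuples. Comparing at a coarse bit depth first and refining only the best matches at a finer depth would follow the answer's suggestion of using several histogram sizes.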

What does google.setOnLoadCallback(initialize) function exactly mean?

会有一股神秘感。 submitted on 2019-12-03 10:30:32
Question: While coding JavaScript and Ajax, I found there is no proper documentation for this function. I searched this term using api src="http://www.google.com/jsapi" and searchControl.execute("abhilashm86"); . How is google.setOnLoadCallback(initialize) called internally? Is this function just for a new search term when the user clears the previous search and starts a new one? How exactly does google.setOnLoadCallback(initialize) get triggered?
Answer 1: Your initialize function will be called when your

How does a spider in a search engine work?

大憨熊 submitted on 2019-12-03 10:06:37
How does a crawler or spider in a search engine work?
Specifically, you need at least some of the following components:
Configuration: Needed to tell the crawler how, when and where to connect to documents, and how to connect to the underlying database/indexing system.
Connector: This will create the connections to a web page or a disk share or anything, really.
Memory: The pages already visited must be known to the crawler. This is usually stored in the index, but it depends on the implementation and the needs. The content is also hashed for de-duplication and update-validation purposes.
Parser
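The connector and memory components listed above can be sketched as a small crawl loop. This is a minimal illustration under stated assumptions: the connector is passed in as a `fetch` callable so that no real network access is needed, the regex-based `extract_links` is a toy stand-in for a parser, and the choice of SHA-256 for content hashing is mine rather than the answer's.

```python
import hashlib
import re
from collections import deque


def extract_links(html):
    # Toy stand-in for a parser: pull href targets out of anchor tags.
    return re.findall(r'href="([^"]+)"', html)


def crawl(start_url, fetch, max_pages=100):
    # fetch is the connector: any callable mapping a URL to page text,
    # or None when the page cannot be retrieved.
    visited = set()       # memory: URLs already seen
    seen_hashes = set()   # memory: content hashes for de-duplication
    indexed = []
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # same content already indexed under another URL
        seen_hashes.add(digest)
        indexed.append(url)
        queue.extend(link for link in extract_links(html) if link not in visited)
    return indexed
```

For a quick check, `fetch` can be the `get` method of a dictionary that maps URLs to HTML strings; in a real crawler it would wrap an HTTP client or a file-share reader, and `max_pages` would come from the configuration component.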

How can I use the Twitter Search API to return all tweets that match my search query, posted only within the last five seconds?

北慕城南 submitted on 2019-12-03 10:03:26
Question: I would like to use the API to return all tweets that match my search query, but only tweets posted within the last five seconds. With Twitter's Search API, I can use since_id to grab all tweets after a specific ID. However, I can't really see a good way to find the tweet ID to begin from. I'm also aware that you can use "since:" in the actual query to specify a date, but you cannot enter a time. Can someone with Twitter API experience offer me any advice? Thanks for reading and for your time!

How to make a search engine for a website? [closed]

心不动则不痛 submitted on 2019-12-03 08:19:41
Question: Closed 4 years ago because it needs to be more focused; it is not currently accepting answers. I want to have a search engine for my website. Do any of the web search engines (like Google, Yahoo, etc.) provide a free service, or should I build it myself?
Answer 1: Zend_Search_Lucene is a fully implemented and fast PHP-based full-text search engine. You'll have to index your own data

Solr associations

此生再无相见时 submitted on 2019-12-03 07:41:29
For the last couple of days we have been thinking of using Solr as our search engine of choice. Most of the features we need work out of the box or can be easily configured. There is, however, one feature that we absolutely need that seems to be well hidden (or missing) in Solr. I'll try to explain with an example. We have lots of documents that are actually businesses: <document> <name>Apache</name> <cat>1</cat> ... </document> <document> <name>McDonalds</name> <cat>2</cat> ... </document> In addition we have another XML file with all the categories and synonyms: <cat id=1> <name>software</name> <synonym