solr

How to connect Spark Streaming to standalone Solr on Windows?

Submitted by 早过忘川 on 2020-01-07 04:10:30
Question: I want to integrate Spark Streaming with standalone Solr. I am using Spark 1.6.1 and Solr 5.2 standalone on Windows, with no ZooKeeper configuration. I have found solutions where people connect to Solr from Spark by passing the ZooKeeper config. How can I connect my Spark program to standalone Solr? Answer 1: Please see if this example is helpful: http://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd From the example, you will need to
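The foreachRDD/foreachPartition pattern from that guide can target a standalone Solr core over plain HTTP, with no ZooKeeper involved. A minimal sketch, assuming the host, port, and core name (`mycore`) below; adjust them to your setup:

```python
# Sketch: posting document batches from a Spark foreachPartition to a
# standalone Solr core over plain HTTP (no ZooKeeper, no CloudSolrClient).
import json
from urllib import request

def solr_update_url(host, port, core, commit=True):
    """Build the JSON update endpoint URL for a standalone Solr core."""
    url = "http://%s:%d/solr/%s/update" % (host, port, core)
    if commit:
        url += "?commit=true"
    return url

def post_docs(url, docs):
    """POST a batch of documents as JSON to the Solr update endpoint."""
    body = json.dumps(list(docs)).encode("utf-8")
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    return request.urlopen(req)  # returns the HTTP response

# Inside Spark Streaming you would call this once per partition, e.g.:
# url = solr_update_url("localhost", 8983, "mycore")
# rdd.foreachPartition(lambda part: post_docs(url, part))
```

Creating the connection inside `foreachPartition` (rather than on the driver) is the point of the linked design pattern: the HTTP client is built on each executor instead of being shipped over the network.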

Using StandardTokenizerFactory with currency

Submitted by 。_饼干妹妹 on 2020-01-07 02:48:11
Question: The fieldType config described in this question works for me to detect currency (e.g. docs containing "$30"). However, we wish to use the StandardTokenizerFactory rather than the WhitespaceTokenizerFactory, and this config returns false positives with the StandardTokenizerFactory (e.g. docs containing the digits 30 without the $ symbol). What is the solution? Thanks. How do I find documents containing digits and dollar signs in Solr? Answer 1: Solved via a post to the Solr user group http://lucene
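Since the answer excerpt is cut off, here is one common workaround, offered as an assumption rather than the thread's confirmed fix: StandardTokenizer discards the $ during tokenization, so map it to a preserved placeholder with a char filter that runs before the tokenizer:

```xml
<!-- Sketch: field type and placeholder string are assumptions -->
<fieldType name="text_currency" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Replace "$" with a token-safe placeholder before StandardTokenizer strips it -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\$" replacement="dollarsymbol "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this in place, a search for "$30" becomes a search for the tokens `dollarsymbol 30`, and plain "30" no longer matches the currency pattern. Verify the token stream in the Analysis screen of the admin UI before committing to a placeholder.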

JSON returned by Solr

Submitted by 笑着哭i on 2020-01-07 02:36:10
Question: I'm using Solr to index my data. Through Solr's UI, in the Schema window, I added two fields: word, messageid. Then I made the following POST request: curl -X POST -H "Content-Type: application/json" 'http://localhost:8983/solr/messenger/update.json/docs' --data-binary '{"word":"hello","messageid":"23523}' I received the following JSON: { "responseHeader": { "status": 0, "QTime": 55 } } When I go to the Query window in the UI and execute a query without parameters, I get the
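Two details worth noting: the JSON in the curl command is missing the closing quote on the messageid value, and an update only becomes visible to queries after a commit. Building the body programmatically avoids the quoting problem entirely; a sketch, with the core name `messenger` taken from the question:

```python
# Sketch: serialize the document with json.dumps so quoting mistakes
# (like the unterminated "23523 above) cannot happen.
import json

def make_update_payload(word, messageid):
    """Serialize one document for the /update/json/docs handler."""
    return json.dumps({"word": word, "messageid": messageid})

payload = make_update_payload("hello", "23523")
# POST payload to
# http://localhost:8983/solr/messenger/update/json/docs?commit=true
# (commit=true makes the document visible to subsequent queries)
```

A status-0 responseHeader only means the update request was accepted; without a commit (explicit or via autoCommit with openSearcher) the Query window will still show nothing.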

How to optimize solr indexes

Submitted by 岁酱吖の on 2020-01-07 02:32:08
Question: When I open the solr/admin page I see the information below. It shows optimized: true, but I have not set optimize=true in any configuration file, so how are the indexes being optimized, and how can I set it to false?

Schema Information
Unique Key: UID_PK
Default Search Field: text
numDocs: 2881
maxDoc: 2881
numTerms: 41960
version: 1309429290159
optimized: true
current: true
hasDeletions: false
directory: org.apache.lucene.store.SimpleFSDirectory:org.apache.lucene.store.SimpleFSDirectory@ C:\apache-solr
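A hedged reading of that page: "optimized: true" reports the index *state* (segments already merged down), not a configuration flag, so there is nothing to switch off in solrconfig.xml. An optimize only runs when a client explicitly requests one, for example through the update handler; the base URL below is an assumption:

```python
# Sketch: an optimize is triggered per-request, not per-config.
def optimize_url(base="http://localhost:8983/solr"):
    """URL that would trigger an explicit optimize (merge to one segment)."""
    return base + "/update?optimize=true"

# Simply never issuing this request means no forced optimize happens;
# ordinary background segment merging still runs per the merge policy.
```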

SolrClient python update document

Submitted by 天涯浪子 on 2020-01-07 02:29:33
Question: I'm currently trying to create a small Python program that uses SolrClient to index some files. I want to index the file content and then add some attributes to enrich each document. I used the post command-line tool to index the files. Then I use a Python program to enrich the documents, something like this: doc = solr.get('collection', id) doc['new_attribute'] = 'value' solr.index_json('collection',json.dumps([doc])) solr.commit(openSearcher=True) Problem is that I have the
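Since the excerpt is cut off, a guess at the usual fix: rather than get-modify-reindex, which re-sends the whole document, Solr supports atomic updates, where you send only the id plus the fields to change wrapped in `{"set": ...}`. A sketch of building such a payload (the field and id values are illustrative):

```python
# Sketch: build an atomic-update document -- only the id and the
# changed fields travel over the wire; Solr patches the stored doc.
import json

def atomic_update(doc_id, **fields):
    """Wrap each changed field in {"set": value} per Solr atomic-update syntax."""
    doc = {"id": doc_id}
    doc.update({name: {"set": value} for name, value in fields.items()})
    return doc

payload = json.dumps([atomic_update("42", new_attribute="value")])
# then, with the same SolrClient calls as in the question:
# solr.index_json('collection', payload)
# solr.commit(openSearcher=True)
```

Note that atomic updates require all fields in the schema to be stored (or docValues-backed), otherwise the non-sent fields are lost on update.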

Does CKAN have a limit size of data to upload?

Submitted by 二次信任 on 2020-01-07 02:26:06
Question: I have set up CKAN and it is running fine, but I have two questions. Both problems below happen only when uploading a file; if I add a new resource by URL, everything runs fine. 1) I can upload small files (around 4 kB) to a given dataset, but when trying bigger files (65 kB) I get Error 500: An Internal Server Error Occurred. So is there a size limit for uploading files? What can I do to be able to upload bigger files? 2) I get another error, for the small uploaded files, and that is: when
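If a size cap were the cause, CKAN exposes it as `ckan.max_resource_size` in the instance .ini file (value in megabytes, default 10 MB). That said, 65 kB is far below the default, so a 500 on files that small more likely points at file-storage setup, e.g. a missing or unwritable `ckan.storage_path`. A hedged sketch of the relevant options:

```ini
; production.ini sketch -- both values are illustrative
[app:main]
; upload cap in megabytes (default 10)
ckan.max_resource_size = 100
; uploads require a storage directory writable by the CKAN process
ckan.storage_path = /var/lib/ckan/default
```

Checking the CKAN/Apache error log for the traceback behind the 500 should confirm which of the two it is.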

Finding Solr documents that intersect with a defined Radius

Submitted by 回眸只為那壹抹淺笑 on 2020-01-07 00:36:15
Question: We are using Apache Solr 5.x, and we currently have a number of defined shapes: polygons, circles, etc. Each of these shapes corresponds to a document. What I want to know is: is it possible to provide a circle, that is, a (lat,lng) pair along with a radius for that circle, and then find all documents that have an intersection with that circle? I have tried a variety of options, most recently this one: solr_index_wkt:"IsWithin(CIRCLE((149.39999999999998 -34.92 d=0
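One likely issue with the attempt above: in Lucene spatial's predicate semantics, IsWithin matches only indexed shapes lying wholly inside the query shape; for "any overlap" the predicate is Intersects. A sketch building such a query string, assuming the circle syntax mirrored from the question (the exact shape syntax varies by Solr/Spatial4j version) and the field name `solr_index_wkt`:

```python
# Sketch: build an Intersects query against a WKT/RPT spatial field.
def circle_intersect_query(field, lng, lat, radius_deg):
    """Match documents whose indexed shape overlaps the given circle."""
    return '%s:"Intersects(CIRCLE((%s %s d=%s)))"' % (field, lng, lat, radius_deg)

q = circle_intersect_query("solr_index_wkt", 149.4, -34.92, 0.1)
# pass q as the main query or as an fq filter
```

Note that `d` here is in degrees for this legacy syntax; converting a radius in kilometers requires dividing by ~111.2 km/degree.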

crawling with Nutch 2.3, Cassandra 2.0, and solr 4.10.3 returns 0 results

Submitted by 守給你的承諾、 on 2020-01-06 23:43:25
Question: I mainly followed the guide on this page. I installed Nutch 2.3, Cassandra 2.0, and Solr 4.10.3, and setup went well. But when I executed the following command, no URLs were fetched.

./bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/ 2

Below are my settings:
nutch-site.xml: http://ideone.com/H8MPcl
regex-urlfilter.txt: +^http://([a-z0-9]*\.)*nutch.apache.org/
hadoop.log: http://ideone.com/LnpAw4

I don't see any errors in the log file. I am really lost; any help would be appreciated.
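One thing worth checking, offered as an assumption since no answer is shown: the regex-urlfilter.txt line above only admits URLs under nutch.apache.org, so any other seed in urls/seed.txt is silently discarded before fetching, which produces exactly "0 urls fetched" with no error in the log. A sketch of a broader filter, assuming the seeds live under a hypothetical example.com:

```
# regex-urlfilter.txt sketch: admit your own seed domain instead of,
# or in addition to, nutch.apache.org
+^https?://([a-z0-9-]+\.)*example.com/
```

The filter lines are evaluated top to bottom and the first match wins, so the `+` line for your domain must appear before any catch-all `-.` reject rule.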

How to add data to the solr's schema

Submitted by 纵然是瞬间 on 2020-01-06 20:18:13
Question: I am trying to add new data to Solandra according to the Solr schema, but I can't find any example of this. My ultimate goal is to integrate Solandra with django-solr. My understanding of insert and update in Solr, based on the original Solr and django-solr, is that you send the new data over HTTP to the appropriate path, for example: http://localhost:8983/solandra/wikipedia/update/json However, when I access the URL, the browser keeps telling me HTTP ERROR: 404. Can you help me
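One likely culprit, offered as an assumption since the question is cut off: visiting the update URL in a browser issues a GET, while Solr-style update handlers expect a POST with a JSON body, so a browser error does not necessarily mean the endpoint is wrong. A stdlib-only sketch of the request that should be sent instead (the URL is the one from the question; the document fields are illustrative):

```python
# Sketch: update endpoints take POSTed JSON, not browser GETs.
import json
from urllib import request

def build_update_request(url, docs):
    """Build a POST request carrying documents as a JSON body."""
    return request.Request(url,
                           data=json.dumps(docs).encode("utf-8"),
                           headers={"Content-Type": "application/json"})

req = build_update_request(
    "http://localhost:8983/solandra/wikipedia/update/json",
    [{"id": "1", "title": "hello"}])
# request.urlopen(req) would send it; follow with a commit so the
# documents become searchable
```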
