Parse Solr XML files to SolrInputDocument

离开以前 2020-12-21 05:20

If I have individual files in the expected Solr format (having just ONE doc per file):


  
    <add>
      <doc>
        <field name="…">GB18030TEST</field>
        …
      </doc>
    </add>

3 Answers
  •  天命终不由人
    2020-12-21 05:55

    This is best done programmatically. I know you're looking for a Java solution, but I'd personally recommend Groovy, which can use the SolrJ classes directly.

    The following script processes XML files found in the current directory.

    //
    // Dependencies
    // ============
    import org.apache.solr.client.solrj.SolrServer
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer
    import org.apache.solr.common.SolrInputDocument
    
    @Grapes([
        @Grab(group='org.apache.solr', module='solr-solrj', version='3.5.0'),
    ])
    
    //
    // Main
    // =====
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/");
    
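    // Index every *.xml file found in the current directory
    new File(".").eachFileMatch(~/.*\.xml/) { file ->
    
        file.withReader { reader ->
            def xml = new XmlSlurper().parse(reader)
    
            // Each <doc> element becomes one SolrInputDocument
            xml.doc.each { docNode ->
                SolrInputDocument doc = new SolrInputDocument()
    
                // Copy every <field name="...">value</field> into the document
                docNode.field.each { fieldNode ->
                    doc.addField(fieldNode.@name.text(), fieldNode.text())
                }
    
                server.add(doc)
            }
        }
    
    }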
    
    server.commit()
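
    If you would rather stay in plain Java, the same traversal can be sketched with the JDK's built-in DOM parser. This is only a rough sketch under a few assumptions: the class name SolrXmlFields and the sample field names "id" and "cat" are made up for illustration, the documents are assumed to be flat (no nested child docs), and the fields are collected into plain maps so the example runs without SolrJ on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of SolrJ.
public class SolrXmlFields {

    /** Returns one (field name -> values) map per <doc> element in the stream. */
    public static List<Map<String, List<String>>> parse(InputStream in) throws Exception {
        // Parsing from an InputStream lets the XML parser honour the
        // encoding given in the file's XML declaration.
        Document xml = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);

        List<Map<String, List<String>>> docs = new ArrayList<>();
        NodeList docNodes = xml.getElementsByTagName("doc");
        for (int i = 0; i < docNodes.getLength(); i++) {
            Element docNode = (Element) docNodes.item(i);

            // A field name may repeat (multi-valued fields), so keep a list per name.
            Map<String, List<String>> fields = new LinkedHashMap<>();
            NodeList fieldNodes = docNode.getElementsByTagName("field");
            for (int j = 0; j < fieldNodes.getLength(); j++) {
                Element field = (Element) fieldNodes.item(j);
                fields.computeIfAbsent(field.getAttribute("name"), k -> new ArrayList<>())
                      .add(field.getTextContent());
            }
            docs.add(fields);
        }
        return docs;
    }

    public static void main(String[] args) throws Exception {
        // Sample input; the field names are assumptions for the demo.
        String sample = "<add><doc>"
                + "<field name=\"id\">GB18030TEST</field>"
                + "<field name=\"cat\">a</field><field name=\"cat\">b</field>"
                + "</doc></add>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(parse(in)); // prints [{id=[GB18030TEST], cat=[a, b]}]
    }
}
```

    With SolrJ available, each name/value pair collected here would be passed to SolrInputDocument.addField(name, value) instead of being stored in a map, and the resulting document sent with server.add(doc) as in the Groovy script.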
    
