Insert 10,000,000+ rows in grails

…衆ロ難τιáo~ 提交于 2019-12-24 07:04:51

问题


I've read a lot of articles recently about populating a grails table from huge data, but seem to have hit a ceiling. My code is as follows:

class LoadingService {
    def sessionFactory
    def dataSource
    def propertyInstanceMap = org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP

    def insertFile(fileName) {
        InputStream inputFile = getClass().classLoader.getResourceAsStream(fileName)
        def pCounter = 1
        def mCounter = 1
        Sql sql = new Sql(dataSource)
        inputFile.splitEachLine(/\n|\r|,/) { line -> 
            line.each { value ->
                if(value.equalsIgnoreCase('0') { 
                    pCounter++
                    return
                }
                sql.executeInsert("insert into Patient_MRNA (patient_id, mrna_id, value) values (${pCounter}, ${mCounter}, ${value.toFloat()})")
                pCounter++
            }
            if(mCounter % 100 == 0) {
                cleanUpGorm()
            }
            pCounter = 1
            mCounter++
        }
    }

    def cleanUpGorm() {
        session.currentSession.clear()
        propertyInstanceMap.get().clear()
    }
}

I have disabled secondary cache, I'm using assigned ids, and I am explicitly handling this many to many relationship through a domain, not the hasMany and belongsTo.

My speed has increased monumentally after applying these methods, but after a while the inserts slow down to the point of almost stopping compared to about 623,000 per minute at the beginning.

Is there some other memory leak that I should be aware of or have I just hit the ceiling in terms of batch inserts in grails?

To be clear it takes about 2 minutes to insert 1.2 million rows, but then they start to slow down.


回答1:


Try doing batch inserts, it's much more efficient

def updateCounts = sql.withBatch { stmt ->
     stmt.addBatch("insert into TABLENAME ...")
     stmt.addBatch("insert into TABLENAME ...")
     stmt.addBatch("insert into TABLENAME ...")
     ...
 }



回答2:


I have fought with this in earlier versions of Grails. Back then I resorted to either simply run the batch manually in proper chunks or use another tool for the batch import, such as Pentaho Data Integration (or other ETL tool or DIY).



来源:https://stackoverflow.com/questions/26417969/insert-10-000-000-rows-in-grails

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!