Get data from json object inside csv using apache-nifi

Submitted by 家住魔仙堡 on 2021-01-29 10:12:58

Question


My CSV contains:

date,name,department
2020-2-4,sachith,{dep_name:computer,location:2323,3434}
2020-2-5,nalaka,{dep_name:engineering,location:3343,5454}

The final CSV should look like this:

date,name,dep_name,lat,lot
2020-2-4,sachith,computer,2323,3434
2020-2-5,nalaka,engineering,3343,5454

Here, lat and lot are taken from the location value (for example, location:3343,5454).

I have tried to use the UpdateRecord processor for this, with an expression such as ${field.value:join(','):substringAfter('dep_name:')}

But it's not working. How can I do this using Apache NiFi?


Answer 1:


A plain Groovy script to test in groovyConsole:

import groovy.json.*

def parser = new JsonSlurper().setType(JsonParserType.LAX) //LAX to accept strings without double-quotes

def w = System.out
new StringReader('''date,name,department
2020-2-4,sachith,{"dep_name":"computer","location":"2323,3434"}
2020-2-5,nalaka,{"dep_name":"engineering","location":"3343,5454"}''').withReader{r->
    r.eachLine{line, lineNum->
        if(lineNum==1){
            w<<'date,name,dep_name,lat,lot'<<'\n'   //write the target header; location already holds "lat,lot"
        }else{
            def row=line.split(',')          //split line by comma
            def json=row[2..-1].join(',')    //join back to string starting from 3rd element
            json = parser.parseText(json)
            w<<"${row[0]},${row[1]},${json.dep_name},${json.location}"<<'\n'
        }
    }
}

Now the same script, modified for the NiFi ExecuteGroovyScript processor:

import groovy.json.*

def ff=session.get()
if(!ff)return

def parser = new JsonSlurper().setType(JsonParserType.LAX)

ff.write{streamIn,streamOut->
    streamIn.withReader('UTF-8'){r->      //convert in stream to reader
        streamOut.withWriter('UTF-8'){w-> //convert out stream to writer
            //go line by line
            r.eachLine{line, lineNum->
                if(lineNum==1){
                    w<<'date,name,dep_name,lat,lot'<<'\n'   //first line: write the target header
                }else{
                    def row=line.split(',')          //split line by comma
                    def json=row[2..-1].join(',')    //join back to string starting from 3rd element
                    json = parser.parseText(json)
                    w<<"${row[0]},${row[1]},${json.dep_name},${json.location}"<<'\n'
                }
            }
        }
    }
}
REL_SUCCESS<<ff
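
The scripts above write json.location as one value that happens to contain the comma. If lat and lot are needed as separate values (for validation or reordering, say), the per-row step could be sketched roughly as follows. convertLine is a hypothetical helper name, not part of the original answer, and the sketch assumes the JSON column is quoted as in the answer's sample input.

```groovy
import groovy.json.JsonParserType
import groovy.json.JsonSlurper

// Hypothetical helper (an assumption, not from the original answer):
// converts one data row "date,name,{json}" into "date,name,dep_name,lat,lot".
String convertLine(String line) {
    def parser = new JsonSlurper().setType(JsonParserType.LAX)
    // Split into at most 3 parts so commas inside the JSON stay in the third part
    def parts = line.split(',', 3)
    def json = parser.parseText(parts[2])
    // location is a single "lat,lot" string; split it into its two values
    def coords = json.location.toString().tokenize(',')
    return [parts[0], parts[1], json.dep_name, coords[0], coords[1]].join(',')
}

println convertLine('2020-2-4,sachith,{"dep_name":"computer","location":"2323,3434"}')
// prints: 2020-2-4,sachith,computer,2323,3434
```

Inside the ExecuteGroovyScript version, the body of the else branch could then be reduced to w<<convertLine(line)<<'\n', assuming the helper is defined at the top of the same script.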



Source: https://stackoverflow.com/questions/60092622/get-data-from-json-object-inside-csv-using-apache-nifi
