apache-nifi

Nested JSON with duplicate keys

拟墨画扇 submitted on 2020-03-23 12:05:02
Question: I have to process 10 billion nested JSON records per day using NiFi (version 1.9). As part of the job, I am trying to convert the nested JSON to CSV using a Groovy script. I referred to the Stack Overflow questions below on the same topic and came up with the code below: "Groovy collect from map and submap" and "how to convert json into key value pair completely using groovy". But I am not sure how to retrieve the values of duplicate keys. Sample json is defined in the variable "json" in the below
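
The duplicate-key problem described above can be sketched as follows. This is a minimal illustration in Python, not the asker's Groovy code and not a NiFi processor. The names `JsonObject`, `flatten` and `json_to_csv`, the sample document, and the two-column path/value CSV layout are all assumptions made for the example. The idea is that a standard parser collapses duplicate keys into a map, so the pairs have to be captured before that happens; Python's `json.loads` exposes them through `object_pairs_hook`.

```python
import csv
import io
import json


class JsonObject(list):
    """A JSON object kept as a list of (key, value) pairs so duplicate keys survive."""


def flatten(node, prefix=""):
    """Yield (dotted_path, value) for every leaf, including repeated keys."""
    if isinstance(node, JsonObject):
        for key, value in node:
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        # A plain list is a JSON array; index its elements in the path.
        for index, item in enumerate(node):
            yield from flatten(item, f"{prefix}[{index}]")
    else:
        yield prefix, node


def json_to_csv(text):
    """Parse JSON without collapsing duplicate keys and emit path,value rows."""
    parsed = json.loads(text, object_pairs_hook=JsonObject)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["path", "value"])
    writer.writerows(flatten(parsed))
    return out.getvalue()


# Hypothetical sample; the question's own sample is not shown in the excerpt.
sample = '{"id": 1, "tag": "a", "tag": "b", "meta": {"k": "x", "k": "y"}}'
print(json_to_csv(sample))
# path,value
# id,1
# tag,a
# tag,b
# meta.k,x
# meta.k,y
```

Whether a Groovy parser offers a comparable hook is not covered here; the sketch only shows that the duplicates must be kept as pairs rather than loaded into a map.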

org.apache.nifi.processor.exception.FlowFileHandlingException: is not known in this session) in <script> at line number 32

吃可爱长大的小学妹 submitted on 2020-03-05 04:25:07
Question: I have written a Python script that just compares two strings and puts an attribute on the flowfile. But while handling exceptions, I always get the error below, and my flowfile is stuck in the queue of the ExecuteScript processor with the following exception: 2020-02-07 18:15:26,049 ERROR [Timer-Driven Process Thread-7] o.a.nifi.processors.script.ExecuteScript ExecuteScript[id=0fdb3d0f-b361-3b31-faf8-fce2dc707591] ExecuteScript[id=0fdb3d0f-b361-3b31-faf8-fce2dc707591] failed to process due to org.apache

NIFI how to change uuid to file name

ぐ巨炮叔叔 submitted on 2020-02-25 00:39:17
Question: I have some documents in XML format to load into MarkLogic. The PutMarkLogic URI Attribute Name property defaults to "uuid". How can I change it to the file name? Input directory: /input/ac01010.xml /input/ac02010.xml .... I have the two processors below: GetFile -> PutMarkLogic. I want MarkLogic to display the documents as: ac01010.xml ac02010.xml. Thanks Andy and Ben. I have updated the UpdateAttribute and PutMarkLogic properties in NiFi, and it works. UpdateAttribute: added ${filename} PutMarkLogic Property: Answer 1: You can use an

How to use 'DBCPConnectionPoolLookup' controller service in 'ExecuteGroovyScript' processor

こ雲淡風輕ζ submitted on 2020-02-23 07:25:08
Question: I want to access multiple databases depending on the 'database.name' attribute sent in the input flowfile to the ExecuteGroovyScript processor. In the 'ExecuteGroovyScript' processor I have a property 'SQL.clientdb' which points to the 'lookup' service. At the same time I have set up a 'DBCPConnectionPool' service with all the required details and its 'name' property similar to the value of 'database.name'. The way in which I'm trying to access the pool service is: def clientDb = SQL.clientdb

Caching file content inside ExecuteScript processor of Apache NiFi

旧城冷巷雨未停 submitted on 2020-01-25 07:06:55
Question: I have an ExecuteScript processor that validates an XML flow file against a Schematron. I'd like the content of the Schematron file to be cached somewhere rather than read from disk again for every flow file. What is the best option for doing this? Do I need yet another script that puts the content of the Schematron into context.stateManager or PutDistributedMapCache, or something else? Answer 1: I was about to answer NO but it seems that it is possible. You are able to cache variables

Update csv value using executescript processor fails in apache-nifi

放肆的年华 submitted on 2020-01-25 06:58:24
Question: I am trying to read from a flowfile and update a record value using a default value in the CSV. To do that I have used the ExecuteScript processor with the following Python code in it:
    import sys
    import re
    import traceback
    from org.apache.commons.io import IOUtils
    from org.apache.nifi.processor.io import StreamCallback
    from org.python.core.util import StringUtil
    from java.lang import Class
    from java.io import BufferedReader
    from java.io import InputStreamReader
    from java.io import OutputStreamWriter
    flowfile =

Merge two schemas into one in Apache nifi

喜你入骨 submitted on 2020-01-24 19:08:00
Question: I'm trying to merge two CSV files into a JSON using Apache NiFi. The two CSVs are persons.csv, containing information about people:
    Id|Name|Surname
    ABC-123|John|Smith
    ABC-111|Allan|Wood
    ABC-001|Grace|Kelly
And the second CSV contains a list of events these people have attended:
    EId|PId|Date|Desc
    1|ABC-123|2017-05-01|"Groove party"
    2|ABC-111|2017-06-01|"Snack No. One"
    3|ABC-123|2017-06-01|"The night out"
I'm using a flow of (NiFi flow on GitHub): GetFile UpdateAttribute (schema.name) Split Records
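
To make the intended join concrete, here is a minimal Python sketch of the merge itself, outside NiFi. It uses the two pipe-delimited samples from the question. The output shape (each person carrying a nested `Events` list) is an assumption, since the excerpt does not show the desired JSON, and the helper names `read_pipe_csv` and `merge_persons_with_events` are hypothetical.

```python
import csv
import io
import json

PERSONS_CSV = """Id|Name|Surname
ABC-123|John|Smith
ABC-111|Allan|Wood
ABC-001|Grace|Kelly
"""

EVENTS_CSV = """EId|PId|Date|Desc
1|ABC-123|2017-05-01|"Groove party"
2|ABC-111|2017-06-01|"Snack No. One"
3|ABC-123|2017-06-01|"The night out"
"""


def read_pipe_csv(text):
    """Parse pipe-delimited CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="|"))


def merge_persons_with_events(persons_text, events_text):
    """Attach to each person the events whose PId equals the person's Id."""
    events_by_person = {}
    for event in read_pipe_csv(events_text):
        events_by_person.setdefault(event["PId"], []).append(
            {"EId": event["EId"], "Date": event["Date"], "Desc": event["Desc"]}
        )
    # Assumed output shape: person fields plus a nested "Events" list.
    return [
        dict(person, Events=events_by_person.get(person["Id"], []))
        for person in read_pipe_csv(persons_text)
    ]


merged = merge_persons_with_events(PERSONS_CSV, EVENTS_CSV)
print(json.dumps(merged, indent=2))
```

People without events (Grace Kelly in the sample) keep an empty `Events` list in this sketch; dropping them instead would be an equally valid choice.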

Merge two JSON flowfile together in NiFi

余生颓废 submitted on 2020-01-24 09:55:06
Question: I want to merge two flowfiles that contain JSON objects and share the same specified attribute...
    flow1: attribute: xuuid = 123456 content: { "sname":"jack", "id":"00001", "state":"NY" }
    flow2: attribute: xuuid = 123456 content: { "country":"US", "date":"1983" }
and I expect this form of data in a single output flowfile:
    desired_flow: attribute: xuuid = 123456 content: { "sname":"jack", "id":"00001", "state":"NY", "country":"US", "date":"1983" }
How do I do this? With the MergeContent processor or MergeRecord? i
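
The merge being asked for can be sketched in plain Python as below. This is only an illustration of the desired result, not NiFi API code: a flowfile is modelled as an (attributes, content) tuple, and `merge_by_attribute` is a hypothetical helper. It groups flowfiles by the `xuuid` attribute and combines their top-level JSON fields, with later flowfiles overwriting earlier ones on a key clash.

```python
import json


def merge_by_attribute(flowfiles, key="xuuid"):
    """Combine the JSON content of flowfiles sharing the same attribute value.

    `flowfiles` is an iterable of (attributes_dict, json_text) tuples,
    a stand-in for real NiFi flowfiles.
    """
    merged = {}
    for attributes, content in flowfiles:
        merged.setdefault(attributes[key], {}).update(json.loads(content))
    return merged


flow1 = ({"xuuid": "123456"}, '{ "sname":"jack", "id":"00001", "state":"NY" }')
flow2 = ({"xuuid": "123456"}, '{ "country":"US", "date":"1983" }')

result = merge_by_attribute([flow1, flow2])
print(json.dumps(result["123456"]))
# {"sname": "jack", "id": "00001", "state": "NY", "country": "US", "date": "1983"}
```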