apache-nifi

How to clear NiFi queues?

旧城冷巷雨未停 posted on 2019-12-05 11:03:57
We are creating some flows in NiFi, and in some cases the queues build up because, for some reason, the flow doesn't work as expected. At the end of the day, I would like to clear the queues, and I would like to automate that. How can we delete the queues from the backend? Is there any way to achieve that? In addition to the explicit "Drop Queue" function Bryan mentioned, a couple of other features you may be interested in are the "Back Pressure" and "FlowFile Expiration" settings on connections. These allow you to automatically control the amount of
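
As a rough illustration of automating the "Drop Queue" action, the sketch below builds the HTTP request that asks NiFi to empty one connection's queue. It assumes an unsecured NiFi 1.x instance whose REST API exposes `POST /nifi-api/flowfile-queues/{connection-id}/drop-requests`; the base URL and connection id are placeholders, and `build_drop_request` is a hypothetical helper name, so check the REST API documentation for your version before relying on it.

```python
import urllib.request


def build_drop_request(base_url, connection_id):
    """Build (but do not send) a request asking NiFi to drop a connection's queue.

    Assumption: the endpoint path matches the NiFi 1.x REST API.
    """
    url = "{}/nifi-api/flowfile-queues/{}/drop-requests".format(
        base_url.rstrip("/"), connection_id
    )
    # An empty body with an explicit POST method creates the drop request.
    return urllib.request.Request(url, data=b"", method="POST")


if __name__ == "__main__":
    # Placeholder values: replace with your host and the connection's UUID.
    request = build_drop_request("http://localhost:8080", "your-connection-id")
    print(request.get_method(), request.full_url)
    # To actually submit it: urllib.request.urlopen(request)
```

The drop is processed asynchronously as far as I know, so a real script would probably also need to poll the returned drop request and clean it up once it finishes.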

NiFi : Regular Expression in ExtractText gets CSV header instead of data

我的梦境 posted on 2019-12-05 08:29:53
I'm working on a flow where I get CSV files. I want to put the records into different directories based on the first field in the CSV record. For example, the CSV file would look like this:

    country,firstname,lastname,ssn,mob_num
    US,xxxx,xxxxx,xxxxx,xxxx
    UK,xxxx,xxxxx,xxxxx,xxxx
    US,xxxx,xxxxx,xxxxx,xxxx
    JP,xxxx,xxxxx,xxxxx,xxxx
    JP,xxxx,xxxxx,xxxxx,xxxx

I want to get the value of the first field, i.e. country, and put those records into a particular directory. US records go to the US directory, UK records go to the UK directory, and so on. The flow that I have right now is: GetFile ----> SplitText (line
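
To make the intended routing concrete, here is a minimal plain-Python sketch (not a NiFi processor configuration) that groups data rows by their first field and skips the header line, which is the line the question title says is being picked up by mistake. The function name `group_rows_by_first_field` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_rows_by_first_field(csv_text):
    """Return {first_field_value: [rows]} for every data row, skipping the header."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # discard the header line so it is never treated as data
    groups = defaultdict(list)
    for row in reader:
        if row:  # ignore blank lines
            groups[row[0]].append(row)
    return dict(groups)


if __name__ == "__main__":
    sample = (
        "country,firstname,lastname,ssn,mob_num\n"
        "US,xxxx,xxxxx,xxxxx,xxxx\n"
        "UK,xxxx,xxxxx,xxxxx,xxxx\n"
        "JP,xxxx,xxxxx,xxxxx,xxxx\n"
    )
    for country, rows in group_rows_by_first_field(sample).items():
        # Each key would correspond to one target directory.
        print(country, len(rows))
```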

Is it possible to remove white spaces from the CSV files header name in NiFi?

一曲冷凌霜 posted on 2019-12-05 08:26:21
I have a CSV file in which some column names contain white space and others do not. I want to remove the white space from every header name that contains it. Please help. Thank you! Attaching screenshot for reference. Example: 'First Name' should become 'FirstName'. I am using the ReplaceText processor, in which I have passed \s as the Search Value, to match only the white space in the header row, and an empty string as the Replacement Value. My evaluation mode is 'Line-by-Line'. The output file now shows FirstName,LastNameshraddha
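
The merged output above is probably explained by `\s` also matching the line break at the end of the header. A small plain-Python sketch of the intended transformation, assuming only spaces and tabs should be removed and only from the first line, might look like this (the helper name and the sample data row are made up for illustration):

```python
import re


def strip_header_whitespace(csv_text):
    """Remove spaces and tabs from the header line, leaving data rows unchanged."""
    header, newline, rest = csv_text.partition("\n")
    # [ \t]+ deliberately excludes \r and \n, so the line break survives.
    return re.sub(r"[ \t]+", "", header) + newline + rest


if __name__ == "__main__":
    sample = "First Name,Last Name\nshraddha,some value\n"
    print(strip_header_whitespace(sample))
```

Restricting the character class to `[ \t]` instead of `\s` is the design choice that keeps the header and the first data row on separate lines.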

Post a NIFI template via REST?

梦想与她 posted on 2019-12-05 07:22:36
I have multiple NiFi servers that I would like to be able to POST templates to via the REST interface from a script. The "/controller/templates" endpoint appears to be the proper REST endpoint to support POSTing an arbitrary template to my NiFi installation. The "snippetId" field is what is confusing me: how do I determine "The id of the snippet whose contents will comprise the template"? Does anyone have an example of how I can upload a template "test.xml" to my server without having to use the UI? The provided documentation is somewhat confusing, and the solution I worked out was derived from

Custom processor + DBCPConnectionPool for SQL Server : driver jar not loaded

怎甘沉沦 posted on 2019-12-04 17:05:01
I have created a controller service to connect to a test db. I have a custom processor that reads data from SQL Server; the mock tests, the build, and the deployment to NiFi all succeed. The processor runs into an error. Maybe the nar dependency scope is at fault, but I am unsure. The pom files for the processor and the nar projects are as follows. Processor pom.xml: <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

nifi ConvertRecord JSON to CSV getting only single record?

删除回忆录丶 posted on 2019-12-04 14:50:50
Question: I have the flow below set up to read JSON data and convert it to CSV using the ConvertRecord processor. However, the output flowfile is populated with only a single record (I am assuming only the first record) instead of all the records. Can someone help provide the correct configuration? Source JSON data:

    {"creation_Date": "2018-08-19", "Hour_of_day": 7, "log_count": 2136}
    {"creation_Date": "2018-08-19", "Hour_of_day": 17, "log_count": 606}
    {"creation_Date": "2018-08-19", "Hour_of_day": 14
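
Note that the source data is one JSON object per line rather than a single JSON array. The following plain-Python sketch is not how ConvertRecord works internally; it only illustrates the expected result when every line is parsed as its own record. The function name `json_lines_to_csv` is hypothetical, and the column order is assumed to follow the first record.

```python
import csv
import io
import json


def json_lines_to_csv(text):
    """Parse one JSON object per line and render all of them as CSV."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    if not records:
        return ""
    out = io.StringIO()
    # Column names are taken from the first record (an assumption of this sketch).
    writer = csv.DictWriter(
        out, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        '{"creation_Date": "2018-08-19", "Hour_of_day": 7, "log_count": 2136}\n'
        '{"creation_Date": "2018-08-19", "Hour_of_day": 17, "log_count": 606}\n'
    )
    print(json_lines_to_csv(sample))
```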

Apache NiFi - OutOfMemory Error: GC overhead limit exceeded on SplitText processor

妖精的绣舞 posted on 2019-12-04 10:24:22
I am trying to use NiFi to process large CSV files (potentially billions of records each) using HDF 1.2. I've implemented my flow, and everything works fine for small files. The problem is that if I push the file size to 100MB (1M records), I get a java.lang.OutOfMemoryError: GC overhead limit exceeded from the SplitText processor responsible for splitting the file into single records. I've searched for that, and it basically means that the garbage collector runs for too long without reclaiming much heap space. I expect this means that too many flow files are being generated

How to update line with modified data in Jython?

蓝咒 posted on 2019-12-04 05:05:33
Question: I have a CSV file that contains hundreds of thousands of rows. Below are some sample lines:

    1,Ni,23,28-02-2015 12:22:33.2212-02
    2,Fi,21,28-02-2015 12:22:34.3212-02
    3,Us,33,30-03-2015 12:23:35-01
    4,Uk,34,31-03-2015 12:24:36.332211-02

The last column of the CSV data is in the wrong datetime format, so I need to convert it to the default datetime format ("YYYY-MM-DD hh:mm:ss[.nnn]"). I have tried the following script to get lines from it and write into flow file.
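
Independently of the NiFi scripting API, the conversion itself can be sketched in plain Python. This is a minimal sketch under several assumptions: the input is `DD-MM-YYYY hh:mm:ss` with an optional fraction, the trailing `-02` or `-01` is a UTC offset that is simply discarded, and the fraction is truncated (not rounded) to three digits. The name `reformat_timestamp` is hypothetical.

```python
import re

# DD-MM-YYYY hh:mm:ss, optional .fraction, optional two-digit signed offset.
_PATTERN = re.compile(
    r"^(\d{2})-(\d{2})-(\d{4}) (\d{2}:\d{2}:\d{2})(?:\.(\d+))?(?:[+-]\d{2})?$"
)


def reformat_timestamp(value):
    """Convert 'DD-MM-YYYY hh:mm:ss[.f...][±hh]' to 'YYYY-MM-DD hh:mm:ss[.nnn]'."""
    match = _PATTERN.match(value.strip())
    if match is None:
        raise ValueError("unrecognised timestamp: {!r}".format(value))
    day, month, year, clock, fraction = match.groups()
    result = "{}-{}-{} {}".format(year, month, day, clock)
    if fraction:
        # Keep at most three fractional digits, padding short values with zeros.
        result += "." + fraction[:3].ljust(3, "0")
    return result


if __name__ == "__main__":
    line = "1,Ni,23,28-02-2015 12:22:33.2212-02"
    fields = line.split(",")
    fields[-1] = reformat_timestamp(fields[-1])
    print(",".join(fields))
```

If the offset must be preserved or applied rather than dropped, the last optional group would need to be captured and handled explicitly.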

Apache NiFi ExecuteScript: Groovy script to replace Json values via a mapping file

冷暖自知 posted on 2019-12-04 02:18:35
I am working with Apache NiFi 0.5.1 on a Groovy script to replace incoming Json values with the ones contained in a mapping file. The mapping file looks like this (it is a simple .txt):

    Header1;Header2;Header3
    A;some text;A2

I have started with the following:

    import groovy.json.JsonBuilder
    import groovy.json.JsonSlurper
    import java.nio.charset.StandardCharsets

    def flowFile = session.get();
    if (flowFile == null) {
        return;
    }

    flowFile = session.write(flowFile, { inputStream, outputStream ->
        def content = """ { "field1": "A" "field2": "A", "field3": "A" }"""
        def slurped = new JsonSlurper()

Programmatically provide NiFi InvokeHTTP different certificates

北城余情 posted on 2019-12-04 02:09:32
Question: I have a requirement in NiFi where I have to cycle through different HTTPS REST endpoints, providing different certificates for some endpoints and different usernames and passwords for others. I used the InvokeHTTP processor to send the requests. Although the URL accepts Expression Language, I cannot set up the SSLContextService with an expression. Alternatively, I thought of using ExecuteScript to call those endpoints; however, as listed here in StackOverflow post; I still don't know how to