apache-nifi

How to split this CSV file into multiple contents?

浪尽此生 submitted on 2019-12-06 15:24:40
Question: I have a CSV file (Input.csv) with the following contents:

    Sample NiFi Data demonstration for below Due dates 20-02-2017,23-03-2017
    My Input No1 inside csv,,,,,,
    Animals,Today-20.02.2017,Yesterday-19-02.2017
    Fox,21,32
    Lion,20,12
    My Input No2 inside csv,,,,
    Name,ID,City
    Mahi,12,UK
    And,21,US
    Prabh,32,LI

I need to split this whole CSV (Input.csv) into two parts, InputNo1.csv and InputNo2.csv. InputNo1.csv should contain only the following:

    Animals,Today-20.02.2017,Yesterday-19-02.2017
    Fox,21…
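The excerpt cuts off before any answer, but the splitting logic itself (start a new section at each "My Input No…" marker row) can be sketched outside NiFi, for instance as a script one might run from an ExecuteScript processor. The function and section names below are illustrative assumptions, not taken from the original answer:

```python
def split_sections(lines, marker="My Input No"):
    """Group CSV lines into named sections; a new section starts at each marker line.
    Lines before the first marker (e.g. the 'Sample NiFi Data...' banner) are dropped."""
    sections = {}
    current = None
    for line in lines:
        if line.startswith(marker):
            number = line[len(marker):].split()[0]  # "1" from "My Input No1 inside csv,,,,,,"
            current = "InputNo" + number
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections

sample = [
    "Sample NiFi Data demonstration for below Due dates 20-02-2017,23-03-2017",
    "My Input No1 inside csv,,,,,,",
    "Animals,Today-20.02.2017,Yesterday-19-02.2017",
    "Fox,21,32",
    "Lion,20,12",
    "My Input No2 inside csv,,,,",
    "Name,ID,City",
    "Mahi,12,UK",
]
parts = split_sections(sample)
# parts["InputNo1"] holds the Animals rows; parts["InputNo2"] the Name/ID/City rows
```

Each resulting section could then be written out as its own flowfile (InputNo1.csv, InputNo2.csv).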

How to get an ISO string in the NiFi GetMongo Query field

十年热恋 submitted on 2019-12-06 10:39:56
I'm trying to use the Expression Language to generate an ISO string in the NiFi GetMongo Query field with the following query:

    { "remindmeDate": { "$gte": "${now():format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",'GMT')}", "$lte": "${now():toNumber():plus(359999):format("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",'GMT')}" } }

But I'm getting an invalid-JSON error because the double quotes are not escaped. When I try to escape them with the \ character, NiFi does not evaluate the Expression Language. Is there a method or workaround to get this working? Thanks in advance.

Answer (excerpt): The GetMongo processor of NiFi requires your query to be in extended JSON…
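For reference, the two timestamps the expression language is meant to produce ($gte = now, $lte = now + 359999 ms, both formatted in GMT with a literal 'Z' suffix) can be reproduced outside NiFi; this sketch only mirrors the formatting logic, it does not address the quote-escaping issue:

```python
from datetime import datetime, timedelta, timezone

def iso_window(now=None, window_ms=359999):
    """Return (gte, lte) ISO-8601 strings mirroring the query above:
    gte = now(), lte = now() + 359999 ms (just under 6 minutes), millisecond precision."""
    now = now or datetime.now(timezone.utc)
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    gte = now.strftime(fmt)[:-3] + "Z"  # trim microseconds down to milliseconds
    lte = (now + timedelta(milliseconds=window_ms)).strftime(fmt)[:-3] + "Z"
    return gte, lte

gte, lte = iso_window(datetime(2019, 12, 6, 10, 0, 0, tzinfo=timezone.utc))
# gte == "2019-12-06T10:00:00.000Z", lte == "2019-12-06T10:05:59.999Z"
```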

Apache NiFi - OutOfMemoryError: GC overhead limit exceeded in the SplitText processor

旧巷老猫 submitted on 2019-12-06 06:49:47
Question: I am trying to use NiFi to process large CSV files (potentially billions of records each) using HDF 1.2. I've implemented my flow, and everything works fine for small files. The problem is that if I push the file size to 100 MB (1M records), I get a java.lang.OutOfMemoryError: GC overhead limit exceeded from the SplitText processor responsible for splitting the file into single records. I've searched for that, and it basically means that the garbage collector is executed for too long…
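The excerpt stops before any answer, but the fix commonly recommended for this error is two-stage splitting: a first SplitText that splits into chunks of, say, 10,000 lines, followed by a second SplitText that splits each chunk into single lines, so no single split operation holds millions of flowfiles in memory at once. The bounded-memory chunking idea behind the first stage can be sketched as:

```python
def split_in_chunks(lines, chunk_size=10000):
    """First-stage split: yield chunks of at most chunk_size lines, streaming,
    so peak memory is bounded by one chunk rather than the whole file."""
    chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly short, chunk
        yield chunk

# 25 rows with chunk_size=10 -> three chunks of sizes 10, 10, 5
chunks = list(split_in_chunks((f"row{i}" for i in range(25)), chunk_size=10))
```

The chunk_size of 10,000 is a typical starting point, not a value from the original question.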

NiFi: encrypt JSON

主宰稳场 submitted on 2019-12-06 06:00:59
I would like to use NiFi to encrypt the attribute values in a JSON document, but not the keys, because I want to upload the data to a MongoDB server. Is there a way to do this? For the project I am using Twitter data as a proof of concept. So far I have used the EvaluateJsonPath processor to extract only the text of the tweet, and I can encrypt this text; however, the resulting JSON no longer has a key. Can NiFi recreate a JSON document that attaches a key to the attribute I extracted? Is there a better way to do this?

Answer (excerpt): Unfortunately, this workflow isn't well supported by existing Apache NiFi processors. You could…
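One workaround (an assumption on my part, not part of the quoted answer) is a script-based step, e.g. via ExecuteScript, that walks the JSON and transforms only the values while leaving every key intact. Here base64 is merely a stand-in for a real cipher:

```python
import base64, json

def encrypt_values(obj, enc=lambda s: base64.b64encode(s.encode()).decode()):
    """Recursively transform string values while leaving keys untouched.
    base64 is NOT encryption; swap enc() for a real cipher in practice."""
    if isinstance(obj, dict):
        return {k: encrypt_values(v, enc) for k, v in obj.items()}
    if isinstance(obj, list):
        return [encrypt_values(v, enc) for v in obj]
    if isinstance(obj, str):
        return enc(obj)
    return obj  # numbers, booleans, None pass through unchanged

tweet = {"user": "mahi", "text": "hello world", "retweets": 3}
out = encrypt_values(tweet)
# all three keys survive intact; only the string values are transformed
```

This keeps the document shape MongoDB expects, since only values change.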

Automating NiFi template deployment

∥☆過路亽.° submitted on 2019-12-06 05:16:23
Question: I'm new to NiFi and am trying to understand (since it looks very GUI-based) whether there is a way to automate scaling up and down in NiFi, and how one would take a NiFi XML template and deploy it to a cluster. Essentially, what we're trying to do is use NiFi to collect JMX metrics and log files off Kafka servers as they come up, in an automated fashion, so that logging and JMX counters start flowing to, let's say, an Elasticsearch cluster. For example, right now we've automated deployment of the Kafka servers…
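Everything the NiFi GUI does goes through NiFi's REST API, which is the usual route for this kind of automation. A minimal sketch of building the template-instantiation request (the IDs are placeholders; in practice they come from earlier calls such as GET /nifi-api/process-groups/root and the template upload endpoint):

```python
import json

def instantiate_template_request(pg_id, template_id, x=0.0, y=0.0):
    """Build the (path, JSON body) pair for NiFi's template-instantiation call:
    POST /nifi-api/process-groups/{pg_id}/template-instance.
    originX/originY place the instantiated flow on the canvas."""
    path = f"/nifi-api/process-groups/{pg_id}/template-instance"
    body = {"templateId": template_id, "originX": x, "originY": y}
    return path, json.dumps(body)

# "root-pg-id" and "tmpl-1234" are hypothetical IDs for illustration
path, body = instantiate_template_request("root-pg-id", "tmpl-1234")
```

Sending the request (with any HTTP client) against each new node is then scriptable from the same deployment tooling that brings up the Kafka servers.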

How to map column-wise data in a flowfile in NiFi?

拜拜、爱过 submitted on 2019-12-06 03:30:45
I have a CSV file with the following structure:

    Alfreds,Centro,Ernst,Island,Bacchus
    Germany,Mexico,Austria,UK,Canada
    01,02,03,04,05

Now I have to move that data into a database like this:

    Name,City,ID
    Alfreds,Germany,01
    Centro,Mexico,02
    Ernst,Austria,03
    Island,UK,04
    Bacchus,Canada,05

I tried to map those columns, but I am not able to extract the data column-wise. My input data is arranged column-wise, but I need to insert it row-wise into SQL Server. Can anyone suggest a way to transfer column-wise data into row-wise data in SQL Server? Thanks.

Answer (excerpt, Andy): There is no existing Apache NiFi processor to perform…
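The transformation being asked for is a transpose: each input row becomes an output column. Outside NiFi (for instance in an ExecuteScript step), this is a one-line zip over the parsed rows; this sketch is illustrative, not the answer's actual solution:

```python
import csv, io

raw = ("Alfreds,Centro,Ernst,Island,Bacchus\n"
       "Germany,Mexico,Austria,UK,Canada\n"
       "01,02,03,04,05\n")

rows = list(csv.reader(io.StringIO(raw)))  # 3 rows: names, cities, ids
transposed = list(zip(*rows))              # 5 tuples: (name, city, id)

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Name", "City", "ID"])    # target header from the question
writer.writerows(transposed)
# each output line is now one (Name, City, ID) record, ready for a row-wise INSERT
```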

DBCPConnectionPool controller service for SQL Server, jdbc exception

青春壹個敷衍的年華 submitted on 2019-12-05 17:43:57
NiFi 1.1.1, tested on both Windows 7 and RHEL 7. The background thread is here. I have created a DBCPConnectionPool controller service pointing to a SQL Server database, and I am able to fetch data from a table and write it to the local disk (ExecuteSQL -> ConvertAvroToJSON -> PutFile). My code:

    public byte[] getMaxLSN(Connection connection, String containerDB) {
        String dbMaxLSN = "{? = CALL sys.fn_cdc_get_max_lsn()}";
        byte[] maxLSN = null;
        try (final CallableStatement cstmt = connection.prepareCall(dbMaxLSN)) {
            cstmt.registerOutParameter(1, java.sql.JDBCType.BINARY);
            cstmt.execute();
            if (cstmt.getBytes…

Configuring an HTTP POST request from NiFi

懵懂的女人 submitted on 2019-12-05 17:06:36
I am trying to access a WCF service from a REST client by sending it a POST request. For your reference, the details are as follows. The service contract definition:

    [ServiceContract]
    public interface IBZTsoftsensor_WcfService {
        [OperationContract]
        [WebInvoke(Method = "POST",
            RequestFormat = WebMessageFormat.Json,
            ResponseFormat = WebMessageFormat.Json,
            BodyStyle = WebMessageBodyStyle.Wrapped,
            UriTemplate = "/data")]
        string ExecuteModelJson(string inputModel);
    }

And the implementation of this interface is as follows:

    public string…
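One detail worth noting from the contract above: because BodyStyle = WebMessageBodyStyle.Wrapped, WCF expects the JSON POST body to wrap the argument under the operation's parameter name ("inputModel"). A small sketch of building that body, which a NiFi processor such as InvokeHTTP would then send to the /data endpoint (the inner payload is a hypothetical example):

```python
import json

def wrapped_wcf_body(input_model):
    """WCF's Wrapped body style expects {"<paramName>": <value>}; here the
    parameter name 'inputModel' comes from the contract quoted above."""
    return json.dumps({"inputModel": input_model})

# the inner model is itself a JSON string, so it arrives escaped inside the wrapper
body = wrapped_wcf_body('{"temperature": 21.5}')
```

Sending an unwrapped body (just the raw inputModel string) is a common cause of null parameters on the WCF side with this body style.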

Apache Nifi/Cassandra - how to load CSV into Cassandra table

坚强是说给别人听的谎言 submitted on 2019-12-05 14:44:29
I have various CSV files coming in several times per day, storing time-series data from sensors that are part of sensor stations. Each CSV is named after the sensor station and sensor ID it comes from, for instance "station1_sensor2.csv". At the moment, the data is stored like this:

    > cat station1_sensor2.csv
    2016-05-04 03:02:01.001000+0000;0;
    2016-05-04 03:02:01.002000+0000;0.1234;
    2016-05-04 03:02:01.003000+0000;0.2345;

I have created a Cassandra table to store the data and to be able to query it for various identified tasks. The Cassandra table looks like this:

    cqlsh > CREATE…
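Whatever the flow looks like, a pre-processing step has to recover the station and sensor IDs from the filename and split the ';'-delimited rows before any Cassandra INSERT can be issued. A sketch of that parsing (the CREATE TABLE statement is cut off above, so the column order produced here is an assumption):

```python
import re

def parse_rows(filename, text):
    """Extract (station, sensor) from names like 'station1_sensor2.csv' and
    split each ';'-delimited line into a (station, sensor, timestamp, value) tuple."""
    m = re.match(r"(station\d+)_(sensor\d+)\.csv$", filename)
    station, sensor = m.group(1), m.group(2)
    rows = []
    for line in text.strip().splitlines():
        ts, value, _ = line.split(";")  # lines end with a trailing ';'
        rows.append((station, sensor, ts, float(value)))
    return rows

rows = parse_rows("station1_sensor2.csv",
                  "2016-05-04 03:02:01.001000+0000;0;\n"
                  "2016-05-04 03:02:01.002000+0000;0.1234;\n")
```

Each tuple then maps onto a parameterized INSERT, whether issued from a script or a NiFi processor such as PutCassandraQL.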

How to specify priority attributes for individual flowfiles?

孤街浪徒 submitted on 2019-12-05 11:49:36
I need to use the PriorityAttributePrioritizer in NiFi. I have seen the available prioritizers in the reference below:

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings

If I receive 10 flowfiles, I need to set a unique priority value on each flowfile. After that, the queue configuration must be set to the PriorityAttributePrioritizer, so that flowfiles are processed based on their priority value. How can I set a priority value on individual flowfiles, and which NiFi prioritizer works for my case?

Answer (excerpt): If the files are named after the time they were generated (e.g. file_2017-03…
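Building on the answer's filename hint, one common pattern (an assumption here, since the answer is truncated) is to derive the priority attribute from the timestamp embedded in the filename, e.g. with UpdateAttribute, so that PriorityAttributePrioritizer hands out the oldest files first. The derivation logic can be sketched as:

```python
import re

def priority_from_filename(filename):
    """Derive a sortable 'priority' value from names like 'file_2017-03-28.csv'
    (the exact name pattern is an assumption). Zero-padded date strings compare
    correctly, so the smallest value -- the oldest file -- sorts first."""
    m = re.search(r"(\d{4}-\d{2}-\d{2})", filename)
    return m.group(1) if m else "9999"  # no timestamp -> sort last

names = ["file_2017-03-28.csv", "file_2017-03-27.csv"]
ordered = sorted(names, key=priority_from_filename)
# oldest file comes first
```

In the flow itself, the equivalent would be an UpdateAttribute rule setting `priority` from the filename, with the downstream connection's prioritizer set to PriorityAttributePrioritizer.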