Using Apache NiFi to write CSV files by contents of column

Posted by 狂风中的少年 on 2019-12-11 15:49:20

Question


I have an Apache NiFi flow, where I read in a massive .csv file. Here's a sample .csv:

school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx

I want to use NiFi to read the file and then write a separate .csv file for each school name. That is, there would be one .csv file with the two Vanderbilt records (two lines total, because there are two records), one file for Georgetown, and one file for Duke.

I've used GetFile to read in my file (works, verified), then SplitText (Line Split Count = 1, Header Line Count = 1), and then ExtractText, but my configuration for that one is very wrong. Lastly, I have PutFile, which writes the output where I need it to go. Thanks.


Answer 1:


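Take a look at NiFi's record-processing capabilities. You will want to use PartitionRecord to partition on the school field, which will produce exactly what you are describing: one FlowFile per distinct school, containing all of that school's records. PartitionRecord takes a Record Reader and a Record Writer (here, a CSV reader and writer) plus a user-defined property whose value is a RecordPath such as /school. The matching value is also added to each outgoing FlowFile as an attribute named after that property, which you can use to set the filename before PutFile.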

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.7.1/org.apache.nifi.processors.standard.PartitionRecord/index.html
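
To make the expected result concrete, here is a minimal Python sketch of the same partition-by-column idea outside NiFi. It is not NiFi code and not how PartitionRecord is implemented; the function names partition_csv and write_partitions are made up for illustration, and it assumes each output file should repeat the header row (drop the writeheader() call if you want data lines only).

```python
import csv
import io
import tempfile
from collections import defaultdict
from pathlib import Path

# Sample data copied from the question.
SAMPLE = """school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx
"""


def partition_csv(text, column):
    """Group CSV rows by the value of one column.

    Returns a dict mapping each distinct value to CSV text
    (header line followed by that value's rows).
    """
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)

    partitions = {}
    for key, rows in groups.items():
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                                lineterminator="\n")
        writer.writeheader()  # assumption: every output file keeps the header
        writer.writerows(rows)
        partitions[key] = out.getvalue()
    return partitions


def write_partitions(partitions, directory):
    """Write each partition to <directory>/<key>.csv and return the paths."""
    paths = []
    for key, csv_text in partitions.items():
        path = Path(directory) / f"{key}.csv"
        with open(path, "w", newline="") as handle:
            handle.write(csv_text)
        paths.append(path)
    return paths


if __name__ == "__main__":
    parts = partition_csv(SAMPLE, "school")
    # A temporary directory stands in for the PutFile target directory.
    with tempfile.TemporaryDirectory() as target:
        for path in write_partitions(parts, target):
            print(path.name)
```

In the NiFi flow, the grouping step corresponds to PartitionRecord and the file-naming step corresponds to setting the filename attribute from the partition value before PutFile.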



Source: https://stackoverflow.com/questions/52611042/using-apache-nifi-to-write-csv-files-by-contents-of-column
