Question
I have an Apache NiFi flow in which I read in a massive .csv file. Here's a sample .csv:
school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx
I want to use NiFi to read the file and then output a separate .csv file for each school name. That is, there would be one .csv file containing the two Vanderbilt records (two lines total, because there are two records), one file for Georgetown, and one file for Duke.
I've used GetFile to pull in my file (this works, verified), then SplitText (line split count = 1, header line count = 1), and then ExtractText, but my config for that last one is very wrong. Lastly, I have PutFile, which writes to where I need it to go. Thanks.
Answer 1:
Take a look at NiFi's record processing capabilities. You will want to use PartitionRecord to partition on the school field, which will produce exactly what you are describing.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.7.1/org.apache.nifi.processors.standard.PartitionRecord/index.html
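To make the idea of partitioning on the school field concrete, here is a minimal Python sketch of the grouping. It is not NiFi code and not how the processor is implemented; the function name partition_csv_by_column is hypothetical, and the sample data is the one from the question. In NiFi itself this would be configured on the processor, presumably with a record reader, a record writer, and a user-defined property holding a RecordPath such as /school; treat those settings as an assumption and check them against the linked documentation.

```python
import csv
import io
from collections import defaultdict


def partition_csv_by_column(csv_text, column):
    """Group the rows of csv_text by the value found in `column`.

    Returns a dict mapping each distinct value to CSV text made of the
    header line plus every row carrying that value, in input order.
    (Hypothetical helper, for illustration only.)
    """
    # skipinitialspace handles the "a, b, c" spacing used in the sample
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    fieldnames = reader.fieldnames

    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)

    outputs = {}
    for key, rows in groups.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        outputs[key] = buffer.getvalue()
    return outputs


sample = """school, date, city
Vanderbilt, xxxx, xxxx
Georgetown, xxxx, xxxx
Duke, xxxx, xxxx
Vanderbilt, xxxx, xxxx
"""

partitions = partition_csv_by_column(sample, "school")
for school, text in partitions.items():
    # In a NiFi flow each partition would be a separate FlowFile;
    # here we only print what would end up in "<school>.csv".
    print(f"--- {school}.csv ---")
    print(text, end="")
```

Each group keeps the header, so the Vanderbilt output holds the header plus two records, while Georgetown and Duke hold one record each. Unlike the SplitText approach in the question, nothing is split line by line first; rows are grouped in a single pass.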
Source: https://stackoverflow.com/questions/52611042/using-apache-nifi-to-write-csv-files-by-contents-of-column