How to extract a subset from a CSV file using NiFi

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-02 06:54:19

Question


I have a CSV file with 100+ columns, and I want to extract only 60 specific columns as a subset (both the column names and their values). I know we can use the ExtractText processor. Can anyone tell me what regular expression to write? For example, from the given snapshot I only want NiFi to extract the 'BMS_sw_micro', 'BMU_Dbc_Dbg_Micro', and 'BMU_Dbc_Fia_Micro' columns, i.e. only columns F, L, and O.

Any help is much appreciated!


Answer 1:


As I said in the comment, you can count the number of commas before the text you want to match and use that count in the regex, like this:

/(?<=^([^,]+?,){5})[^,]+/

The regex starts at the left of the string and counts commas, then matches the text between the next two commas.

The number in the curly braces defines which column to match (how many commas to skip). With 5, the regex skips five fields and matches the sixth, which is column F.
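To make the comma-counting idea concrete, here is a minimal sketch in Python. It is not the exact pattern from this answer: Python's `re` module does not accept variable-width lookbehind, so the sketch swaps the lookbehind for a capturing group. It also uses `[^,]*` instead of `[^,]+` so that empty fields are counted. The function name `extract_column` and the sample line are hypothetical, and the sketch assumes fields never contain quoted commas.

```python
import re
from typing import Optional


def extract_column(line: str, skip: int) -> Optional[str]:
    """Return the field that follows `skip` commas, or None if the line is too short.

    Assumption: plain comma-separated text with no quoted fields containing commas.
    """
    # Consume `skip` comma-terminated fields from the start of the line,
    # then capture the next field (which may be empty).
    pattern = re.compile(r"^(?:[^,]*,){%d}([^,]*)" % skip)
    match = pattern.match(line)
    return match.group(1) if match else None


# Hypothetical sample line: skipping 5 commas lands on the sixth field.
sample = "a,b,c,d,e,f,g"
print(extract_column(sample, 5))
```

Changing the `skip` argument plays the same role as changing the number in the curly braces.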

Run the regex once for every column you want, specifying the column number each time.
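As a rough illustration of running one pattern per column, the Python sketch below converts spreadsheet letters such as F, L, and O into comma counts and applies a capture-group pattern for each. The helper names `letter_to_skip` and `extract_columns` and the generated header line are hypothetical. It uses a capturing group rather than a lookbehind and assumes no quoted fields contain commas.

```python
import re
from typing import List, Optional


def letter_to_skip(letter: str) -> int:
    """Convert a spreadsheet column letter (A, B, ..., Z, AA, ...) to a comma count."""
    index = 0
    for ch in letter.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    # Column A is the first field, so zero commas are skipped before it.
    return index - 1


def extract_columns(line: str, letters: List[str]) -> List[Optional[str]]:
    """Apply one regex per requested column and collect the captured fields."""
    values = []
    for letter in letters:
        skip = letter_to_skip(letter)
        match = re.match(r"^(?:[^,]*,){%d}([^,]*)" % skip, line)
        values.append(match.group(1) if match else None)
    return values


# Hypothetical 16-column header line: col1,col2,...,col16
header = ",".join("col%d" % i for i in range(1, 17))
print(extract_columns(header, ["F", "L", "O"]))
```

For 60 columns, this means 60 patterns that differ only in the count, so it may be worth generating them rather than writing each by hand.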




Answer 2:


See my answer to your related SO question about selecting CSV columns.



Source: https://stackoverflow.com/questions/52337739/how-to-extract-a-subset-from-a-csv-file-using-nifi
