How to run a streaming query on updated lines in a CSV file?

甜味超标 2020-12-11 23:19

I have one CSV file in a folder that keeps being updated. I need to take input from this CSV file and produce some transactions. How can I take data from the c

2 answers
  •  粉色の甜心
    2020-12-11 23:53

    I have one CSV file in one folder location that keeps being updated. I need to take input from this CSV file and produce some transactions. How can I take data from a CSV file that keeps being updated, let's say every 5 minutes?

    tl;dr It won't work.

    By default, the Spark Structured Streaming file source monitors a directory and triggers a computation for every new file. Once a file has been processed, it is never processed again, so rows appended to that file later are not picked up. That's the default implementation.

    You could write your own streaming source that monitors a file for changes, but that means developing a custom source, which is doable but in most cases not worth the effort.
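
    Short of a custom source, one possible workaround is to let a small helper outside Spark watch the file and copy newly appended rows into a fresh file in a separate directory each time, so that a directory-monitoring query sees every batch as a new file. The following is a minimal sketch in plain Python, not a Spark API: the function names, the `batch-NNNNNN.csv` naming, and the byte-offset bookkeeping are all hypothetical. It assumes the CSV is only ever appended to, is UTF-8, has no header row, and has no line breaks inside quoted fields.

```python
import os
import tempfile
from pathlib import Path


def read_appended_lines(csv_path, offset):
    """Return (complete lines appended after byte `offset`, new offset)."""
    with open(csv_path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so a half-written row
    # is left in place for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8").splitlines()
    return lines, offset + end


def export_new_rows(csv_path, out_dir, offset, batch_id):
    """Write rows appended since `offset` to a new file in `out_dir`.

    Returns the new offset. No file is written when nothing was appended.
    (Hypothetical helper: names and file layout are assumptions.)
    """
    lines, new_offset = read_appended_lines(csv_path, offset)
    if lines:
        out_dir = Path(out_dir)
        out_dir.mkdir(parents=True, exist_ok=True)
        final = out_dir / f"batch-{batch_id:06d}.csv"
        tmp = out_dir / f".batch-{batch_id:06d}.csv.tmp"
        tmp.write_text("\n".join(lines) + "\n", encoding="utf-8")
        # Rename into place so a reader does not see a partially written batch.
        # Assumption to verify: the reader skips the dot-prefixed temp name.
        os.replace(tmp, final)
    return new_offset


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = Path(d) / "input.csv"
        out = Path(d) / "incoming"
        src.write_text("1,alice\n2,bob\n", encoding="utf-8")
        offset = export_new_rows(src, out, 0, batch_id=0)
        with open(src, "a", encoding="utf-8") as f:
            f.write("3,carol\n")
        offset = export_new_rows(src, out, offset, batch_id=1)
        print(sorted(p.name for p in out.iterdir()))
```

    In practice `export_new_rows` would be called on a schedule (for example every 5 minutes) with the offset and batch counter persisted between runs, and the streaming query would read the output directory instead of the original file. A truncated or rewritten source file would invalidate the stored offset, which this sketch does not handle.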
