How to continuously update target file using Luigi?

Posted by 女生的网名这么多〃 on 2021-02-08 19:54:19

Question


I have recently started playing around with Luigi, and I would like to find out how to use it to continuously append new data into an existing target file.

Imagine I am pinging an API every minute to retrieve new data. Because a Task only runs if its Target is not already present, a naive approach would be to parameterize the output file by the current datetime. Here's a bare-bones example:

import luigi
import datetime

class data_download(luigi.Task):
    # DateMinuteParameter keeps the hour and minute that the file name below relies on
    date = luigi.DateMinuteParameter(default=datetime.datetime.now())

    def requires(self):
        return []

    def output(self):
        return luigi.LocalTarget("data_test_%s.json" % self.date.strftime("%Y-%m-%d_%H:%M"))

    def run(self):
        data = download_data()  # placeholder for the actual API call
        with self.output().open('w') as out_file:
            out_file.write(data + '\n')

if __name__ == '__main__':
    luigi.run()

If I schedule this task to run every minute, it will execute because the target file for the current time does not exist yet. But it creates 60 files an hour. What I'd like to do instead is make sure that all the new data eventually ends up in the same file. What would be a scalable approach to accomplish that? Any ideas or suggestions are welcome!


Answer 1:


You cannot. As the doc for LocalTarget says:

Parameters: mode (str) – the mode r opens the FileSystemTarget in read-only mode, whereas w will open the FileSystemTarget in write mode. Subclasses can implement additional options.

That is, only the r and w modes are allowed. Additional modes such as a (append) require extending the LocalTarget class, even though doing so breaks the idempotency that Luigi expects of task executions.




Answer 2:


def output(self):
    return luigi.LocalTarget("data_test_%s.json" % self.date.strftime("%Y-%m-%d_%H:%M"))

It's not the 'luigi way', but it does the job. In the end those targets are just file objects.



Source: https://stackoverflow.com/questions/42960735/how-to-continously-update-target-file-using-luigi
