Recurrent machine learning ETL using Luigi
Question

Today, I run the machine learning job I've written by hand: I download the required input files, train the model and make predictions, and write a .csv file, which I then copy into a database. Since this is going into production, I need to automate the whole process. The input files will arrive from the provider in an S3 bucket every month (and eventually more frequently). I'm planning to use Luigi to solve this problem. Here is the ideal process: Every week (or day, or hour,