How do I keep more than 5 day's worth of query logs?

问题

In Redshift, there's an STL_QUERY table that stores queries that were run over the last 5 days. I'm trying to find a way to keep more than 5 days worth of records. Here are some things that I've considered:

Is there a Redshift setting for this? It would appear not.
Could I use a trigger? Triggers are not available in Redshift, so this is a no-go.
Could I create an Amazon Data Pipeline job to periodically "scrape" the STL_QUERY table? I could, so this is an option. Unfortunately, I would have to give the pipeline some EC2 instance to use to run this work. It seems like a waste to have an instance sitting around to scrape this table once a day.
Could I use an Amazon Simple Work Flow job to scrape the table? I could, but it suffers from the same issues as 3.

Are there any other options/ideas that I'm missing? I would prefer some other option that does not involve me dedicating an EC2 instance, even if it means paying for an additional service (provided that it's cheaper than the EC2 instance I would have used in it's stead).

回答1:

Keep it simple, do it all in Redshift.

First, use "CREATE TABLE … AS" to save all current history into a permanent table.

CREATE TABLE admin.query_history AS SELECT * FROM stl_query;

Second, using psql to run it, schedule a job on a machine you control to run this every day.

INSERT INTO admin.query_history SELECT * FROM stl_query WHERE query > (SELECT MAX(query) FROM admin.query_history);

Done. :)

Notes:

You need an 8.x version of psql if you haven't set this up yet.
Even if your job doesn't run for a few days stl_query keeps enough history that you'll be covered.
As per your comment, it might be safer to use starttime instead of query as the criteria.

来源：https://stackoverflow.com/questions/19844745/how-do-i-keep-more-than-5-days-worth-of-query-logs

标签

amazon-web-services

amazon-redshift