why TimedRotatingFileHandler does not delete old files?

青春壹個敷衍的年華 提交于 2020-01-14 14:12:46

问题


I am using TimedRotatingFileHandler to create my logs. I want my log files to be created every minute, keep at most 2 log files and delete older ones. Here is the sample code:

import logging
import logging.handlers
import datetime

logger = logging.getLogger('MyLogger')
logger.setLevel(logging.DEBUG)

handler = logging.handlers.TimedRotatingFileHandler(
    "logs/{:%H-%M}.log".format(datetime.datetime.now()), 
    when="M", 
    backupCount=2)

logger.addHandler(handler)
logger.debug("PLEASE DELETE PREVIOUS FILES")

If I run this code multiple times (with a minute interval) I get multiple files in my logs directory like so:

21-01.log
21-02.log
21-03.log
...

This seems strange to me, since I set backupCount=2 which indicates that at most 2 files should be saved and older files should be deleted. However when I start my application with 2 files or more in the log folder, old files are not deleted.

Why TimedRotatingFileHandler does not delete old files? Is there any way I can set TimedRotatingFileHandler to delete older files?


回答1:


You can't use TimedRotatingFileHandler, as designed, for your use case. The handler expects the "current" log file name to remain stable and defines rotating as moving existing logging files to a backup by renaming. These are the backups that are kept or deleted. The rotation backups are created from the base filename plus a suffix with the rotation timestamp. So the implementation distinguishes between the log file (stored in baseFilename) and rotation files (generated in the doRotate() method. Note that backups are only deleted when rotation takes place, so after the handler has been in use for at least one full interval.

You instead want the base filename itself to carry the time information, and so are varying the log file name itself. There are no 'backups' in this scenario, you simply open a new file at rotation moments. Moreover, you appear to be running short-lived Python code, so you want older files removed immediately, not just when explicitly rotating, which may never be reached.

This is why TimedRotatingFileHandler won't delete any files, because *it never gets to create backup files. No backups means there is no backups to remove. To rotate files, current implementation of the handler expects to be in charge of filename generation, and can't be expected to know about filenames it would not itself generate. When you configure it with the "M" per-minute rotation frequency, it is configured to rotate files to backup files with the pattern {baseFileame}.{now:%Y-%m-%d_%H_%M}, and so will only ever delete rotated backup files that match that pattern. See the documentation:

The system will save old log files by appending extensions to the filename. The extensions are date-and-time based, using the strftime format %Y-%m-%d_%H-%M-%S or a leading portion thereof, depending on the rollover interval.

Instead, what you want is a base filename that itself carries the timestamp, and that on opening a new log file with a different name that older log files (not backup files) are deleted. For this you'd have to create a custom handler.

Luckily, the class hierarchy is specifically designed for easy customisation. You can subclass BaseRotatingHandler here, and provide your own deletion logic:

import os
import time
from itertools import islice
from logging.handlers import BaseRotatingHandler, TimedRotatingFileHandler

# rotation intervals in seconds
_intervals = {
    "S": 1,
    "M": 60,
    "H": 60 * 60,
    "D": 60 * 60 * 24,
    "MIDNIGHT": 60 * 60 * 24,
    "W": 60 * 60 * 24 * 7,
}

class TimedPatternFileHandler(BaseRotatingHandler):
    """File handler that uses the current time in the log filename.

    The time is quantisized to a configured interval. See
    TimedRotatingFileHandler for the meaning of the when, interval, utc and
    atTime arguments.

    If backupCount is non-zero, then older filenames that match the base
    filename are deleted to only leave the backupCount most recent copies,
    whenever opening a new log file with a different name.

    """

    def __init__(
        self,
        filenamePattern,
        when="h",
        interval=1,
        backupCount=0,
        encoding=None,
        delay=False,
        utc=False,
        atTime=None,
    ):
        self.when = when.upper()
        self.backupCount = backupCount
        self.utc = utc
        self.atTime = atTime
        try:
            key = "W" if self.when.startswith("W") else self.when
            self.interval = _intervals[key]
        except KeyError:
            raise ValueError(
                f"Invalid rollover interval specified: {self.when}"
            ) from None
        if self.when.startswith("W"):
            if len(self.when) != 2:
                raise ValueError(
                    "You must specify a day for weekly rollover from 0 to 6 "
                    f"(0 is Monday): {self.when}"
                )
            if not "0" <= self.when[1] <= "6":
                raise ValueError(
                    f"Invalid day specified for weekly rollover: {self.when}"
                )
            self.dayOfWeek = int(self.when[1])

        self.interval = self.interval * interval
        self.pattern = os.path.abspath(os.fspath(filenamePattern))

        # determine best time to base our rollover times on
        # prefer the creation time of the most recently created log file.
        t = now = time.time()
        entry = next(self._matching_files(), None)
        if entry is not None:
            t = entry.stat().st_ctime
            while t + self.interval < now:
                t += self.interval

        self.rolloverAt = self.computeRollover(t)

        # delete older files on startup and not delaying
        if not delay and backupCount > 0:
            keep = backupCount
            if os.path.exists(self.baseFilename):
                keep += 1
                delete = islice(self._matching_files(), keep, None)
                for entry in delete:
                    os.remove(entry.path)

        # Will set self.baseFilename indirectly, and then may use
        # self.baseFilename to open. So by this point self.rolloverAt and
        # self.interval must be known.
        super().__init__(filenamePattern, "a", encoding, delay)

    @property
    def baseFilename(self):
        """Generate the 'current' filename to open"""
        # use the start of *this* interval, not the next
        t = self.rolloverAt - self.interval
        if self.utc:
            time_tuple = time.gmtime(t)
        else:
            time_tuple = time.localtime(t)
            dst = time.localtime(self.rolloverAt)[-1]
            if dst != time_tuple[-1] and self.interval > 3600:
                # DST switches between t and self.rolloverAt, adjust
                addend = 3600 if dst else -3600
                time_tuple = time.localtime(t + addend)
        return time.strftime(self.pattern, time_tuple)

    @baseFilename.setter
    def baseFilename(self, _):
        # assigned to by FileHandler, just ignore this as we use self.pattern
        # instead
        pass

    def _matching_files(self):
        """Generate DirEntry entries that match the filename pattern.

        The files are ordered by their last modification time, most recent
        files first.

        """
        matches = []
        pattern = self.pattern
        for entry in os.scandir(os.path.dirname(pattern)):
            if not entry.is_file():
                continue
            try:
                time.strptime(entry.path, pattern)
                matches.append(entry)
            except ValueError:
                continue
        matches.sort(key=lambda e: e.stat().st_mtime, reverse=True)
        return iter(matches)

    def doRollover(self):
        """Do a roll-over. This basically needs to open a new generated filename.
        """
        if self.stream:
            self.stream.close()
            self.stream = None

        if self.backupCount > 0:
            delete = islice(self._matching_files(), self.backupCount, None)
            for entry in delete:
                os.remove(entry.path)

        now = int(time.time())
        rollover = self.computeRollover(now)
        while rollover <= now:
            rollover += self.interval
        if not self.utc:
            # If DST changes and midnight or weekly rollover, adjust for this.
            if self.when == "MIDNIGHT" or self.when.startswith("W"):
                dst = time.localtime(now)[-1]
                if dst != time.localtime(rollover)[-1]:
                    rollover += 3600 if dst else -3600
        self.rolloverAt = rollover

        if not self.delay:
            self.stream = self._open()

    # borrow *some* TimedRotatingFileHandler methods
    computeRollover = TimedRotatingFileHandler.computeRollover
    shouldRollover = TimedRotatingFileHandler.shouldRollover

Use this with time.strftime() placeholders in the log filename, and those will be filled in for you:

handler = TimedPatternFileHandler("logs/%H-%M.log", when="M", backupCount=2)

Note that this cleans up older files when you create the instance.




回答2:


Like you can see in TimedRotatingFileHandler documentation, your log filename should be the same to get properly the rotating system.

In your case, because YOU are yourself appending dateTime information, the log filename is different each time, and so you observe the result.

So, in your source code, you just need to adapt the log filename:

handler = logging.handlers.TimedRotatingFileHandler(
    "logs/MyLog", 
    when="M", 
    backupCount=2)

If you want to challenge it, you can change the when to "S" (seconds), and check the rotation is OK.

For instance, it will produce such files automagically:

> MyLog
> MyLog.2019-07-08_11-36-53
> MyLog.2019-07-08_11-36-58

Don't hesitate if you need additional information.




回答3:


As other people already indicated the backupCount will work only if you always log to the file with the same filename and then rotate every now and then. Then you will have log files like @Bsquare indicated. However in my case, I needed to rotate every day and have my log files have the following names: 2019-07-06.log, 2019-07-07.log, 2019-07-07.log, ... I found out that it is not possible to do using current implementation of TimedRotatingFileHandler

So I ended up creating my own deleting functionality that suits my needs on top of FileHandler This is a simple example of a logger class that uses FileHandler and will make sure old logs files are deleted every time you create an instance of this class:

import os
import datetime
import logging
import re
import pathlib


class Logger:

    # Maximum number of logs to store
    LOGS_COUNT = 3

    # Directory to log to
    LOGS_DIRECTORY = "logs"

    def __init__(self):
        # Make sure logs directory is created
        self.__create_directory(Logger.LOGS_DIRECTORY)
        # Clean old logs every time you create a logger
        self.__clean_old_logs()
        self.logger = logging.getLogger("Logger")
        # If condition will make sure logger handlers will be initialize only once when this object is created
        if not self.logger.handlers:
            self.logger.setLevel(logging.INFO)
            formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
            file_handler = logging.FileHandler("logs/{:%Y-%m-%d}.log".format(datetime.datetime.now()))
            file_handler.setFormatter(formatter)
            self.logger.addHandler(file_handler)

    def log_info(self, message):
        self.logger.info(message)

    def log_error(self, message):
        self.logger.error(message)

    def __clean_old_logs(self):
        for name in self.__get_old_logs():
            path = os.path.join(Logger.LOGS_DIRECTORY, name)
            self.__delete_file(path)

    def __get_old_logs(self):
        logs = [name for name in self.__get_file_names(Logger.LOGS_DIRECTORY)
            if re.match("([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))\.log", name)]
        logs.sort(reverse=True)
        return logs[Logger.LOGS_COUNT:]

    def __get_file_names(self, path):
        return [item.name for item in pathlib.Path(path).glob("*") if item.is_file()]

    def __delete_file(self, path):
        os.remove(path)

    def __create_directory(self, directory):
        if not os.path.exists(directory):
            os.makedirs(directory)

And then you would use it like so:

logger = Logger()
logger.log_info("This is a log message")


来源:https://stackoverflow.com/questions/55170348/why-timedrotatingfilehandler-does-not-delete-old-files

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!