launchd: sleep in GCD managed signal handler

那年仲夏 submitted on 2019-12-10 11:39:34

Question


I have run into a strange situation in a launchd-managed daemon when I try to sleep in the SIGTERM handler that is managed with Grand Central Dispatch as described here.
Everything works fine when I do not sleep in the SIGTERM handler: the handler runs before the daemon receives a SIGKILL. But as soon as I do sleep, even for extremely short amounts of time like a usleep(1);, the handler does not run at all and my daemon is instead SIGKILLed instantly by launchd.

Btw I am using EnableTransactions in my plist file and the proper vproc_transaction_begin(3)/vproc_transaction_end(3) code as described here.

Not sleeping in the SIGTERM handler is not an option for me because I need to poll information about my "client processes" to know whether it is safe to end the daemon or not.

It seems to me as if some compiler flag is responsible for the daemon receiving the SIGKILL directly (and not the expected SIGTERM) as soon as the signal handler sleeps, because when I do sleep I do not see any output from my SIGTERM handler at all. I would expect to see the debug prints up to the sleep call, but this is not the case.

Here is my plist file:

  <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
            <key>Label</key>
            <string>org.example.myTestD</string>
            <key>ProgramArguments</key>
            <array>
                    <string>/usr/sbin/myTestD</string>
            </array>

            <key>RunAtLoad</key>
            <true/>

            <key>KeepAlive</key>
            <true/>

            <key>ExitTimeOut</key>
            <integer>0</integer>

            <key>EnableTransactions</key>
            <true/>
    </dict>
    </plist>

And here is my SIGTERM handler. Please note that I do not see any output at all as soon as I add the usleep(1); line.

static void mySignalHandler(int sigraised)
{
    int fd = open("/tmp/myTestD.log", O_WRONLY | O_CREAT | O_APPEND, 0777);
    if (fd < 0) return; // open() returns -1 on failure; 0 is a valid descriptor

    dprintf(fd, "%s(): signal received = %d, sleeping some time ...\n", __func__, sigraised);
    usleep(1);
    dprintf(fd, "%s(): ... done\n", __func__);

    dprintf(fd, "%s(): EXITING\n", __func__);
    close(fd);

    // transactionHandle is global variable assigned in daemon main
    if (transactionHandle) vproc_transaction_end(NULL, transactionHandle);

    exit(0);
}

Thank you very much for any hints/answers!

Chris


Answer 1:


I think the crux of your issue is that you have this in the plist:

        <key>ExitTimeOut</key>
        <integer>0</integer>

The man page for launchd.plist says:

ExitTimeOut <integer>

The amount of time launchd waits between sending the SIGTERM signal and before sending a SIGKILL signal when the job is to be stopped. The default value is system-defined. The value zero is interpreted as infinity and should not be used, as it can stall system shutdown forever.

Experimenting a bit, it appears that this text is not accurate. Empirically, I observe that if that value is set to 0, I get the behavior you're describing (where the process is KILLed immediately after receiving TERM, regardless of any outstanding declared transactions.) If I change this value to some arbitrary larger number like, say, 60, I observe my TERM handler being called and having a chance to do cleanup before exiting.
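With a finite ExitTimeOut, the handler has a bounded window before the SIGKILL arrives, so any polling it does should probably stop on its own before that window closes. Here is a minimal sketch of such a bounded poll in plain C. The names wait_for_clients, clients_done_fn and demo_clients_done are hypothetical and not part of launchd or of the code in this thread; the budget would be chosen somewhat below whatever ExitTimeOut is set to.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback: returns true once every client has finished. */
typedef bool (*clients_done_fn)(void);

/* Poll clients_done() every poll_ms milliseconds and give up after budget_ms.
 * Returns true if the clients finished in time, false if the budget ran out. */
static bool wait_for_clients(clients_done_fn clients_done, long budget_ms, long poll_ms)
{
    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        if (clients_done()) return true;

        clock_gettime(CLOCK_MONOTONIC, &now);
        long elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L
                        + (now.tv_nsec - start.tv_nsec) / 1000000L;
        if (elapsed_ms >= budget_ms) return false;

        struct timespec pause = { poll_ms / 1000, (poll_ms % 1000) * 1000000L };
        nanosleep(&pause, NULL);
    }
}

/* Stand-in for a real client check (assumption): reports "done" after a few polls. */
static int demo_polls_left = 3;
static bool demo_clients_done(void) { return demo_polls_left-- <= 0; }
```

In a daemon, something like wait_for_clients(myClientCheck, 50 * 1000, 250) with ExitTimeOut set to 60 would leave about ten seconds of slack for the final cleanup. Those numbers are only an illustration.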

It's not entirely clear whether you're using the classic signal handling or GCD since the link you posted describes both, but if you're using classic UNIX signal handling, then I should also mention that you've called functions in your signal handler that aren't on the list of functions that are OK to call in signal handlers (dprintf and usleep aren't on the list.) But it seems more likely that you're using GCD.
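For the classic UNIX case, the usual way to stay within that list is to do almost nothing in the handler itself: set a flag, optionally write(2) a fixed message, and leave the logging, sleeping and exit to normal code. A minimal sketch, assuming a plain sigaction-based setup; the names term_requested, on_sigterm and install_sigterm_handler are made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, read by the main loop. */
static volatile sig_atomic_t term_requested = 0;

/* Only async-signal-safe work here: store a flag and write(2) a fixed message. */
static void on_sigterm(int signo)
{
    static const char msg[] = "SIGTERM received\n";
    (void)signo;
    term_requested = 1;
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
}

/* Install the handler; returns 0 on success and -1 on failure, as sigaction does. */
static int install_sigterm_handler(void)
{
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = on_sigterm;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGTERM, &action, NULL);
}
```

The main loop would then check term_requested periodically and, once it is set, do the dprintf/usleep/exit sequence outside of signal context. With a GCD signal source none of this is needed, because the event handler block runs on a dispatch queue rather than in signal context.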

Another thing that occurs to me is that if you were using vproc_transaction_begin/end to bracket whatever work items you're waiting for in your handler, then you would get this behavior "for free" without needing the signal handler at all. It's completely conceivable that there is some centralized cleanup work that you need to do irrespective of normal work items, but if this is just about waiting for other asynchronous tasks to finish, it could be even simpler.

Anyway, in case it helps, here's the code I used to test this scenario:

#import <Foundation/Foundation.h>
#import <libkern/OSAtomic.h> // OSAtomicIncrement64Barrier / OSAtomicDecrement64Barrier
#import <vproc.h>

static void SignalHandler(int sigraised);
static void FakeWork();
static void Log(NSString* str);

int64_t outstandingTransactions;
dispatch_source_t fakeWorkGeneratorTimer;

int main(int argc, const char * argv[])
{
    @autoreleasepool
    {
        // Set up GCD handler for SIGTERM
        dispatch_source_t source = dispatch_source_create(DISPATCH_SOURCE_TYPE_SIGNAL, SIGTERM, 0, dispatch_get_global_queue(0, 0));
        dispatch_source_set_event_handler(source, ^{
            SignalHandler(SIGTERM);
        });
        dispatch_resume(source);

        // Tell the standard signal handling mechanism to ignore SIGTERM
        struct sigaction action = { 0 };
        action.sa_handler = SIG_IGN;
        sigaction(SIGTERM, &action, NULL);

        // Set up a 10Hz timer to generate "fake work" events
        fakeWorkGeneratorTimer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, dispatch_get_global_queue(0, 0));
        dispatch_source_set_timer(fakeWorkGeneratorTimer, DISPATCH_TIME_NOW, 0.1 * NSEC_PER_SEC, 0.05 * NSEC_PER_SEC);
        dispatch_source_set_event_handler(fakeWorkGeneratorTimer, ^{
            // Don't add an event *every* time...
            if (arc4random_uniform(10) >= 5) dispatch_async(dispatch_get_global_queue(0, 0), ^{ FakeWork(); });
        });
        dispatch_resume(fakeWorkGeneratorTimer);

        // Start the run loop
        while (1)
        {
            // The runloop also listens for SIGTERM and will return from here, so I'm just sending it right back in.
            [[NSRunLoop currentRunLoop] run];
        }
    }

    return 0;
}

static void SignalHandler(int sigraised)
{
    // Open a transaction so that we don't get killed before getting to the end of this handler
    vproc_transaction_t transaction = vproc_transaction_begin(NULL);

    // Turn off the fake work generator
    dispatch_suspend(fakeWorkGeneratorTimer);

    Log([NSString stringWithFormat: @"%s(): signal received = %d\n", __func__, sigraised]);

    int64_t transCount = outstandingTransactions;
    while (transCount > 0)
    {
        Log([NSString stringWithFormat: @"%s(): %lld transactions outstanding. Waiting...\n", __func__, transCount]);
        usleep(USEC_PER_SEC / 4);
        transCount = outstandingTransactions;
    }

    Log([NSString stringWithFormat: @"%s(): EXITING\n", __func__]);

    // Close the transaction
    vproc_transaction_end(NULL, transaction);

    exit(0);
}

static void FakeWork()
{
    static int64_t workUnitNumber;

    const NSTimeInterval minWorkDuration = 1.0 / 100.0; // 10ms
    const NSTimeInterval maxWorkDuration = 4.0; // 4s

    OSAtomicIncrement64Barrier(&outstandingTransactions);
    int64_t serialNum = OSAtomicIncrement64Barrier(&workUnitNumber);
    vproc_transaction_t transaction = vproc_transaction_begin(NULL);

    Log([NSString stringWithFormat: @"Starting work unit: %@", @(serialNum)]);

    // Set up a callback some random time later.
    int64_t taskDuration = arc4random_uniform(NSEC_PER_SEC * (maxWorkDuration - minWorkDuration)) + (minWorkDuration * NSEC_PER_SEC);
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, taskDuration), dispatch_get_global_queue(0, 0), ^{
        Log([NSString stringWithFormat: @"Finishing work unit: %@", @(serialNum)]);
        vproc_transaction_end(NULL, transaction);
        OSAtomicDecrement64Barrier(&outstandingTransactions);
    });
}

static void Log(NSString* str)
{
    static NSObject* lockObj = nil;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        lockObj = [NSObject new];
    });

    @synchronized(lockObj)
    {
        int fd = open("/tmp/myTestD.log", O_WRONLY | O_CREAT | O_APPEND, 0777);
        if (fd < 0) return; // open() returns -1 on failure; 0 is a valid descriptor
        dprintf(fd, "%s\n", str.UTF8String);
        close(fd);
    }
}

And the plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>Label</key>
        <string>DaemonDeathTest</string>
        <key>ProgramArguments</key>
        <array>
            <string>/tmp/bin/DaemonDeathTest</string>
        </array>

        <key>RunAtLoad</key>
        <true/>

        <key>KeepAlive</key>
        <true/>

        <key>ExitTimeOut</key>
        <integer>60</integer>

        <key>EnableTransactions</key>
        <true/>
    </dict>
</plist>


Source: https://stackoverflow.com/questions/25959056/launchd-sleep-in-gcd-managed-signal-handler
