Slow communication using shared memory between user mode and kernel

不打扰是莪最后的温柔 submitted on 2019-12-11 19:35:37

Question


I am running a thread in the Windows kernel that communicates with an application over shared memory. Everything works, except that the communication is slow because of a Sleep loop. I have been investigating spin locks, mutexes and the Interlocked functions, but can't really figure this one out. I have also considered Windows events, but I don't know how they perform. Please advise on a faster solution that keeps the communication over shared memory, possibly using Windows events.

KERNEL CODE

typedef struct _SHARED_MEMORY
{
    BOOLEAN mutex;
    CHAR data[BUFFER_SIZE];
} SHARED_MEMORY, *PSHARED_MEMORY;

ZwCreateSection(...)
ZwMapViewOfSection(...)

while (TRUE) {
    if (((PSHARED_MEMORY)SharedSection)->mutex == TRUE) {
      //... do work...
      ((PSHARED_MEMORY)SharedSection)->mutex = FALSE;
    }
    KeDelayExecutionThread(KernelMode, FALSE, &PollingInterval);
}

APPLICATION CODE

OpenFileMapping(...)
MapViewOfFile(...)

...

RtlCopyMemory(&SM->data, WriteData, Size);
SM->mutex = TRUE;

while (SM->mutex != FALSE) {
    Sleep(1); // Slow and removing it will cause an infinite loop
}

RtlCopyMemory(ReadData, &SM->data, Size);

UPDATE 1: Currently, this is the fastest solution I have come up with:

while(InterlockedCompareExchange(&SM->mutex, FALSE, FALSE));

However, I find it odd that you need to do an exchange and that there is no function that only compares.


Answer 1:


You don't want to spin on InterlockedCompareExchange. Doing so burns the CPU, saturates core resources that might be needed by another thread sharing that physical core, and can saturate inter-core buses.

You do need to do two things:

1) Write an InterlockedGet function and use it.

2) Prevent the loop from burning CPU resources and from taking the mother of all mispredicted branches when it finally gets unblocked.

For 1, this is known to work on all compilers that support InterlockedCompareExchange, at least last time I checked:

__inline static int InterlockedGet(int *val)
{
    return *((volatile int *)val);
}
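As a point of comparison, here is a minimal sketch of the same "read without exchange" idea written with C11 atomics instead of a volatile cast. This is a swapped-in technique, not what the answer above uses, and it assumes a compiler that ships <stdatomic.h>, which may not hold for every Windows driver toolchain. The names demo_shared, demo_post, demo_is_pending, demo_complete and DEMO_BUFFER_SIZE are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUFFER_SIZE 64   /* hypothetical; the question's BUFFER_SIZE is not shown */

/* Hypothetical stand-in for the question's SHARED_MEMORY layout. */
typedef struct {
    atomic_int pending;               /* plays the role of the `mutex` flag */
    char data[DEMO_BUFFER_SIZE];
} demo_shared;

/* Requester: copy the payload first, then publish it with a release store,
   so the payload is visible before the flag reads as set. */
static void demo_post(demo_shared *sm, const char *src, size_t size)
{
    assert(sm != NULL && src != NULL);
    if (size > DEMO_BUFFER_SIZE)
        size = DEMO_BUFFER_SIZE;      /* clamp rather than overflow the buffer */
    memcpy(sm->data, src, size);
    atomic_store_explicit(&sm->pending, 1, memory_order_release);
}

/* The "compare only" read: an atomic load, with no exchange involved. */
static int demo_is_pending(demo_shared *sm)
{
    assert(sm != NULL);
    return atomic_load_explicit(&sm->pending, memory_order_acquire);
}

/* Worker: clear the flag once the work on `data` is finished. */
static void demo_complete(demo_shared *sm)
{
    assert(sm != NULL);
    atomic_store_explicit(&sm->pending, 0, memory_order_release);
}
```

The release store paired with the acquire load is what orders the buffer copy against the flag update; a plain assignment to the flag gives no such guarantee in portable C.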

For 2, put this as the body of the wait loop:

__asm
{
    rep nop
}

On x86 CPUs, rep nop is the encoding of the PAUSE instruction, which is specified to relieve the resource saturation and branch misprediction problems in spin-wait loops.

Putting it together:

while ((*(volatile int *) &SM->mutex) != FALSE) {
    __asm
    {
        rep nop
    }
}

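Change int to match the actual type of the flag. In the question, mutex is declared as a BOOLEAN, which is one byte wide, so reading it through an int pointer would also read the first bytes of data.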
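MSVC accepts __asm blocks only in 32-bit x86 builds, so the loop above does not compile as written for x64. On Windows, the YieldProcessor() macro or the _mm_pause() intrinsic emits the same hint. The following is a rough sketch of a bounded spin-wait along those lines, under the assumption that the flag is an atomic_int rather than the question's BOOLEAN. SPIN_HINT and spin_until_clear are hypothetical names, and the poll limit is an addition that lets the caller fall back to a blocking wait instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Pick a spin-wait hint for the current compiler and target. */
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define SPIN_HINT() _mm_pause()
#elif defined(__i386__) || defined(__x86_64__)
#define SPIN_HINT() __builtin_ia32_pause()
#else
#define SPIN_HINT() ((void)0)   /* no hint available: plain busy loop */
#endif

/* Hypothetical helper: poll *flag up to max_polls times.
   Returns true if the flag was seen as 0 within that budget,
   false if the budget ran out first. */
static bool spin_until_clear(atomic_int *flag, unsigned long max_polls)
{
    assert(flag != NULL);
    for (unsigned long i = 0; i < max_polls; ++i) {
        if (atomic_load_explicit(flag, memory_order_acquire) == 0)
            return true;
        SPIN_HINT();            /* be polite to the sibling hardware thread */
    }
    return false;
}
```

A caller might spin for a few thousand polls and, if the function returns false, fall back to Sleep or an event wait, which keeps the fast path cheap without pinning a core when the other side is slow.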



Source: https://stackoverflow.com/questions/55016251/slow-communication-using-shared-memory-between-user-mode-and-kernel
