Allocating a Thread's Stack on a specific NUMA memory

Submitted by 狂风中的少年 on 2020-01-02 21:51:53

Question


I would like to know if there is a way to create the stack of a thread on a specific NUMA node. I have written this code, but I'm not sure whether it does the trick.

#include <numa.h>
#include <pthread.h>

pthread_t thread1;

int main(int argc, char** argv) {
  pthread_attr_t attr;
  pthread_attr_init(&attr);

  const size_t stacksize = 1000000;
  int numanode = 1;

  // considering that the newly
  // created thread will be running on a core on node1
  // The buffer must cover the whole stack, not just one pointer.
  void* stack = numa_alloc_onnode(stacksize, numanode);

  // pthread_attr_setstack() expects the lowest address of the buffer and its size.
  pthread_attr_setstack(&attr, stack, stacksize);
  pthread_create(&thread1, &attr, function, (void*)0);

  ...
  ...
}

Thank you for your help


Answer 1:


Here's the code I use for this (slightly adapted to remove some constants defined elsewhere). Note that I first create the thread normally, and then call the SetAffinityAndRelocateStack() below from within the thread. I think this is much better than trying to create your own stack, since stacks have special support for growing in case the bottom is reached.

The code can also be adapted to operate on the newly created thread from outside, but this could give rise to race conditions (e.g. if the thread performs I/O into its stack), so I wouldn't recommend it.

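// Requires Linux with libnuma; link with -lnuma -lpthread.
#include <alloca.h>
#include <cassert>
#include <cstring>
#include <numa.h>
#include <numaif.h>
#include <pthread.h>
#include <sched.h>

// Touches the next 50 pages of the calling thread's stack so that they are
// faulted in right away. The alloca() memory is released on return, so the
// returned pointer must not be dereferenced by the caller.
void* PreFaultStack()
{
    const size_t NUM_PAGES_TO_PRE_FAULT = 50;
    const size_t size = NUM_PAGES_TO_PRE_FAULT * numa_pagesize();
    void *allocaBase = alloca(size);
    memset(allocaBase, 0, size);
    return allocaBase;
}

// Call this from within the thread whose stack should be relocated.
void SetAffinityAndRelocateStack(int cpuNum)
{
    // Pin the calling thread to the requested CPU.
    assert(-1 != cpuNum);
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(cpuNum, &cpuset);
    const int rc = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
    assert(0 == rc);

    // Look up the address range of the calling thread's stack.
    pthread_attr_t attr;
    void *stackAddr = nullptr;
    size_t stackSize = 0;
    if ((0 != pthread_getattr_np(pthread_self(), &attr)) || (0 != pthread_attr_getstack(&attr, &stackAddr, &stackSize))) {
        assert(false);
    }
    pthread_attr_destroy(&attr);

    // Bind the stack to the CPU's node and migrate pages that already exist.
    const int node = numa_node_of_cpu(cpuNum);
    assert(node >= 0);
    const unsigned long nodeMask = 1UL << node;
    // The maxnode argument of mbind() counts bits in the mask, not bytes.
    const auto bindRc = mbind(stackAddr, stackSize, MPOL_BIND, &nodeMask, 8 * sizeof(nodeMask), MPOL_MF_MOVE | MPOL_MF_STRICT);
    assert(0 == bindRc);

    PreFaultStack();
    // TODO: Also lock the stack with mlock() to guarantee it stays resident in RAM
    return;
}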
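
The TODO above, and the question of whether the relocation actually "does the trick", could be approached along the following lines. This is only a rough sketch assuming Linux with glibc. LockCurrentThreadStack() and NodeOfAddress() are hypothetical helper names that are not part of the answer's code. The second helper calls move_pages(2) through syscall() in query mode, so it does not need libnuma and does not move any pages.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for pthread_getattr_np()
#endif
#include <pthread.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: pins the calling thread's whole stack in RAM.
// Meant for threads started with pthread_create(), whose stack is mapped
// up front. Returns 0 on success, otherwise an errno value (typically
// ENOMEM or EPERM when RLIMIT_MEMLOCK is smaller than the stack).
int LockCurrentThreadStack()
{
    pthread_attr_t attr;
    int rc = pthread_getattr_np(pthread_self(), &attr);
    if (rc != 0) {
        return rc;
    }
    void *stackAddr = nullptr;
    size_t stackSize = 0;
    rc = pthread_attr_getstack(&attr, &stackAddr, &stackSize);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        return rc;
    }
    if (mlock(stackAddr, stackSize) != 0) {
        return errno;
    }
    return 0;
}

// Hypothetical helper: returns the NUMA node backing the page that contains
// addr, or a negative errno value (for example -ENOENT if the page has not
// been touched yet, or -ENOSYS/-EPERM if the call is unavailable).
int NodeOfAddress(void *addr)
{
    // Round the address down to the start of its page.
    const uintptr_t pageSize = static_cast<uintptr_t>(sysconf(_SC_PAGESIZE));
    void *pages[1] = {
        reinterpret_cast<void*>(reinterpret_cast<uintptr_t>(addr) & ~(pageSize - 1))
    };
    int status[1] = { -1 };
    // A null "nodes" array turns move_pages() into a pure query.
    const long rc = syscall(SYS_move_pages, 0L, 1UL, pages,
                            static_cast<const int*>(nullptr), status, 0L);
    if (rc != 0) {
        return -errno;
    }
    return status[0];
}
```

One possible use: call LockCurrentThreadStack() where the TODO sits, then take the address of a local variable in the thread and compare NodeOfAddress() for it with numa_node_of_cpu(cpuNum). A mismatch or a negative result would suggest that the stack pages did not end up where intended, or that the query is not permitted in the current environment.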


Source: https://stackoverflow.com/questions/10605766/allocating-a-threads-stack-on-a-specific-numa-memory