iOS Concurrency - Not reaching anywhere near theoretical maximum

臣服心动 2020-12-22 03:31

I'm new to Grand Central Dispatch and have been running some tests with it, doing some processing on an image. Basically I'm running a grayscale algorithm both sequentially

2 Answers
  •  一整个雨季
    2020-12-22 04:13

    My most likely guess: in the single-threaded case you are CPU bound, and in the multi-threaded case you are memory bound. In other words, the two cores together are reading data from DRAM at the maximum bus bandwidth, so they end up idling while they wait for more data to process.

    You can test my theory by doing a true luminance calculation:

    int value = floor( 0.299 * red + 0.587 * green + 0.114 * blue );
    

    That calculation yields grayscale values in the range 0 to 255, given 8-bit RGB values. It also gives the processors more work to do per pixel. If you change that line of code, the time for the single-threaded case should increase somewhat. And, if I'm correct, the multi-threaded case should then show a larger improvement relative to the single-threaded time.
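
    To make the difference between the two formulas concrete, here is a minimal C sketch of both per-pixel computations. The function names `gray_average` and `gray_luminance` are hypothetical, chosen only for illustration; the pixel loop from the question is not shown in this answer.

```c
#include <stdint.h>

/* Hypothetical helper: plain average of the three channels, (r+g+b)/3.
 * Integer arithmetic only. */
static int gray_average(uint8_t r, uint8_t g, uint8_t b)
{
    return (r + g + b) / 3;
}

/* Hypothetical helper: weighted luminance, 0.299 r + 0.587 g + 0.114 b.
 * The cast truncates toward zero, which matches floor() here because
 * the weighted sum is never negative. */
static int gray_luminance(uint8_t r, uint8_t g, uint8_t b)
{
    return (int)(0.299 * r + 0.587 * g + 0.114 * b);
}
```

    For a pure red pixel (255, 0, 0), the average gives 85 while the weighted form gives 76. The weighted form needs three floating-point multiplies and a float-to-int conversion per pixel, which is the extra work the test relies on.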


    I decided to run some benchmarks of my own, both on the simulator and on an iPad2. The structure of my code was as follows.

    Single Threaded

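    // TimeStamp() is assumed to return the current time in seconds as a double.
    double start = TimeStamp();
    
    // Process every pixel of the 1536 x 2048 image on the calling thread.
    for ( int y = 0; y < 2048; y++ )
        for ( int x = 0; x < 1536; x++ )
            computePixel();
    
    double end = TimeStamp();
    NSLog( @"single      = %8.3lf msec", (end - start) * 1e3 );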
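
    `TimeStamp()` is the author's own helper and its definition is not shown. Judging from the `(end - start) * 1e3` conversion to milliseconds, it returns seconds as a floating-point value. One possible stand-in, assuming a POSIX system that provides `clock_gettime`, is sketched below in C; on iOS, a call such as `CACurrentMediaTime()` could play the same role.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime in strict C modes */
#endif

#include <assert.h>
#include <time.h>

/* Assumed stand-in for the answer's TimeStamp(): seconds on a
 * monotonic clock, returned as a double. */
static double TimeStamp(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);  /* the monotonic clock is expected to be available */
    (void)rc;         /* keeps the build quiet when NDEBUG disables assert */
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

    A monotonic clock is used here because wall-clock time can jump while a measurement is in progress, which would distort short intervals like the ones in this benchmark.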
    

    Two Threads using GCD

    dispatch_group_t tasks = dispatch_group_create();
    dispatch_queue_t queue = dispatch_get_global_queue( DISPATCH_QUEUE_PRIORITY_HIGH, 0 );
    
    // __block lets the blocks below write their timestamps back to these variables.
    __block double topStart, topEnd, bottomStart, bottomEnd;
    
    double start = TimeStamp();
    
    // Top half of the image (rows 0 to 1023). The loop counters are local to
    // the block so the two blocks do not share them.
    dispatch_group_async( tasks, queue, 
    ^{
        topStart = TimeStamp();
    
        for ( int y = 0; y < 1024; y++ )
            for ( int x = 0; x < 1536; x++ )
                computePixel();
    
        topEnd = TimeStamp();
    });
    
    // Bottom half of the image (rows 1024 to 2047).
    dispatch_group_async( tasks, queue, 
    ^{
        bottomStart = TimeStamp();
    
        for ( int y = 1024; y < 2048; y++ )
            for ( int x = 0; x < 1536; x++ )
                computePixel();
    
        bottomEnd = TimeStamp();
    });
    
    // Both blocks have been submitted; wait here until both have finished.
    double wait = TimeStamp();
    dispatch_group_wait( tasks, DISPATCH_TIME_FOREVER );
    double end = TimeStamp();
    
    NSLog( @"wait        = %8.3lf msec", (wait - start) * 1e3 );
    NSLog( @"topStart    = %8.3lf msec", (topStart - start) * 1e3 );
    NSLog( @"bottomStart = %8.3lf msec", (bottomStart - start) * 1e3 );
    NSLog( @" " );
    NSLog( @"topTime     = %8.3lf msec", (topEnd - topStart) * 1e3 );
    NSLog( @"bottomeTime = %8.3lf msec", (bottomEnd - bottomStart) * 1e3 );
    NSLog( @"overallTime = %8.3lf msec", (end - start) * 1e3 );
    

    Here are my results.

    Running (r+g+b)/3 on the simulator

    2014-04-03 23:16:22.239 GcdTest[1406:c07] single      =   21.546 msec
    2014-04-03 23:16:22.239 GcdTest[1406:c07]  
    2014-04-03 23:16:25.388 GcdTest[1406:c07] wait        =    0.009 msec
    2014-04-03 23:16:25.388 GcdTest[1406:c07] topStart    =    0.031 msec
    2014-04-03 23:16:25.388 GcdTest[1406:c07] bottomStart =    0.057 msec
    2014-04-03 23:16:25.389 GcdTest[1406:c07]  
    2014-04-03 23:16:25.389 GcdTest[1406:c07] topTime     =   10.865 msec
    2014-04-03 23:16:25.389 GcdTest[1406:c07] bottomeTime =   10.879 msec
    2014-04-03 23:16:25.390 GcdTest[1406:c07] overallTime =   10.961 msec
    

    Running (.299r + .587g + .114b) on the simulator

    2014-04-03 23:17:27.984 GcdTest[1422:c07] single      =   55.738 msec
    2014-04-03 23:17:27.985 GcdTest[1422:c07]  
    2014-04-03 23:17:29.306 GcdTest[1422:c07] wait        =    0.008 msec
    2014-04-03 23:17:29.307 GcdTest[1422:c07] topStart    =    0.054 msec
    2014-04-03 23:17:29.307 GcdTest[1422:c07] bottomStart =    0.060 msec
    2014-04-03 23:17:29.307 GcdTest[1422:c07]  
    2014-04-03 23:17:29.308 GcdTest[1422:c07] topTime     =   28.881 msec
    2014-04-03 23:17:29.308 GcdTest[1422:c07] bottomeTime =   29.330 msec
    2014-04-03 23:17:29.308 GcdTest[1422:c07] overallTime =   29.446 msec
    

    Running (r+g+b)/3 on the iPad2

    2014-04-03 23:27:19.601 GcdTest[13032:907] single      =  298.799 msec
    2014-04-03 23:27:19.602 GcdTest[13032:907]  
    2014-04-03 23:27:20.536 GcdTest[13032:907] wait        =    0.060 msec
    2014-04-03 23:27:20.537 GcdTest[13032:907] topStart    =    0.246 msec
    2014-04-03 23:27:20.539 GcdTest[13032:907] bottomStart =    2.906 msec
    2014-04-03 23:27:20.541 GcdTest[13032:907]  
    2014-04-03 23:27:20.542 GcdTest[13032:907] topTime     =  149.596 msec
    2014-04-03 23:27:20.544 GcdTest[13032:907] bottomeTime =  149.209 msec
    2014-04-03 23:27:20.545 GcdTest[13032:907] overallTime =  152.164 msec
    

    Running (.299r + .587g + .114b) on the iPad2

    2014-04-03 23:30:29.618 GcdTest[13045:907] single      =  282.767 msec
    2014-04-03 23:30:29.620 GcdTest[13045:907]  
    2014-04-03 23:30:34.008 GcdTest[13045:907] wait        =    0.046 msec
    2014-04-03 23:30:34.010 GcdTest[13045:907] topStart    =    0.270 msec
    2014-04-03 23:30:34.011 GcdTest[13045:907] bottomStart =    3.043 msec
    2014-04-03 23:30:34.013 GcdTest[13045:907]  
    2014-04-03 23:30:34.014 GcdTest[13045:907] topTime     =  143.078 msec
    2014-04-03 23:30:34.015 GcdTest[13045:907] bottomeTime =  143.249 msec
    2014-04-03 23:30:34.017 GcdTest[13045:907] overallTime =  146.350 msec
    

    Running ((.299r + .587g + .114b) ^ 2.2) on the iPad2

    2014-04-03 23:41:28.959 GcdTest[13078:907] single      = 1258.818 msec
    2014-04-03 23:41:28.961 GcdTest[13078:907]  
    2014-04-03 23:41:30.768 GcdTest[13078:907] wait        =    0.048 msec
    2014-04-03 23:41:30.769 GcdTest[13078:907] topStart    =    0.264 msec
    2014-04-03 23:41:30.771 GcdTest[13078:907] bottomStart =    3.037 msec
    2014-04-03 23:41:30.772 GcdTest[13078:907]  
    2014-04-03 23:41:30.773 GcdTest[13078:907] topTime     =  635.952 msec
    2014-04-03 23:41:30.775 GcdTest[13078:907] bottomeTime =  634.749 msec
    2014-04-03 23:41:30.776 GcdTest[13078:907] overallTime =  637.829 msec
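
    The speedup in each run is the single-threaded time divided by the two-thread `overallTime`. The following C sketch does that arithmetic on the timings logged above; the `speedup` function, the `runs` table, and `print_speedups` are illustrative names, not part of the original test program.

```c
#include <assert.h>
#include <stdio.h>

/* Speedup = single-threaded time / parallel wall-clock time. */
static double speedup(double single_ms, double parallel_ms)
{
    assert(parallel_ms > 0.0);
    return single_ms / parallel_ms;
}

struct run {
    const char *label;
    double single_ms;   /* "single" line of the log */
    double overall_ms;  /* "overallTime" line of the log */
};

/* Timings copied from the five logs above. */
static const struct run runs[] = {
    { "(r+g+b)/3, simulator",                  21.546,  10.961 },
    { "(.299r+.587g+.114b), simulator",        55.738,  29.446 },
    { "(r+g+b)/3, iPad2",                     298.799, 152.164 },
    { "(.299r+.587g+.114b), iPad2",           282.767, 146.350 },
    { "(.299r+.587g+.114b)^2.2, iPad2",      1258.818, 637.829 },
};

/* Print one line per run with its speedup factor. */
static void print_speedups(void)
{
    for (size_t i = 0; i < sizeof runs / sizeof runs[0]; i++)
        printf("%-34s %.2fx\n", runs[i].label,
               speedup(runs[i].single_ms, runs[i].overall_ms));
}
```

    With these numbers every run lands between roughly 1.89x and 1.97x, where 2x would be the ideal for two threads on two cores.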
    
