Synchronizing read/write access to an instance variable for high performance in iOS?

Submitted by 半世苍凉 on 2019-12-28 18:44:40

Question


What's the best way / least wait-causing way to synchronize read/write access to an instance variable in objective-c for iOS?

The variable is read and written very often (say, 1000 reads and 1000 writes per second). It is not important that changes take effect immediately. It is not even important that reads are consistent with one another, but writes must sooner or later be reflected in the data acquired by reads. Is there some data structure which allows this?

I thought of this:

  • Create two variables instead of one variable; let's call them v[0] and v[1].
  • For each v[i], create a concurrent dispatch queue to build a reader-writer locking mechanism around it. Let's call them q[0] and q[1].
  • Now for a writing operation, only v[0] gets written to, adhering to the locking mechanism with q[0].
  • On a read operation, v[1] is read first; only with a certain probability, e.g. 1%, does the read operation also look into v[0] and update v[1] if necessary.

The following pseudo-code illustrates this:

typedef int VType; // the type of the variable

VType v[2];             // v[0]: write target, v[1]: read cache
dispatch_queue_t q[2];  // concurrent queues (DISPATCH_QUEUE_CONCURRENT) guarding v[0] and v[1]

- (void) setV:(VType)newV {
    [self setV:newV at:0];
}

- (void) setV:(VType)newV at:(int)i {
    dispatch_barrier_async(q[i], ^{
        v[i] = newV;
    });
}

- (VType) getV:(int)i {
    __block VType result;

    dispatch_sync(q[i], ^{
        result = v[i];
    });

    return result;
}

- (VType) getV {
    VType result = [self getV:1];

    if ([self random] < 0.01) {
        VType v0_result = [self getV:0];

        if (v0_result != result) {
            [self setV:v0_result at:1];
            result = v0_result;
        }
    }

    return result;
}

- (float) random {
    // some random number generator - fast, but not necessarily good
    return (float)arc4random() / (float)UINT32_MAX;
}

This has the following benefits:

  • v[0] is usually not occupied by a read operation. Therefore, a write operation usually does not block.

  • At most times, v[1] does not get written to, thus read operations on this one usually don't block.

  • Still, if many read operations occur, eventually the written values are propagated from v[0] into v[1]. Some values might be missed, but that doesn't matter for my application.

What do you guys think, does this work? Are there better solutions?

UPDATE:

Some performance benchmarks (for each benchmark in turn, reads and writes are performed concurrently, as quickly as possible, for 1 second, with one reading queue and one writing queue):

On iPhone 4S with iOS 7:

runMissingSyncBenchmark: 484759 w/s
runMissingSyncBenchmark: 489558 r/s
runConcurrentQueueRWSyncBenchmark: 2303 w/s
runConcurrentQueueRWSyncBenchmark: 2303 r/s
runAtomicPropertyBenchmark: 460479 w/s
runAtomicPropertyBenchmark: 462145 r/s

In Simulator with iOS 7:

runMissingSyncBenchmark: 16303208 w/s
runMissingSyncBenchmark: 12239070 r/s
runConcurrentQueueRWSyncBenchmark: 2616 w/s
runConcurrentQueueRWSyncBenchmark: 2615 r/s
runAtomicPropertyBenchmark: 4212703 w/s
runAtomicPropertyBenchmark: 4300656 r/s

So far, atomic property wins. Tremendously. This was tested with an SInt64.

I expected the concurrent-queue approach to perform similarly to the atomic property, since it is the standard approach for a reader-writer synchronization mechanism.

Of course, the runMissingSyncBenchmark sometimes produces torn reads, where a write of the SInt64 is only halfway done.


Answer 1:


Perhaps a spin lock would be optimal (see man 3 spinlock).

Since a spin lock can be tested to see whether it is currently held (a fast operation), the reader task can simply return the previous value when the lock is held by the writer task.

That is, the reader task uses OSSpinLockTry() and retrieves the actual value only if the lock could be obtained. Otherwise, the reader falls back to the previous value.

The writer task will use OSSpinLockLock() and OSSpinLockUnlock() respectively in order to atomically update the value.

From the man page:

NAME

 OSSpinLockTry, OSSpinLockLock, OSSpinLockUnlock -- atomic spin lock synchronization primitives

SYNOPSIS

 #include <libkern/OSAtomic.h>

 bool
 OSSpinLockTry(OSSpinLock *lock);

 void
 OSSpinLockLock(OSSpinLock *lock);

 void
 OSSpinLockUnlock(OSSpinLock *lock);

DESCRIPTION

Spin locks are a simple, fast, thread-safe synchronization primitive that is suitable in situations where contention is expected to be low. The spinlock operations use memory barriers to synchronize access to shared memory protected by the lock. Preemption is possible while the lock is held.

OSSpinLock is an integer type. The convention is that unlocked is zero, and locked is nonzero. Locks must be naturally aligned and cannot be in cache-inhibited memory.

OSSpinLockLock() will spin if the lock is already held, but employs various strategies to back off, making it immune to most priority-inversion livelocks. But because it can spin, it may be inefficient in some situations.

OSSpinLockTry() immediately returns false if the lock was held, true if it took the lock. It does not spin.

OSSpinLockUnlock() unconditionally unlocks the lock by zeroing it.

RETURN VALUES

OSSpinLockTry() returns true if it took the lock, false if the lock was already held.




Answer 2:


I think CouchDeveloper's suggestion of using try-checks in the synchronization locks is an intriguing possibility. In my particular experiments, it had negligible impact with spin locks, a modest gain with the pthread read-write lock, and the most significant impact with a simple mutex lock. I'd wager that different configurations would see some gain with spin locks, too, but I must have failed to generate enough contention on the spin locks for the impact of using try to be observable.

If you're working with immutable or fundamental data types, you can also use the atomic property as described in the Synchronization Tools section in the Threading Programming Guide:

Atomic operations are a simple form of synchronization that work on simple data types. The advantage of atomic operations is that they do not block competing threads. For simple operations, such as incrementing a counter variable, this can lead to much better performance than taking a lock.

Unaware that you had done your own benchmarking, I benchmarked a couple of the techniques discussed in that document (mutex lock and pthread read-write lock, each with and without the "try" algorithm), as well as the GCD reader-writer pattern. In my test, I performed five million reads while doing 500,000 writes of random values. This yielded the following benchmarks (measured in seconds; smaller is better).

| Technique                 | Simulator | Device   | 
+---------------------------+-----------+----------+
| Atomic                    |       1.9 |      7.2 |
| Spinlock w/o try          |       2.8 |      8.0 |
| Pthread RW lock w/ try    |       2.9 |      9.1 |
| Mutex lock w/ try         |       2.9 |      9.4 |
| GCD reader-writer pattern |       3.2 |      9.1 |
| Pthread RW lock w/o try   |       7.2 |     22.2 |
| NSLock                    |      23.1 |     89.7 |
| Mutex lock w/o try        |      24.2 |     80.2 |
| @synchronized             |      25.2 |     92.0 |

Bottom line, in this particular test, atomic properties performed the best. Obviously, atomic properties have significant limitations, but in your scenario, it sounds like this is acceptable. These results are obviously going to be subject to the specifics of your scenario, and it sounds like your testing has confirmed that atomic properties yielded the best performance for you.



Source: https://stackoverflow.com/questions/20851332/synchronizing-read-write-access-to-an-instance-variable-for-high-performance-in
