Multi-threading concept and lock in C#


Question


I have read about lock, but I still don't understand it at all. My question is: why do we use an otherwise unused object and lock on it, and how does that make anything thread-safe or help with multi-threading? Isn't there another way to write thread-safe code?

public class Test
{
    private object Lock { get; set; }

    public void SomeMethod()
    {
        // ...
        lock (this.Lock)
        {
            // ...
        }
        // ...
    }
}

Sorry if my question sounds stupid, but I don't understand this, even though I've used it many times.


Answer 1:


Accessing a piece of data from one thread while another thread is modifying it is called a "data race condition" (or just a "data race") and can lead to data corruption. (*)

Locks are simply a mechanism for avoiding data races. If two (or more) concurrent threads lock the same lock object, then they are no longer concurrent and can no longer cause data races, for the duration of the lock. Essentially, we are serializing the access to shared data.

The trick is to keep your locks as "wide" as you must to avoid data races, yet as "narrow" as you can to gain performance through concurrent execution. This is a fine balance that can easily go out of whack in either direction, which is why multi-threaded programming is hard.
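
For illustration, here is a minimal sketch (the class and member names are invented for this example) of serializing access to shared data with a dedicated, private lock object, used on both the write path and the read path:

using System;
using System.Threading.Tasks;

public class Counter
{
    // Dedicated, private lock object; never exposed outside this class.
    private readonly object _sync = new object();
    private int _count;

    public void Increment()
    {
        lock (_sync)      // only one thread at a time may enter this block
        {
            _count++;     // the read-modify-write is now serialized
        }
    }

    public int Read()
    {
        lock (_sync)      // readers take the same lock (see the guidelines below)
        {
            return _count;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var counter = new Counter();

        // 10 parallel workers, each incrementing 100000 times.
        // Without the lock the final count would usually come up short.
        Parallel.For(0, 10, _ =>
        {
            for (int i = 0; i < 100000; i++)
                counter.Increment();
        });

        Console.WriteLine(counter.Read()); // 1000000
    }
}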

Some guidelines:

  • As long as all threads are just reading the data and none will ever modify it, no lock is necessary.
  • Conversely, if at least one thread might at some point modify the data, then all concurrent code paths accessing that same data must be properly serialized through locks, even those that only read the data.
    • Using a lock in one code path but not the other will leave the data wide open to race conditions.
    • Also, using one lock object in one code path, but a different lock object in another (concurrent) code path does not serialize these code paths and leaves you wide open to data races.
    • On the other hand, if two concurrent code paths access different data, they can use different lock objects. But, whenever there is more than one lock object, watch out for deadlocks. A deadlock is often also a "code race condition" (and a heisenbug, see below).
  • The lock object does not need to be (and usually isn't) the same thing as the data you are trying to protect. Unfortunately, there is no language facility that lets you "declare" which data is protected by which lock object, so you'll have to very carefully document your "locking convention" both for other people that might maintain your code, and for yourself (since even after a short time you will forget some of the nooks and crannies of your locking convention).
  • It's usually a good idea to protect the lock object from the outside world as much as you can. After all, you are using it for the very sensitive task of locking and you don't want it locked by external actors in unforeseen ways. That's why using this or a public field as a lock object is usually a bad idea.
  • The lock keyword is simply more convenient syntax for Monitor.Enter and Monitor.Exit (wrapped in a try/finally).
  • The lock object can be any object in .NET, but value types will be boxed in the call to Monitor.Enter, which means each thread ends up locking a different boxed copy rather than sharing the same lock object, leaving the data unprotected. Therefore, only use reference types as lock objects.
  • For inter-process communication you can use a global mutex, which can be created by passing a non-empty name to the Mutex constructor. Global mutexes provide essentially the same functionality as regular "local" locking, except they can be shared between separate processes (see the sketch after this list).
  • There are synchronization mechanisms other than locks, such as semaphores, condition variables, message queues or atomic operations. Be careful when mixing different synchronization mechanisms.
  • Locks also behave as memory barriers, which is increasingly important on modern multi-core, multi-cache CPUs. This is part of the reason why you need locks on reading the data and not just writing.
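
For the inter-process case mentioned above, a minimal sketch using a named mutex (the mutex name is made up; choose one unique to your application, and note that the "Global\" prefix is Windows-specific):

using System;
using System.Threading;

public static class CrossProcessExample
{
    public static void Main()
    {
        // A named mutex is visible to other processes on the same machine.
        using (var mutex = new Mutex(initiallyOwned: false, name: @"Global\MyAppSharedResource"))
        {
            mutex.WaitOne();           // block until no other process (or thread) holds the mutex
            try
            {
                // ... access the shared resource (a file, shared memory, etc.) ...
            }
            finally
            {
                mutex.ReleaseMutex();  // always release, even if an exception is thrown
            }
        }
    }
}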

(*) It is called a "race" because concurrent threads are "racing" towards performing an operation on the shared data, and whoever wins that race determines the outcome of the operation. So the outcome depends on the timing of the execution, which is essentially random on modern preemptive multitasking OSes. Worse yet, the timing is easily changed by the simple act of observing the program's execution through tools such as a debugger, which makes these bugs "heisenbugs" (i.e. the phenomenon being observed is changed by the mere act of observation).




Answer 2:


A lock object is like a door into a single room that only one guest at a time can enter. The room can be your data, the guest can be your function.

  • define the data (the room)
  • add a door (the lock object)
  • invite guests (the functions)
  • use the lock statement to close/open the door so that only one guest at a time can enter the room.

Why do we need this? If you write data to a file simultaneously from several threads (just one example; there are thousands of others), you need to synchronize your functions' access to the file (close/open the door for the guests) so that each function appends to the end of the file (assuming that is the requirement in this example).
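
A minimal sketch of that file example (the class name and file path are made up for illustration):

using System.IO;
using System.Threading.Tasks;

public class LogWriter
{
    private readonly object _fileLock = new object();
    private readonly string _path = "log.txt";   // hypothetical file path

    public void Append(string line)
    {
        lock (_fileLock)                         // only one "guest" writes at a time
        {
            File.AppendAllText(_path, line + "\n");
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var writer = new LogWriter();
        // Many concurrent guests, all going through the same door.
        Parallel.For(0, 100, i => writer.Append("entry " + i));
    }
}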

This is naturally not the only way to synchronize threads; there are more out there:

  • Monitors
  • Wait handles ...

Check out the link below for complete information and a description of each of them:

Thread Synchronization




Answer 3:


Yes, there is indeed another way:

using System.Runtime.CompilerServices;

class Test
{
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Foo()
    {
        // Now this instance is locked
    }
}

While it looks more "natural", it's not used often, because the method ends up locking on the object itself (this), so other code cannot risk locking on that object -- it could cause a deadlock.

Because of this, you usually create a (lazy-initialized) private field referring to an object, and use that object as a lock instead. This will guarantee that no one else can lock against the same object as you.
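
A sketch of that more common pattern, roughly equivalent to the Synchronized attribute above but locking on a private object rather than on this:

class Test
{
    // Private lock object: no code outside this class can ever lock on it.
    private readonly object _lock = new object();

    public void Foo()
    {
        lock (_lock)
        {
            // Only one thread at a time can be in here (per Test instance).
        }
    }
}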


A little more detail on what's happening beneath the hood:

When you "lock on an object", you're not locking on the object itself. Rather, you're using the object as a guaranteed-to-be-unique-address-in-memory throughout your program. When you "lock", the runtime takes the object's address, uses it to look up the actual lock inside another table (which is hidden from you), and uses that object as the ""lock" (also known as a "critical section").

So really, for you, an object is just a proxy/symbol -- it isn't doing anything by itself; it's just acting as a unique indicator that will never clash with another valid object in the same program.
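
For reference, the compiler expands a lock statement into roughly the following Monitor calls (simplified; this is the C# 4+ pattern with the lock-taken flag):

using System.Threading;

public static class LockExpansion
{
    private static readonly object _sync = new object();

    public static void Critical()
    {
        // lock (_sync) { ... } compiles down to approximately this:
        bool lockTaken = false;
        try
        {
            Monitor.Enter(_sync, ref lockTaken);
            // ... the protected code ...
        }
        finally
        {
            if (lockTaken)
                Monitor.Exit(_sync);
        }
    }
}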




Answer 4:


When different threads access the same variable/resource at the same time, they may overwrite each other's changes and you can get unexpected results. A lock makes sure that only one thread can access the variable at a time, while the remaining threads queue up for access to the variable/resource until the lock is released.

Suppose we have a balance variable for an account, and two different threads read its value, which is 100. The first thread adds 50 (100 + 50) and saves the result, so the balance becomes 150. The second thread had already read 100 in the meantime; suppose it subtracts 50 (100 - 50) and saves 50. But the first thread has already made the balance 150, so the second thread should have computed 150 - 50. This lost update can cause serious problems.

So a lock makes sure that when one thread wants to change some resource's state, it locks the resource and releases it only after committing the change.
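
A sketch of that balance example (the Account class is invented for illustration):

public class Account
{
    private readonly object _balanceLock = new object();
    private decimal _balance = 100m;

    public void Deposit(decimal amount)
    {
        lock (_balanceLock)
        {
            _balance += amount;   // the read-modify-write happens as one unit
        }
    }

    public void Withdraw(decimal amount)
    {
        lock (_balanceLock)
        {
            _balance -= amount;   // cannot interleave with Deposit on another thread
        }
    }

    public decimal Balance
    {
        get { lock (_balanceLock) { return _balance; } }
    }
}

With the lock in place, a concurrent Deposit(50) and Withdraw(50) starting from 100 always end at 100; without it, one of the two updates can be lost.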




Answer 5:


The lock statement introduces the concept of mutual exclusion. Only one thread can acquire a lock on a given object at any one time. This prevents threads from accessing shared data structures concurrently, thus corrupting them.

If other threads already hold a lock, the lock statement will block until it is able to acquire an exclusive lock on its argument before allowing its block to execute.

Note that the only thing lock does is control entry to the block of code. Access to members of the class is completely unrelated to the lock. It is up to the class itself to ensure that accesses that must be synchronized are coordinated by the use of lock or other synchronization primitives. Also note that access to some or all members may not have to be synchronized. For instance, if you want to maintain a counter, you could use the Interlocked class without locking.
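
For the counter case, a minimal sketch using Interlocked instead of a lock (Volatile.Read on the reading side assumes .NET 4.5 or later):

using System.Threading;

public class LockFreeCounter
{
    private int _count;

    public void Increment()
    {
        // Atomic increment; no lock statement needed.
        Interlocked.Increment(ref _count);
    }

    public int Read()
    {
        // A volatile read of the current value.
        return Volatile.Read(ref _count);
    }
}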


An alternative to locking is lock-free data structures, which behave correctly in the presence of multiple threads. Operations on lock-free data structures must be designed very carefully, usually with the assistance of lock-free primitives such as compare-and-swap (CAS).

The general theme of such techniques is to try to perform operations on data structures atomically and detect when operations fail due to concurrent actions by other threads, followed by retries. This works well on a lightly loaded system where failures are unlikely, but can produce runaway behaviour as the failure rate climbs and retries become a dominant load. This problem can be ameliorated by backing off the retry rate, effectively throttling the load.
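
A sketch of the compare-and-swap retry pattern in .NET, using Interlocked.CompareExchange (doubling a value lock-free is an arbitrary operation chosen for illustration):

using System.Threading;

public static class CasExample
{
    private static int _value = 1;

    public static void DoubleValue()
    {
        while (true)
        {
            int seen = Volatile.Read(ref _value);   // snapshot the current value
            int desired = seen * 2;                 // compute the new value from the snapshot

            // Publish only if nobody changed _value in the meantime; otherwise retry.
            if (Interlocked.CompareExchange(ref _value, desired, seen) == seen)
                return;
        }
    }
}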


A more sophisticated alternative is software transactional memory. Unlike CAS, STM generalizes the concept of fail-and-retry to arbitrarily complex memory operations. In simple terms, you start a transaction, perform all your operations, and finally commit. The system detects if the operations cannot succeed due to conflicting operations performed by other threads that beat the current thread to the punch. In such cases, STM can either fail outright, requiring the application to take corrective action, or, in more sophisticated implementations, it can automatically go back to the start of the transaction and try again.




Answer 6:


Your confusion is pretty typical for those just getting familiar with the lock keyword in C#. You are right, the object used in the lock statement is really nothing more than a token that defines a critical section. That object, in no way, has any protection from multithreaded access itself.

The way this works is that the CLR reserves a 4-byte (on 32-bit systems) field in the object header, alongside the type handle, called the sync block. The sync block is nothing more than an index into an array that stores the actual critical-section information. When you use the lock keyword, the CLR modifies this sync block value accordingly.

There are advantages and disadvantages to this scheme. The advantage is that it made for a fairly elegant solution to defining critical sections. One obvious disadvantage is that each object instance contains the sync block and most instances never use it so it would seem to be a waste of space in most cases. Another disadvantage is that boxed value types can be used which is almost always wrong and certainly leads to confusion.

I remember way back when .NET was first released that there was a lot of chatter over whether the lock keyword was good or bad for the language. The general consensus (at least as I remember it) was that it was bad because the using keyword could have been easily used instead. In fact, a solution that used the using keyword actually would have made more sense because it could have been done without the need for the sync block. The C# design team even went on record to say that had they been given a second chance the lock keyword never would have made it into the language.[1]


[1] The only reference I could find for this is on Jon Skeet's website here.



Source: https://stackoverflow.com/questions/10152613/multi-threading-concept-and-lock-in-c-sharp
