In my multithreaded application and I see heavy lock contention in it, preventing good scalability across multiple cores. I have decided to use lock free programming to solv
As sblundy pointed out, if all objects are immutable, read-only, you don't need to worry about locking, however, this means you may have to copy objects a lot. Copying usually involves malloc and malloc uses locking to synchronize memory allocations across threads, so immutable objects may buy you less than you think (malloc itself scales rather badly and malloc is slow; if you do a lot of malloc in a performance critical section, don't expect good performance).
When you only need to update simple variables (e.g. 32 or 64 bit int or pointers), perform simply addition or subtraction operations on them or just swap the values of two variables, most platforms offer "atomic operations" for that (further GCC offers these as well). Atomic is not the same as thread-safe. However, atomic makes sure, that if one thread writes a 64 bit value to a memory location for example and another thread reads from it, the reading one either gets the value before the write operation or after the write operation, but never a broken value in-between the write operation (e.g. one where the first 32 bit are already the new, the last 32 bit are still the old value! This can happen if you don't use atomic access on such a variable).
However, if you have a C struct with 3 values, that want to update, even if you update all three with atomic operations, these are three independent operations, thus a reader might see the struct with one value already being update and two not being updated. Here you will need a lock if you must assure, the reader either sees all values in the struct being either the old or the new values.
One way to make locks scale a lot better is using R/W locks. In many cases, updates to data are rather infrequent (write operations), but accessing the data is very frequent (reading the data), think of collections (hashtables, trees). In that case R/W locks will buy you a huge performance gain, as many threads can hold a read-lock at the same time (they won't block each other) and only if one thread wants a write lock, all other threads are blocked for the time the update is performed.
The best way to avoid thread-issues is to not share any data across threads. If every thread deals most of the time with data no other thread has access to, you won't need locking for that data at all (also no atomic operations). So try to share as little data as possible between threads. Then you only need a fast way to move data between threads if you really have to (ITC, Inter Thread Communication). Depending on your operating system, platform and programming language (unfortunately you told us neither of these), various powerful methods for ITC might exist.
And finally, another trick to work with shared data but without any locking is to make sure threads don't access the same parts of the shared data. E.g. if two threads share an array, but one will only ever access even, the other one only odd indexes, you need no locking. Or if both share the same memory block and one only uses the upper half of it, the other one only the lower one, you need no locking. Though it's not said, that this will lead to good performance; especially not on multi-core CPUs. Write operations of one thread to this shared data (running one core) might force the cache to be flushed for another thread (running on another core) and these cache flushes are often the bottle neck for multithread applications running on modern multi-core CPUs.