My problem is, I have block matrix updated by multiple threads. Multiple threads may be updating disjoint block at a time but in general there may be race conditions. right
It might be a several approaches you can use:
Pre-allocate array if CriticalSection/Mutexes (2500 is not so many), and use block index as a lock index to gather block access; before updating blocks lock all of them which you want to change; do update; unlock;
If computation time is significantly longer then Lock/Unlock, then copy content of the block in thread context and keep block unlocked during that time; before updating block lock it again and make sure that it was not updated by another thread (if it's relevant for this); if it was updated by another thread, then repeat operation;
If the size of the block content is small, use atomic data exchange to update block content, no locks required; just not sure if you use data from one blocks to calculate data for another, in that case locking across all updated blocks required;
Is there reading operation on the matrix? If yes use Read/Write locks to improve performance.