According to the official documentation, we can simply use collective I/O to boost performance. For example, I can use n processes to read the same file at different offsets, as follows: