A python program I created is IO bounded. The majority of the time (over 90%) is spent in a single loop which repeats ~10,000 times. In this loop, ~100KB data is generated a
In my tests I've found that not only batch size affects overall performance, but also the nature of data itself. I've managed to get 5 times better write times compared to SSD in only one scenario: writing a 100MB chunk of pre-cooked random byte array to RAM drive. Writing more "predictable" data like letters "aaa" or current datetime yields quite opposite results - SSD is always faster or equal. So my guess is that opertating system (Win 7 in my case) does lots of caching and optimizations. Looks like the most hindering case for RAM-drive is when you perform lots of small writes instead of a few big ones, and RAM drive shines at writing large amounts of hard-to-compress data.