Why is my System.nanoTime() broken?

巧了我就是萌 提交于 2019-11-30 08:13:22

问题


Myself and another developer on my time recently moved from a Core 2 Duo machine at work to a new Core 2 Quad 9505; both running Windows XP SP3 32-bit with JDK 1.6.0_18.

Upon doing so, a couple of our automated unit tests for some timing/statistics/metrics aggregation code promptly started failing, due to what appear to be ridiculous values coming back from System.nanoTime().

Test code that shows this behaviour, reliably, on my machine is:

import static org.junit.Assert.assertThat;

import org.hamcrest.Matchers;
import org.junit.Test;

public class NanoTest {

  @Test
  public void testNanoTime() throws InterruptedException {
    final long sleepMillis = 5000;

    long nanosBefore = System.nanoTime();
    long millisBefore = System.currentTimeMillis();

    Thread.sleep(sleepMillis);

    long nanosTaken = System.nanoTime() - nanosBefore;
    long millisTaken = System.currentTimeMillis() - millisBefore;

    System.out.println("nanosTaken="+nanosTaken);
    System.out.println("millisTaken="+millisTaken);

    // Check it slept within 10% of requested time
    assertThat((double)millisTaken, Matchers.closeTo(sleepMillis, sleepMillis * 0.1));
    assertThat((double)nanosTaken, Matchers.closeTo(sleepMillis * 1000000, sleepMillis * 1000000 * 0.1));
  }

}

Typical output:

millisTaken=5001
nanosTaken=2243785148

Running it 100x yields nano results between 33% and 60% of the actual sleep time; usually around 40% though.

I understand the weaknesses in accuracy of timers in Windows, and have read related threads like Is System.nanoTime() consistent across threads?, however my understanding is that System.nanoTime() is intended for exactly the purpose we're using it :- measuring elapsed time; more accurately than currentTimeMillis().

Does anyone know why it's returning such crazy results? Is this likely to be a hardware architecture problem (the only major thing that has changed is the CPU/Motherboard on this machine)? A problem with the Windows HAL with my current hardware? A JDK problem? Should I abandon nanoTime()? Should I log a bug somewhere, or any suggestions on how I could investigate further?

UPDATE 19/07 03:15 UTC: After trying finnw's test case below I did some more Googling, coming across entries such as bugid:6440250. It also reminded me of some other strange behaviour I noticed late Friday where pings were coming back negative. So I added /usepmtimer to my boot.ini and now all the tests behave as expected., and my pings are normal too.

I'm a bit confused about why this was still an issue though; from my reading I thought TSC vs PMT issues were largely resolved in Windows XP SP3. Could it be because my machine was originally SP2, and was patched to SP3 rather than installed originally as SP3? I now also wonder whether I should be installing patches like the one at MS KB896256. Maybe I should take this up with the corporate desktop build team?


回答1:


The problem was resolved (with some open suspicions about the suitability of nanoTime() on multi-core systems!) by adding /usepmtimer to the end of my C:\boot.ini string; forcing Windows to use the Power Management timer rather than the TSC. It's an open question as to why I needed to do this given I'm on XP SP3, as I understood that this was the default, however perhaps it was due to the manner in which my machine was patched to SP3.




回答2:


On my system (Windows 7 64-Bit, Core i7 980X):

nanosTaken=4999902563
millisTaken=5001

System.nanoTime() uses OS-specific calls, so I expect that you are seeing a bug in your Windows/processor combination.




回答3:


You probably want to read the answers to this other stack overflow question: Is System.nanoTime() completely useless?.

In summary, it would appear that nanoTime relies on operating system timers that may be affected by the presence of multiple core CPUs. As such, nanoTime may not be that useful on certain combinations of OS and CPU, and care should be taken when using it in portable Java code that you intend to run on multiple target platforms. There seems to be a lot of complaining on the web on this subject, but not much consensus on a meaningful alternative.




回答4:


It is difficult to tell whether this is a bug or just normal timer variation between cores.

An experiment you could try is to use native calls to force the thread to run on a specific core.

Also, to rule out power management effects, try spinning in a loop as an alternative to sleep():

import com.sun.jna.Native;
import com.sun.jna.NativeLong;
import com.sun.jna.platform.win32.Kernel32;
import com.sun.jna.platform.win32.W32API;

public class AffinityTest {

    private static void testNanoTime(boolean sameCore, boolean spin)
    throws InterruptedException {
        W32API.HANDLE hThread = kernel.GetCurrentThread();
        final long sleepMillis = 5000;

        kernel.SetThreadAffinityMask(hThread, new NativeLong(1L));
        Thread.yield();
        long nanosBefore = System.nanoTime();
        long millisBefore = System.currentTimeMillis();

        kernel.SetThreadAffinityMask(hThread, new NativeLong(sameCore? 1L: 2L));
        if (spin) {
            Thread.yield();
            while (System.currentTimeMillis() - millisBefore < sleepMillis)
                ;
        } else {
            Thread.sleep(sleepMillis);
        }

        long nanosTaken = System.nanoTime() - nanosBefore;
        long millisTaken = System.currentTimeMillis() - millisBefore;

        System.out.println("nanosTaken="+nanosTaken);
        System.out.println("millisTaken="+millisTaken);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Sleeping, different cores");
        testNanoTime(false, false);
        System.out.println("\nSleeping, same core");
        testNanoTime(true, false);
        System.out.println("\nSpinning, different cores");
        testNanoTime(false, true);
        System.out.println("\nSpinning, same core");
        testNanoTime(true, true);
    }

    private static final Kernel32Ex kernel =
        (Kernel32Ex) Native.loadLibrary(Kernel32Ex.class);

}

interface Kernel32Ex extends Kernel32 {
    NativeLong SetThreadAffinityMask(HANDLE hThread, NativeLong dwAffinityMask);
}

If you get very different results depending on core selection (e.g. 5000ms on the same core but 2200ms on different cores) that would suggest that the problem is just natural timer variation between cores.

If you get very different results from sleeping vs. spinning, it is more likely due to power management slowing down the clocks.

If none of the four results are close to 5000ms, then it might be a bug.



来源:https://stackoverflow.com/questions/3274892/why-is-my-system-nanotime-broken

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!