microbenchmark | 易学教程

First time a Java loop is run SLOW, why? [Sun HotSpot 1.5, sparc]

Question: While benchmarking some Java code on a Solaris SPARC box, I noticed that the first time I call the benchmarked function it runs EXTREMELY slowly (10x difference):

    First  | 1 | 25295.979 ms
    Second | 1 | 2256.990 ms
    Third  | 1 | 2250.575 ms

Why is this? I suspect the JIT compiler; is there any way to verify this? Edit: In light of some answers I wanted to clarify that this code is the simplest possible test case I could find exhibiting this behavior. So my goal isn't to get it to run fast, but to

Why lambda IntStream.anyMatch() is 10 slower than naive implementation?

Question: I was recently profiling my code and found an interesting bottleneck in it. Here is the benchmark:

    @BenchmarkMode(Mode.Throughput)
    @Fork(1)
    @State(Scope.Thread)
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    public class Contains {
        private int[] ar = new int[] {1,2,3,4,5,6,7};
        private int val = 5;

        @Benchmark
        public boolean naive() {
            return contains(ar, val);
        }

        @Benchmark
        public boolean

Why the bounds check doesn't get eliminated?

Question: I wrote a simple benchmark to find out whether the bounds check can be eliminated when the array index is computed via bitwise AND. This is basically what nearly all hash tables do: they compute h & (table.length - 1) as an index into the table, where h is the hashCode or a derived value. The results show that the bounds check does not get eliminated. The idea of my benchmark is pretty simple: compute two values i and j, both guaranteed to be valid array indexes. i is the loop counter; when it is used as an array index, the bounds check gets eliminated. j gets computed as x & (table
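To make the indexing idiom in this question concrete, here is a minimal sketch of masking a hash into a power-of-two-sized table. The class name MaskIndexDemo, the helper indexFor, and the sample hash values are assumptions made for illustration; they are not taken from the asker's benchmark, which is cut off above. The sketch only shows that the masked value is a valid index for any non-empty array, including for negative hashes. It does not measure whether the JIT removes the bounds check.

```java
public class MaskIndexDemo {
    // Hypothetical helper: maps any int hash to a slot index.
    // For tableLength > 0, (tableLength - 1) is non-negative, so the result
    // always lies in the range 0 .. tableLength - 1.
    public static int indexFor(int h, int tableLength) {
        return h & (tableLength - 1);
    }

    public static void main(String[] args) {
        int[] table = new int[16]; // power-of-two length, as hash tables typically use
        int[] hashes = {0, 5, 17, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            int j = indexFor(h, table.length);
            table[j]++; // never throws here, because j is in range
            System.out.println(h + " -> " + j);
        }
    }
}
```

Note that the guarantee depends on the array being non-empty: with a length of 0 the mask becomes -1 and the expression returns h unchanged.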

Java for loop performance question

Question: Consider this example:

    public static void main(final String[] args) {
        final List<String> myList = Arrays.asList("A", "B", "C", "D");
        final long start = System.currentTimeMillis();
        for (int i = 1000000; i > myList.size(); i--) {
            System.out.println("Hello");
        }
        final long stop = System.currentTimeMillis();
        System.out.println("Finish: " + (stop - start));
    }

vs

    public static void main(final String[] args) {
        final List<String> myList = Arrays.asList("A", "B", "C", "D");
        final long start = System

Why is the StringBuilder chaining pattern sb.append(x).append(y) faster than regular sb.append(x); sb.append(y)?

Question: I have a microbenchmark that shows very strange results:

    @BenchmarkMode(Mode.Throughput)
    @Fork(1)
    @State(Scope.Thread)
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS, batchSize = 1000)
    @Measurement(iterations = 40, time = 1, timeUnit = TimeUnit.SECONDS, batchSize = 1000)
    public class Chaining {
        private String a1 = "111111111111111111111111";
        private String a2 = "222222222222222222222222";
        private String a3 = "333333333333333333333333";

        @Benchmark
        public String typicalChaining(

Strange JIT pessimization of a loop idiom

Question: While analyzing the results of a recent question here, I encountered a quite peculiar phenomenon: apparently an extra layer of HotSpot's JIT optimization actually slows down execution on my machine. Here is the code I used for the measurement:

    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    @OperationsPerInvocation(Measure.ARRAY_SIZE)
    @Warmup(iterations = 2, time = 1)
    @Measurement(iterations = 5, time = 1)
    @State(Scope.Thread)
    @Threads(1)
    @Fork(2)
    public class

Weird performance effects from nearby dependent stores in a pointer-chasing loop on IvyBridge. Adding an extra load speeds it up?

Question: First, I have the setup below on an IvyBridge; I will insert the payload code to be measured at the commented location. The first 8 bytes of buf store the address of buf itself, which I use to create a loop-carried dependency:

    section .bss
    align   64
    buf:    resb 64

    section .text
    global _start
    _start:
        mov rcx, 1000000000
        mov qword [buf], buf
        mov rax, buf
    loop:
        ; I will insert payload here
        ; as is described below

        dec rcx
        jne loop

        xor rdi, rdi
        mov rax, 60
        syscall

Case 1: I insert into the payload location: mov

How do I write a correct micro-benchmark in Java?

Question: How do you write (and run) a correct micro-benchmark in Java? I'm looking for some code samples and comments illustrating various things to think about. Example: Should the benchmark measure time/iteration or iterations/time, and why? Related: Is stopwatch benchmarking acceptable?

Answer 1: Tips about writing micro benchmarks from the creators of Java HotSpot: Rule 0: Read a reputable paper on JVMs and micro-benchmarking. A good one is Brian Goetz, 2005. Do not expect too much from micro
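Since the question asks for a code sample, here is a minimal sketch of a hand-rolled stopwatch benchmark with a discarded warm-up phase. Everything in it is an assumption made for illustration: the class name WarmupBenchmarkSketch, the sumOfSquares workload, the iteration counts, and the sink array used to keep results alive. It is not the code from the answer, which is cut off above, and a dedicated harness such as JMH is generally a more trustworthy choice for real measurements.

```java
import java.util.function.IntSupplier;

public class WarmupBenchmarkSketch {
    // Hypothetical workload: sum of i*i for i in [0, n).
    public static int sumOfSquares(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i * i;
        }
        return sum;
    }

    // Runs the task `iterations` times and returns the average time per call
    // in nanoseconds (time/iteration).
    public static double averageNanos(IntSupplier task, int iterations, int[] sink) {
        long start = System.nanoTime();
        int acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += task.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        sink[0] += acc; // consume the results so the work is not trivially dead code
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        int[] sink = new int[1];
        IntSupplier task = () -> sumOfSquares(1_000);

        // Warm-up phase: give the JIT a chance to compile the hot code; timing discarded.
        averageNanos(task, 20_000, sink);

        // Measurement phase: several rounds, so run-to-run variation is visible.
        for (int round = 1; round <= 3; round++) {
            double ns = averageNanos(task, 20_000, sink);
            System.out.printf("round %d: %.1f ns/op%n", round, ns);
        }
        System.out.println("sink = " + sink[0]);
    }
}
```

Running it with the HotSpot flag -XX:+PrintCompilation shows when methods are compiled, which helps check whether compilation is still happening during the measured rounds.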