microbenchmark

Capturing (externally) the memory consumption of a given Callback

戏子无情 submitted on 2019-11-29 15:54:31
The Problem

Let's say I have this function:

    function hog($i = 1) // uses $i * 0.5 MiB, returns $i * 0.25 MiB
    {
        $s = str_repeat('a', $i * 1024 * 512);
        return substr($s, $i * 1024 * 256);
    }

I would like to call it and be able to inspect the maximum amount of memory it uses. In other words: memory_get_function_peak_usage($callback). Is this possible?

What I Have Tried

I'm using the following values as my non-monotonically increasing $i argument for hog():

    $iterations = array_merge(range(0, 50, 10), range(50, 0, 5));
    $iterations = array_fill_keys($iterations, 0);

Which is essentially: ( [0] => 0
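The same idea, resetting peak counters before the call and reading them afterwards, can be sketched in Java, used here purely to illustrate the technique the question asks about; the helper name peakHeapDuring and the 10 MiB workload are hypothetical, and the numbers are approximate since GC activity and other threads also move the pool counters:

```java
// Sketch: approximate a callback's peak heap usage by resetting and reading
// the JVM's per-pool peak-usage counters around the call.
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.List;
import java.util.function.Supplier;

public class PeakMemory {
    static long peakHeapDuring(Supplier<?> callback) {
        List<MemoryPoolMXBean> pools = ManagementFactory.getMemoryPoolMXBeans();
        for (MemoryPoolMXBean p : pools)
            if (p.getType() == MemoryType.HEAP) p.resetPeakUsage();
        callback.get();  // run the callback whose footprint we want to observe
        long peak = 0;
        for (MemoryPoolMXBean p : pools)
            if (p.getType() == MemoryType.HEAP) peak += p.getPeakUsage().getUsed();
        return peak;
    }

    public static void main(String[] args) {
        // Hypothetical workload: allocate ~10 MiB inside the callback.
        long peak = peakHeapDuring(() -> new byte[10 * 1024 * 1024]);
        System.out.println("peak heap during callback: " + peak + " bytes");
    }
}
```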

Difference between MATLAB's numel and length functions

巧了我就是萌 submitted on 2019-11-29 15:53:18
Question: I know that length(x) returns max(size(x)) and numel(x) returns the total number of elements of x, but which is better for a 1-by-n array? Does it matter, or are they interchangeable in this case? EDIT: Just for kicks: it looks like they're the same performance-wise until you get to 100k elements.
Answer 1: In that case they return the same result and there is no difference. In terms of performance, it depends on the inner workings of arrays in MATLAB, e.g. if there are metainformations about how many elements

Why does jnz require 2 cycles to complete in an inner loop?

末鹿安然 submitted on 2019-11-29 14:22:05
I'm on an IvyBridge. I found the performance behavior of jnz inconsistent between the inner loop and the outer loop. The following simple program has an inner loop with a fixed size of 16:

    global _start
    _start:
        mov rcx, 100000000
    .loop_outer:
        mov rax, 16
    .loop_inner:
        dec rax
        jnz .loop_inner
        dec rcx
        jnz .loop_outer
        xor edi, edi
        mov eax, 60
        syscall

The perf tool shows the outer loop runs at 32 cycles per iteration, which suggests that jnz requires 2 cycles to complete. I then searched Agner's instruction tables; conditional jumps have a "reciprocal throughput" of 1-2, with the comment "fast if no jump". At this point I started to believe the above

First time a Java loop is run SLOW, why? [Sun HotSpot 1.5, sparc]

随声附和 submitted on 2019-11-29 13:18:11
While benchmarking some Java code on a Solaris SPARC box, I noticed that the first time I call the benchmarked function it runs EXTREMELY slowly (a 10x difference):

    First  | 1 | 25295.979 ms
    Second | 1 |  2256.990 ms
    Third  | 1 |  2250.575 ms

Why is this? I suspect the JIT compiler; is there any way to verify this? Edit: In light of some answers, I wanted to clarify that this code is the simplest possible test case I could find exhibiting this behavior. So my goal isn't to get it to run fast, but to understand what's going on so I can avoid it in my real benchmarks. Solved: Tom Hawtin correctly pointed
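The warm-up effect described here can be reproduced with a minimal sketch; the work() method below is a hypothetical stand-in for the original benchmarked function, which the excerpt does not show:

```java
// Minimal JIT warm-up demonstration: the first timed call typically includes
// interpretation and compilation overhead; later calls run compiled code.
// Running with -XX:+PrintCompilation shows when HotSpot compiles work().
public class WarmupDemo {
    // Stand-in workload (not the original code from the question).
    static long work() {
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) sum += i % 7;
        return sum;
    }

    public static void main(String[] args) {
        for (int run = 1; run <= 3; run++) {
            long t0 = System.nanoTime();
            long result = work();
            long t1 = System.nanoTime();
            System.out.printf("Run %d: %.3f ms (result=%d)%n",
                              run, (t1 - t0) / 1e6, result);
        }
    }
}
```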

Why is lambda IntStream.anyMatch() 10x slower than a naive implementation?

五迷三道 submitted on 2019-11-29 08:46:34
I was recently profiling my code and found an interesting bottleneck in it. Here is the benchmark:

    @BenchmarkMode(Mode.Throughput)
    @Fork(1)
    @State(Scope.Thread)
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    public class Contains {
        private int[] ar = new int[] {1, 2, 3, 4, 5, 6, 7};
        private int val = 5;

        @Benchmark
        public boolean naive() {
            return contains(ar, val);
        }

        @Benchmark
        public boolean lambdaArrayStreamContains() {
            return Arrays.stream(ar).anyMatch(i -> i == val);
        }

        @Benchmark
        public boolean
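The excerpt cuts off before the contains helper that naive() calls; a plausible reconstruction (an assumption, since the original body is not shown) is a plain linear scan:

```java
// Hypothetical reconstruction of the truncated contains() helper:
// a plain linear scan over the array.
public class NaiveContains {
    static boolean contains(int[] ar, int val) {
        for (int x : ar) {
            if (x == val) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] ar = {1, 2, 3, 4, 5, 6, 7};
        System.out.println(contains(ar, 5));  // true
        System.out.println(contains(ar, 9));  // false
    }
}
```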

Java for loop performance question

*爱你&永不变心* submitted on 2019-11-28 23:11:43
Considering this example:

    public static void main(final String[] args) {
        final List<String> myList = Arrays.asList("A", "B", "C", "D");
        final long start = System.currentTimeMillis();
        for (int i = 1000000; i > myList.size(); i--) {
            System.out.println("Hello");
        }
        final long stop = System.currentTimeMillis();
        System.out.println("Finish: " + (stop - start));
    }

vs.

    public static void main(final String[] args) {
        final List<String> myList = Arrays.asList("A", "B", "C", "D");
        final long start = System.currentTimeMillis();
        final int size = myList.size();
        for (int i = 1000000; i > size; i--) {
            System.out
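To make the comparison concrete without guessing at the truncated lines, here is a self-contained sketch of the two loop shapes, with the println replaced by a counter so the loops finish quickly (a rework for illustration, not the original code):

```java
// The point of the question: does evaluating size() in the loop condition
// on every iteration cost anything versus hoisting it into a local?
import java.util.Arrays;
import java.util.List;

public class LoopBound {
    static int callInCondition(List<String> list) {
        int count = 0;
        // size() appears in the condition, so it is evaluated each iteration
        for (int i = 1000000; i > list.size(); i--) count++;
        return count;
    }

    static int hoisted(List<String> list) {
        int count = 0;
        final int size = list.size();  // bound hoisted out of the loop
        for (int i = 1000000; i > size; i--) count++;
        return count;
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList("A", "B", "C", "D");
        // Both variants run the same 999996 iterations.
        System.out.println(callInCondition(myList) == hoisted(myList));  // true
    }
}
```

(In practice the JIT can often hoist the size() call itself, which is why such micro-differences are hard to observe with wall-clock timing.)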

Why is the StringBuilder chaining pattern sb.append(x).append(y) faster than regular sb.append(x); sb.append(y)?

帅比萌擦擦* submitted on 2019-11-28 16:30:47
I have a microbenchmark that shows very strange results:

    @BenchmarkMode(Mode.Throughput)
    @Fork(1)
    @State(Scope.Thread)
    @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS, batchSize = 1000)
    @Measurement(iterations = 40, time = 1, timeUnit = TimeUnit.SECONDS, batchSize = 1000)
    public class Chaining {
        private String a1 = "111111111111111111111111";
        private String a2 = "222222222222222222222222";
        private String a3 = "333333333333333333333333";

        @Benchmark
        public String typicalChaining() {
            return new StringBuilder().append(a1).append(a2).append(a3).toString();
        }

        @Benchmark
        public String
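The second benchmark method is truncated; given the title, its non-chained counterpart would plausibly look like the following (a hedged reconstruction, not the original source):

```java
// Hypothetical reconstruction of the truncated non-chained benchmark body:
// the same three appends, but as separate statements on a local variable
// instead of one chained expression.
public class NoChaining {
    private String a1 = "111111111111111111111111";
    private String a2 = "222222222222222222222222";
    private String a3 = "333333333333333333333333";

    public String noChaining() {
        StringBuilder sb = new StringBuilder();
        sb.append(a1);
        sb.append(a2);
        sb.append(a3);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three 24-character strings concatenated: 72 characters total.
        System.out.println(new NoChaining().noChaining().length());  // 72
    }
}
```

Both forms do the same appends; the question is why the chained expression benchmarks faster.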

Calculating AES/CCM encryption time in Visual Studio 2017

╄→гoц情女王★ submitted on 2019-11-28 14:46:33
I am using the Crypto++ 5.6.5 library with Visual Studio 2017. How can I calculate the encryption time for AES-CCM? The Crypto++ wiki provides a Benchmarks article. It gives a lot of detail about library performance and how throughput is calculated, and it even references the source code where the actual throughput is measured. Believe it or not, a simple call to clock works just fine for measuring bulk encryption. Also see Benchmarks | Timing Loop in the same wiki article. To benchmark AES/CCM, do something like the

Hidden performance cost in Scala?

假装没事ソ submitted on 2019-11-28 13:53:53
Question: I came across this old question and did the following experiment with Scala 2.10.3. I rewrote the Scala version to use explicit tail recursion:

    import scala.annotation.tailrec

    object ScalaMain {
      private val t = 20

      private def run() {
        var i = 10
        while (!isEvenlyDivisible(2, i, t))
          i += 2
        println(i)
      }

      @tailrec
      private def isEvenlyDivisible(i: Int, a: Int, b: Int): Boolean = {
        if (i > b) true
        else (a % i == 0) && isEvenlyDivisible(i + 1, a, b)
      }

      def main(args: Array[String]) {
        val t1 = System

Strange JIT pessimization of a loop idiom

你说的曾经没有我的故事 submitted on 2019-11-28 09:03:58
While analyzing the results of a recent question here, I encountered a quite peculiar phenomenon: apparently an extra layer of HotSpot's JIT optimization actually slows down execution on my machine. Here is the code I used for the measurement:

    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    @OperationsPerInvocation(Measure.ARRAY_SIZE)
    @Warmup(iterations = 2, time = 1)
    @Measurement(iterations = 5, time = 1)
    @State(Scope.Thread)
    @Threads(1)
    @Fork(2)
    public class Measure {
        public static final int ARRAY_SIZE = 1024;
        private final int[] array = new int[ARRAY_SIZE];