benchmarking

Benchmarking code - am I doing it right?

Submitted by 青春壹個敷衍的年華 on 2019-12-01 05:51:48
Question: I want to benchmark some C/C++ code. I want to measure CPU time, wall time, and cycles/byte. I wrote some measurement functions but have a problem with cycles/byte. To get CPU time I call getrusage() with RUSAGE_SELF; for wall time I use clock_gettime() with CLOCK_MONOTONIC; to get cycles/byte I use rdtsc. I process an input buffer of some size, for example 1024: char buffer[1024]. How do I benchmark: do a warm-up phase, simply calling fun2measure(args) 1000 times: for(int i=0; i<1000; i++)

Changing the file descriptor size in httperf

Submitted by 依然范特西╮ on 2019-12-01 05:47:32
I'm doing a series of benchmarks and found the httperf tool. But the version in my Ubuntu 12.04 is built with too small a file descriptor limit, because it warns me with this message: httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE. There used to be a guide to compiling httperf with a bigger size at http://gom-jabbar.org/articles/2009/02/04/httperf-and-file-descriptors but the site is down now. Does anyone know the steps to compile the tool with the proper settings? I've always followed the instructions here, which should set the global values properly. You can

ActiveRecords select(:id).collect vs. pluck(:id) methods: Why is pure AR “pluck” slower?

Submitted by 喜欢而已 on 2019-12-01 04:42:43
I am trying to get all the ids from my Article model. I can do it in two ways: Article.select(:id).collect{|a| a.id} Article Load (2.6ms) SELECT "articles"."id" FROM "articles" OR 2.2.1 :006 > Article.pluck(:id) (4.3ms) SELECT "articles"."id" FROM "articles" What gives? Why is the AR pluck slower than the Ruby version? Even when I benchmark the Ruby method, it seems faster: Benchmark.measure{Article.select(:id).collect{|a| a.id}} Article Load (1.9ms) SELECT "articles"."id" FROM "articles" => #<Benchmark::Tms:0x007feb12060658 @label="", @real=0.026455502957105637, @cstime=0.0, @cutime=0.0, @stime=0.0,

cargo test --release causes a stack overflow. Why doesn't cargo bench?

Submitted by 旧时模样 on 2019-12-01 03:35:12
While trying to write an optimized DSP algorithm, I was wondering about the relative speed of stack allocation vs. heap allocation, and about the size limits of stack-allocated arrays. I realize there is a stack frame size limit, but I don't understand why the following runs, generating seemingly realistic benchmark results with cargo bench, yet fails with a stack overflow when run with cargo test --release. #![feature(test)] extern crate test; #[cfg(test)] mod tests { use test::Bencher; #[bench] fn it_works(b: &mut Bencher) { b.iter(|| { let stack = [[[0.0; 2]; 512]; 512]; }); } } kennytm: To get things


How to Disable Dynamic Frequency Scaling?

Submitted by 醉酒当歌 on 2019-12-01 00:11:55
I would like to do some microbenchmarks, and I am trying to do them right. Unfortunately, dynamic frequency scaling makes benchmarking highly unreliable. Is there a way to programmatically (C++, Windows) find out whether dynamic frequency scaling is enabled? If so, can it be disabled from within a program? I've tried just using a warm-up phase that uses 100% CPU for a second before the actual benchmark takes place, but that turned out not to be reliable either. UPDATE: Even when I disable SpeedStep in the BIOS, CPU-Z shows that the frequency changes between 1995 and 2826 MHz. In general, you need to do the following

Is there a way to count the number of IL instructions executed?

Submitted by ╄→гoц情女王★ on 2019-11-30 22:55:27
I want to do some benchmarking of a C# process, but I don't want to use time as my metric: I want to count the number of IL instructions executed in a particular method call. Is this possible? Edit: I don't mean static analysis of a method body. I'm referring to the actual number of instructions executed, so if, for example, the method body includes a loop, the count would increase by the number of instructions in the loop body multiplied by the number of iterations. I don't think it's possible to do what you want. This is because the IL is only used during JIT (Just

performance for reads of nsdictionary vs nsarray

Submitted by 霸气de小男生 on 2019-11-30 22:11:59
Continuing from this post: Performance hit incurred using NSMutableDictionary vs. NSMutableArray. I am trying to run a little test to see whether the performance gap between NSArray and NSDictionary, as well as their mutable counterparts, is really that great for reads and writes... However, I am having difficulty devising a "balanced" test, because the dictionary has 2 (or 3, depending on how you count them) objects to loop through to get the value (not the key) being sought, while the array has only one... Any suggestions? -- If you want more details: what I mean is easier to explain through examples; for the array

data.table time subset vs xts time subset

Submitted by 穿精又带淫゛_ on 2019-11-30 21:31:51
Hi, I am looking to subset some minute-level data by time. I normally use xts, doing something like: subset.string <- 'T10:00/T13:00' xts.min.obj[subset.string] to get all the rows that fall between 10am and 1pm (inclusive) EACH DAY, with the output in xts format. But it is a bit slow for my purposes, e.g. j <- xts(rnorm(10e6),Sys.time()-(10e6:1)) system.time(j['T10:00/T16:00']) user system elapsed 5.704 0.577 17.115 I know that data.table is very fast at subsetting large datasets, so I am wondering whether, in conjunction with the fasttime package for fast POSIXct creation, it would be