benchmarking

Benchmarking code - am I doing it right?

Submitted by 青春壹個敷衍的年華 on 2019-12-01 05:51:48
Question: I want to benchmark some C/C++ code. I want to measure CPU time, wall time, and cycles/byte. I wrote some measurement functions but have a problem with cycles/byte. To get CPU time I call getrusage() with RUSAGE_SELF; for wall time I use clock_gettime() with CLOCK_MONOTONIC; to get cycles/byte I use rdtsc. I process an input buffer of some size, for example 1024: char buffer[1024]. How do I benchmark: do a warm-up phase, simply calling fun2measure(args) 1000 times: for(int i=0; i<1000; i++)

Changing the file descriptor size in httperf

Submitted by 依然范特西╮ on 2019-12-01 05:47:32
I'm doing a series of benchmarks and found the httperf tool. But the version in my Ubuntu 12.04 is built with too small a file descriptor limit, because it warns me with this message: httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE. There used to be a guide to compiling httperf with a bigger size at http://gom-jabbar.org/articles/2009/02/04/httperf-and-file-descriptors but the site is down now. Does anyone know the steps to compile the tool with the proper settings? I've always followed the instructions here, which should set the global values properly. You can

ActiveRecords select(:id).collect vs. pluck(:id) methods: Why is pure AR “pluck” slower?

Submitted by 喜欢而已 on 2019-12-01 04:42:43
I am trying to get all the ids from my Article model. I can do it in two ways: Article.select(:id).collect{|a| a.id} Article Load (2.6ms) SELECT "articles"."id" FROM "articles" OR 2.2.1 :006 > Article.pluck(:id) (4.3ms) SELECT "articles"."id" FROM "articles" What gives? Why is the AR pluck slower than the Ruby version? Even when I benchmark the Ruby method, it seems faster: Benchmark.measure{Article.select(:id).collect{|a| a.id}} Article Load (1.9ms) SELECT "articles"."id" FROM "articles" => #<Benchmark::Tms:0x007feb12060658 @label="", @real=0.026455502957105637, @cstime=0.0, @cutime=0.0, @stime=0.0,

cargo test --release causes a stack overflow. Why doesn't cargo bench?

Submitted by 旧时模样 on 2019-12-01 03:35:12
While trying to write an optimized DSP algorithm, I was wondering about the relative speed of stack allocation vs. heap allocation, and about the size limits of stack-allocated arrays. I realize there is a stack frame size limit, but I don't understand why the following runs, generating seemingly realistic benchmark results with cargo bench, yet fails with a stack overflow when run with cargo test --release. #![feature(test)] extern crate test; #[cfg(test)] mod tests { use test::Bencher; #[bench] fn it_works(b: &mut Bencher) { b.iter(|| { let stack = [[[0.0; 2]; 512]; 512]; }); } } kennytm: To get things


How to Disable Dynamic Frequency Scaling?

Submitted by 醉酒当歌 on 2019-12-01 00:11:55
I would like to do some microbenchmarks, and I am trying to do them right. Unfortunately, dynamic frequency scaling makes benchmarking highly unreliable. Is there a way to programmatically (C++, Windows) find out whether dynamic frequency scaling is enabled? If so, can it be disabled from within a program? I've tried just using a warm-up phase that uses 100% CPU for a second before the actual benchmark takes place, but that turned out not to be reliable either. UPDATE: Even when I disable SpeedStep in the BIOS, CPU-Z shows that the frequency changes between 1995 and 2826 MHz. In general, you need to do the following

Is there a way to count the number of IL instructions executed?

Submitted by ╄→гoц情女王★ on 2019-11-30 22:55:27
I want to do some benchmarking of a C# process, but I don't want to use time as my metric: I want to count the number of IL instructions executed in a particular method call. Is this possible? Edit: I don't mean static analysis of a method body. I'm referring to the actual number of instructions executed, so if, for example, the method body includes a loop, the count would increase by the number of instructions in the loop body multiplied by the number of iterations. I don't think it's possible to do what you want. This is because the IL is only used during JIT (Just

performance for reads of nsdictionary vs nsarray

Submitted by 霸气de小男生 on 2019-11-30 22:11:59
Continuing from this post: Performance hit incurred using NSMutableDictionary vs. NSMutableArray. I am trying to run a little test to see whether the performance gap between NSArray and NSDictionary, as well as their mutable counterparts, is really that great for reads and writes... However, I am having difficulty devising a "balanced" test, because the dictionary has 2 (or 3, depending on how you count them) objects to loop through to get the value (not the key) being sought, while the array has only one... Any suggestions? -- If you want more details: what I mean is easier to explain through examples; for the array

data.table time subset vs xts time subset

Submitted by 穿精又带淫゛_ on 2019-11-30 21:31:51
Hi, I am looking to subset some minute-level data by time. I normally use xts, doing something like: subset.string <- 'T10:00/T13:00' xts.min.obj[subset.string] to get all the rows that fall between 10am and 1pm (inclusive) EACH DAY, with the output in xts format. But it is a bit slow for my purposes, e.g. j <- xts(rnorm(10e6),Sys.time()-(10e6:1)) system.time(j['T10:00/T16:00']) user system elapsed 5.704 0.577 17.115 I know that data.table is very fast at subsetting large datasets, so I am wondering whether, in conjunction with the fasttime package for fast POSIXct creation, it would be