micro-optimization | 易学教程

Speed up double loop in Python

Question: Is there a way to speed up a double loop that updates its values from the previous iteration? In code:

    import numpy as np
    from math import sqrt

    def calc(N, m):
        x = 1.0
        y = 2.0
        container = np.zeros((N, 2))
        for i in range(N):
            for j in range(m):
                # each draw depends on the value produced by the previous one
                x = np.random.gamma(3, 1.0 / (y * y + 4))
                y = np.random.normal(1.0 / (x + 1), 1.0 / sqrt(x + 1))
            container[i, 0] = x
            container[i, 1] = y
        return container

    calc(10, 5)

As you can see, the inner loop is updating variables x and y while the outer loop starts with a different value of x each time. I don't think this is
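The loop in the question can be sketched with only the standard library, which makes the dependency explicit: every new x needs the previous y, and every new y needs the new x, so the inner loop cannot be vectorized across its own iterations. The version below is a hypothetical variant, not the asker's NumPy code. It assumes `random.gammavariate` and `random.normalvariate` as stand-ins for the NumPy samplers, adds a `seed` parameter for reproducibility, and binds the sampler methods to local names to avoid repeated attribute lookups. Whether this is actually faster has to be measured.

```python
import random
from math import sqrt


def calc(N, m, seed=None):
    """Run N outer steps of m dependent (x, y) updates each."""
    rng = random.Random(seed)      # assumption: stdlib RNG instead of NumPy
    gamma = rng.gammavariate       # local names avoid attribute lookups in the loop
    normal = rng.normalvariate
    x, y = 1.0, 2.0
    container = []
    for _ in range(N):
        for _ in range(m):
            # gammavariate(shape, scale) and normalvariate(mean, stddev)
            x = gamma(3, 1.0 / (y * y + 4))
            y = normal(1.0 / (x + 1), 1.0 / sqrt(x + 1))
        container.append((x, y))
    return container


result = calc(10, 5, seed=0)
print(len(result), "rows")
```

The chain state (x, y) carries over from one outer iteration to the next, as in the original code.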

PHP null and copy-on-write

Question: Suppose I want to have two variables and have them both equal to null. (More realistically, I am thinking about an array that contains a large number of nulls, but the "two variables" scenario is sufficient for the question.) Obviously, I can do this in more than one way. I can do this (method 1): $a = null; $b = $a; By my understanding, the result of this is that there is one zval that is pointed to by two entries in the symbol table: 'a' and 'b'. But alternatively one might do this

Does calling the constructor of an empty class actually use any memory?

Question: Suppose I have a class like

    #include <iostream>

    class Empty {
    public:  // public so that main() can call the constructor
        Empty(int a) { std::cout << a; }
    };

And then I invoke it using

    int main() {
        Empty(2);
        return 0;
    }

Will this cause any memory to be allocated on the stack for the creation of an "Empty" object? Obviously, the arguments need to be pushed onto the stack, but I don't want to incur any extra overhead. Basically I am using the constructor as a static member. The reason I want to do this is because of templates. The actual code looks like template <int which> class

Does Rust optimize for loops over calculated ranges?

Question: As an exercise I'm trying to micro-optimize code in Rust 1.3.0. I have a loop wrapped around a loop over an array. Something like this:

    loop {
        for i in 0..arr.len() {
            // something happens here
        }
    }

Since arrays are fixed size in Rust, will the compiler optimize the code by evaluating arr.len() just once and reusing the value, or will the expression be evaluated with each pass of the top-level loop? The question can be expanded to more calculation-heavy functions without side-effects, other than arr.len()

Dependency chain analysis

Question: From Agner Fog's "Optimizing Assembly" guide, Section 12.7: a loop example. One of the paragraphs discussing the example code: [...] Analysis for Pentium M: ... 13 uops at 3 per clock = one iteration per 4.33c retirement time. There is a dependency chain in the loop. The latencies are: 2 for memory read, 5 for multiplication, 3 for subtraction, and 3 for memory write, which totals 13 clock cycles. This is three times as much as the retirement time but it is not a loop-carried dependence
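The arithmetic in the quoted analysis can be checked with a short sketch. The numbers are the ones quoted above (13 uops, 3 retired per clock, latencies of 2, 5, 3 and 3 cycles). The variable names are illustrative and not taken from the guide.

```python
from fractions import Fraction

# Figures quoted from the Pentium M analysis
uops_per_iteration = 13
retired_per_clock = 3

# Retirement-limited time per iteration: 13 / 3 clocks
retirement_time = Fraction(uops_per_iteration, retired_per_clock)

# Latencies along the dependency chain, in clock cycles
latencies = {
    "memory read": 2,
    "multiplication": 5,
    "subtraction": 3,
    "memory write": 3,
}
chain_latency = sum(latencies.values())

# How many retirement times fit into one pass through the chain
ratio = chain_latency / retirement_time

print(f"retirement time: {float(retirement_time):.2f} clocks per iteration")
print(f"dependency chain latency: {chain_latency} clocks")
print(f"chain latency / retirement time: {float(ratio):.1f}")
```

Because the chain is not loop-carried according to the quoted text, consecutive iterations can overlap, which is why the 13-cycle latency does not by itself set the per-iteration cost.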

Optimize css vs Google page speed is messing with me

Question: I'm using Google Page Speed and it's telling me my CSS is inefficient...

Very inefficient rules (good to fix on any page):
* table.fancy thead td  Tag key with 2 descendant selectors and Class overly qualified with tag
* table.fancy tfoot td  Tag key with 2 descendant selectors and Class overly qualified with tag

The CSS rules are

    table.fancy {border: 1px solid white; padding:5px}
    table.fancy td {background:#656165}
    table.fancy thead td, table.fancy tfoot td {background:#767276}

I want the

C++: can I reuse / move an std::list element from middle to end?

Question: I'm optimising constant factors of my LRU-cache implementation, where I use std::unordered_map to store ::iterators to std::list, which are guaranteed to remain valid even as nearby elements are added or removed. This results in O(n) runtime, so, I'm going after the constant factors. I understand that each iterator is basically a pointer to the structure that holds my stuff. Currently, to move a given element to the back of the linked list, I call l.erase(it) with the iterator, and then
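The "move an existing element to the end without re-creating it" operation can be illustrated in Python, where `collections.OrderedDict.move_to_end` relinks an entry in place. This is an analogue only, not the asker's C++ code. In C++ the usual candidate for the same job is `std::list::splice`, which relinks a node instead of erasing and re-inserting it. The class name `LRUCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache sketch: least recently used entry sits at the front."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # Relink the existing entry at the end instead of delete + re-insert
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Evict the least recently used entry (the front of the order)
            self._items.popitem(last=False)

    def keys(self):
        return list(self._items)


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # capacity exceeded, so "b" is evicted
print(cache.keys())
```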

PHP interpreter micro-optimizations in code

Question: After stumbling on this SO thread, I decided to write a similar test in PHP. My test code is this:

    // Slow version
    $t1 = microtime(true);
    for ($n = 0, $i = 0; $i < 20000000; $i++) {
        $n += 2 * ($i * $i);
    }
    $t2 = microtime(true);
    echo "n={$n}\n";

    // Optimized version
    $t3 = microtime(true);
    for ($n = 0, $i = 0; $i < 20000000; $i++) {
        $n += $i * $i;
    }
    $n *= 2;
    $t4 = microtime(true);
    echo "n={$n}\n";

    $speedup = round(100 * (($t2 - $t1) - ($t4 - $t3)) / ($t2 - $t1), 0);
    echo "speedup: {$speedup}%\n";
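The same experiment can be sketched in Python to show what the "optimized" variant relies on: the constant factor 2 is hoisted out of the loop, and both variants must produce the same sum. The function names and the reduced iteration count are assumptions made for this sketch. The timings it prints vary by machine and say nothing about PHP.

```python
import time


def slow_version(limit):
    """Multiply by 2 inside the loop on every iteration."""
    n = 0
    for i in range(limit):
        n += 2 * (i * i)
    return n


def optimized_version(limit):
    """Accumulate i*i only, then multiply by 2 once after the loop."""
    n = 0
    for i in range(limit):
        n += i * i
    return n * 2


def timed(func, limit):
    """Return (result, elapsed seconds) for one call of func(limit)."""
    start = time.perf_counter()
    result = func(limit)
    return result, time.perf_counter() - start


LIMIT = 200_000  # assumption: far smaller than the 20,000,000 in the PHP test
n_slow, t_slow = timed(slow_version, LIMIT)
n_fast, t_fast = timed(optimized_version, LIMIT)

# Same speedup formula as in the PHP test, guarded against a zero denominator
speedup = round(100 * (t_slow - t_fast) / t_slow) if t_slow > 0 else 0
print(f"results equal: {n_slow == n_fast}")
print(f"speedup: {speedup}%")
```

A single timed run like this is noisy. Repeating the measurement several times gives a more trustworthy comparison.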

x86-64 Relative jmp performance

Question: I'm currently doing an assignment that measures the performance of various x86-64 commands (AT&T syntax). The command I'm somewhat confused about is the "unconditional jmp" command. This is how I've implemented it:

    .global uncond
    uncond:
        .rept 10000
        jmp . + 2
        .endr
        mov $10000, %rax
        ret

It's fairly simple. The code creates a function called "uncond" which uses the .rept directive to call the jmp command 10000 times, then sets the return value to the number of times you called the jmp command. "."

Try/Catch Functions: False versus Falsy

Question: This is a petty question that will make you cringe, but I'm still curious. Is it slightly more efficient (for if/then logic) to evaluate an explicit false over falsy values such as null or undefined? When I try/catch inside of a function, with the intent of recovering from an unimportant (yet application-halting) error, I typically return undefined when the try portion cannot succeed. However, below I'm returning false instead of undefined: var parseJSON = function(jsonString) { try {