optimization | 易学教程

MATLAB fast (componentwise) vector operations are…really fast

Question: I have been writing MATLAB scripts for some time and I still do not understand how MATLAB works "under the hood". Consider the following script, which does some computation on (big) vectors in three different ways: (1) MATLAB vector operations; (2) a simple for loop that does the same computation component-wise; (3) an optimized loop that is supposed to be faster than (2), since it avoids some allocation and some assignment. Here is the code: N = 10000000; A = linspace(0,100,N); B = linspace(-100,100,N); C = linspace(0

Recursion, memoization and mutable default arguments in Python

Question: "Base" meaning without just using lru_cache. All of these are "fast enough" -- I'm not looking for the fastest algorithm -- but the timings surprised me, so I was hoping I could learn something about how Python "works".

Simple loop (/tail recursion):

    def fibonacci(n):
        a, b = 0, 1
        if n in (a, b):
            return n
        for _ in range(n - 1):
            a, b = b, a + b
        return b

Simple memoized:

    def fibonacci(n, memo={0:0, 1:1}):
        if len(memo) <= n:
            memo[n] = fibonacci(n - 1) + fibonacci(n - 2)
        return memo[n]

Using a
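
The memoized version in the excerpt relies on Python evaluating a default argument once, when the function is defined, so the same dict is shared by every later call. The following is a minimal sketch of that behaviour; the names `fibonacci_memo` and `cache_size` are illustrative and not from the original question.

```python
def fibonacci_memo(n, memo={0: 0, 1: 1}):
    # The default dict is created once, at definition time, and is then
    # shared by every call that does not pass its own `memo`.
    if len(memo) <= n:
        memo[n] = fibonacci_memo(n - 1) + fibonacci_memo(n - 2)
    return memo[n]


def cache_size():
    # Hypothetical helper: inspect the shared default dict directly.
    return len(fibonacci_memo.__defaults__[0])


size_before = cache_size()      # only the two seed entries
first = fibonacci_memo(30)      # fills the cache for 2..30
size_after = cache_size()       # the entries persist after the call
second = fibonacci_memo(30)     # answered from the cache, no recursion
```

Because the cache survives between calls, repeated timing runs of the memoized function mostly measure a dictionary lookup rather than the recursion itself, which is one plausible reason such timings look surprising.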

optimizing array loop in c

Question: I have looked online and in my books, but I can't seem to get this. I was asked to optimize a small part of a program: specifically, to take an array and sum its contents in a small amount of time, using vi and gcc, without using the built-in optimizer. I have tried loop unrolling and a couple of other optimizations meant for products. Can you please help? int length = ARRAY_SIZE; int limit = length-4; for (j=0; j < limit; j+=5) { sum += array[j] + array[j+1] + array[j+2] + array[j+3] +
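
The index arithmetic of an unroll-by-five loop can be sketched as below. This is written in Python only to show the control flow, not to demonstrate a speed-up. It assumes the truncated expression continues with a fifth term, `array[j+4]`, and it adds a cleanup loop for leftover elements that the excerpt does not show; `unrolled_sum` is a hypothetical name.

```python
def unrolled_sum(values):
    n = len(values)
    limit = n - 4          # last start index with five elements available
    total = 0
    j = 0
    # Main loop: five elements per iteration (assumed fifth term j + 4).
    while j < limit:
        total += (values[j] + values[j + 1] + values[j + 2]
                  + values[j + 3] + values[j + 4])
        j += 5
    # Cleanup loop (an assumption, not in the excerpt): handle the
    # 0 to 4 elements left over when n is not a multiple of five.
    while j < n:
        total += values[j]
        j += 1
    return total
```

Without some form of cleanup loop, an array whose length is not a multiple of five would have its trailing elements silently skipped.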

How can I improve this square root method?

Question: I know this sounds like a homework assignment, but it isn't. Lately I've been interested in the algorithms used to perform certain mathematical operations, such as sine, square root, etc. At the moment, I'm trying to write the Babylonian method of computing square roots in C#. So far, I have this: public static double SquareRoot(double x) { if (x == 0) return 0; double r = x / 2; // this is inefficient, but I can't find a better way // to get a close estimate for the starting value of r double
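
For reference, the Babylonian iteration replaces a guess r with the average of r and x / r until the guess stops changing. Below is a minimal Python sketch of that iteration, not the author's C# code. It keeps the `x / 2` starting guess from the excerpt; the name `babylonian_sqrt`, the relative tolerance, and the iteration cap are assumptions.

```python
def babylonian_sqrt(x, rel_tol=1e-15, max_iter=1000):
    if x < 0:
        raise ValueError("square root of a negative number")
    if x == 0:
        return 0.0
    r = x / 2              # same starting guess as in the excerpt
    if r == 0.0:
        r = x              # guard: x / 2 underflowed for a tiny x
    for _ in range(max_iter):
        nxt = 0.5 * (r + x / r)        # average of r and x / r
        if abs(nxt - r) <= rel_tol * nxt:
            return nxt                 # guess has stopped changing
        r = nxt
    return r               # best estimate if the cap was reached
```

With `x / 2` as the start, very large or very small inputs spend many iterations roughly halving the guess before quadratic convergence sets in, which is why a closer starting estimate matters.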

Will a good C++ compiler optimize a reference away?

Question: I want to write a template function that does something with a std::stack<T> and an instance of T, e.g.: template<class StackType> inline bool some_func( StackType const &s, typename StackType::value_type const &v ) { // ... } The reason I pass v by reference is of course to optimize for the case where StackType::value_type is a struct or class, so that an entire object is not copied by value. However, if StackType::value_type is a "simple" type like int, then it's of course better simply to pass it

Optimisation tips to find in which triangle a point belongs

Question: I'm having some trouble optimising my algorithm. I have a disk (centered at 0, with radius 1) filled with triangles (not necessarily of the same area/length). There could be a HUGE number of triangles (say, from 1k to 300k). My goal is to find as quickly as possible which triangle a point belongs to. The operation has to be repeated a large number of times (around 10k times). For now the algorithm I'm using is: I'm computing the barycentric coordinates of the point in each
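
The barycentric test mentioned in the excerpt can be sketched as follows: a point lies inside a triangle (or on its edge) when all three barycentric coordinates are non-negative. This is a minimal linear-scan sketch, assuming 2D points as `(x, y)` tuples and triangles as tuples of three points; the function names are hypothetical and the code is not the author's implementation.

```python
def barycentric(p, a, b, c):
    """Return the barycentric coordinates (u, v, w) of p in triangle abc."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    u = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    v = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return u, v, 1.0 - u - v


def find_triangle(p, triangles):
    """Index of the first triangle containing p, or None (linear scan)."""
    for index, (a, b, c) in enumerate(triangles):
        u, v, w = barycentric(p, a, b, c)
        if u >= 0 and v >= 0 and w >= 0:
            return index
    return None


# Example: the unit square split into two triangles.
triangles = [((0, 0), (1, 0), (0, 1)), ((1, 0), (1, 1), (0, 1))]
hit = find_triangle((0.75, 0.75), triangles)
miss = find_triangle((2, 2), triangles)
```

A scan like this costs one test per triangle per query, so with up to 300k triangles and about 10k queries it is the step that a spatial index (for example a uniform grid over the disk) would aim to avoid.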

run `perf stat` on the output of `perf record`?

Question: With perf (the Linux profiler, v4.15.18), I can run perf stat $COMMAND to get some simple stats on the command. If I run perf record, it saves lots of data to a perf.data file. Can I run perf stat on the output of perf record, so that I can look at the recorded data but also get a simple overview? Answer 1: perf stat uses the hardware performance monitoring unit in counting mode, while perf record / perf report with a perf.data file use the same unit in overflow mode. In both modes hardware

Optimization techniques used by std::regex_constants::optimize

Question: I am working with std::regex, and whilst reading about the various constants defined in std::regex_constants, I came across std::optimize. From what I have read, it sounds useful in my application (I only need one instance of the regex, initialized at the beginning, but it is used multiple times throughout the loading process). According to the working paper N3126 (pg. 1077), std::regex_constants::optimize : Specifies that the regular expression engine should pay more attention to