O(log N) == O(1) - Why not?

广开言路 2020-12-22 18:17

Whenever I consider algorithms/data structures, I tend to replace the O(log N) parts with constants. Oh, I know log(N) diverges - but does it matter in real-world applications?

23 Answers
  • 2020-12-22 18:56

    The rules for determining Big-O notation are simpler when you don't decide that O(log n) = O(1).

    As krzysio said, you may accumulate O(log n) factors, and then they make a very noticeable difference. Imagine you do a binary search: O(log n) comparisons, where each comparison itself has complexity O(log n). If you neglect both, you get O(1) instead of O(log^2 n). Similarly, you may somehow arrive at O(log^10 n), and then you'll notice a big difference even for not-too-large values of n.
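
    A minimal sketch of how those logs compound (my own illustration in Python, not from the original answer; the digit-tuple keys are only there to make the comparison cost visible): the search makes O(log n) probes, and each probe pays an O(log n) comparison, so the total work is O(log^2 n), not O(1).

        def compare_digits(a, b):
            """Compare two equal-length digit tuples; cost grows with their length."""
            for x, y in zip(a, b):              # O(number of digits) = O(log n)
                if x != y:
                    return -1 if x < y else 1
            return 0

        def binary_search(keys, target):
            """O(log n) probes, each paying an O(log n) comparison."""
            lo, hi = 0, len(keys)
            while lo < hi:                      # runs O(log n) times
                mid = (lo + hi) // 2
                c = compare_digits(keys[mid], target)
                if c == 0:
                    return mid
                if c < 0:
                    lo = mid + 1
                else:
                    hi = mid
            return -1

        keys = [(0, 0, 1), (0, 2, 5), (1, 4, 7), (3, 0, 9)]   # sorted "multi-digit" keys
        print(binary_search(keys, (1, 4, 7)))                  # -> 2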

  • 2020-12-22 18:57

    Assume that in your entire application, one algorithm accounts for 90% of the time the user waits for the most common operation.

    Suppose that in real time the O(1) operation takes one second on your architecture, and the O(log N) operation is basically 0.5 seconds * log(N). At this point I'd really like to draw you a graph with an arrow at the intersection of the curve and the line, saying, "It matters right here." In such a scenario you want to use the log(N) op for small datasets and the O(1) op for large datasets.
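
    A rough back-of-the-envelope version of that graph (the timings are the made-up numbers from this answer, nothing measured): the two cost curves cross where 0.5 * log2(N) = 1, i.e. around N = 4.

        import math

        def t_constant(n):
            return 1.0                          # O(1): one second regardless of N

        def t_logarithmic(n):
            return 0.5 * math.log2(n)           # O(log N): half a second more per doubling of N

        for n in (2, 4, 8, 1_000, 1_000_000):
            winner = "log(N) op" if t_logarithmic(n) < t_constant(n) else "O(1) op"
            print(f"N={n:>9}:  O(1)={t_constant(n):.2f}s  O(log N)={t_logarithmic(n):.2f}s  ->  {winner}")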

    For operations that are already cheap, Big-O analysis and performance optimization are an academic exercise rather than something that delivers real value to the user - but if it's an expensive operation on a critical path, then you bet it matters!

  • 2020-12-22 18:58

    Equality, the way you're describing it, is a common abuse of notation.

    To clarify: we usually write f(x) = O(log N) to mean "f(x) is O(log N)".

    At any rate, O(1) means a constant number of steps/amount of time (as an upper bound) to perform an action regardless of how large the input set is. For O(log N), the number of steps/amount of time still grows as a function of the input size (its logarithm); it just grows very slowly. For most real-world applications you may be safe in assuming that this number of steps will not exceed 100; however, I'd bet there are plenty of datasets large enough to make your statement both dangerous and void (packet traces, environmental measurements, and many more).
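
    To put a number on "grows very slowly" (my own quick check, assuming base-2 logarithms; the base only changes a constant factor anyway):

        import math

        for n in (10**3, 10**6, 10**9, 10**12, 10**15, 10**18):
            print(f"N = 10^{round(math.log10(n)):<2} -> log2(N) is about {math.log2(n):.0f} steps")

        # Even 10^18 items need only ~60 steps; you would need more than 2^100
        # items before log2(N) passed 100.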

  • 2020-12-22 18:58

    For any algorithm that can take inputs of different sizes N, the number of operations it takes is upper-bounded by some function f(N).

    All big-O tells you is the shape of that function.

    • O(1) means there is some number A such that f(N) < A for large N.

    • O(N) means there is some A such that f(N) < AN for large N.

    • O(N^2) means there is some A such that f(N) < AN^2 for large N.

    • O(log(N)) means there is some A such that f(N) < A log N for large N.

    Big-O says nothing about how big A is (i.e. how fast the algorithm is), or where these functions cross each other. It only says that when you are comparing two algorithms, if their big-Os differ, then there is a value of N (which may be small or it may be very large) where one algorithm will start to outperform the other.
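
    A toy numeric illustration of that last point (the constants A below are invented for the sketch): an O(N) algorithm with a tiny constant beats an O(log N) algorithm with a large constant until N grows past their crossover.

        import math

        def cost_linear(n):                     # f(N) < A*N with A = 0.001
            return 0.001 * n

        def cost_logarithmic(n):                # f(N) < A*log N with A = 5
            return 5 * math.log2(n)

        n = 2
        while cost_linear(n) <= cost_logarithmic(n):
            n *= 2                              # keep doubling until the O(N) algorithm loses
        print(f"The O(N) algorithm stops winning somewhere below N = {n}")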

  • 2020-12-22 18:59

    As many have already said, for the real world, you need to look at the constant factors first, before even worrying about factors of O(log N).

    Then, consider what you will expect N to be. If you have good reason to think that N<10, you can use a linear search instead of a binary one. That's O(N) instead of O(log N), which according to your lights would be significant -- but a linear search that moves found elements to the front may well outperform a more complicated balanced tree, depending on the application.
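
    A sketch of that trade-off (the move-to-front list below is my own minimal version, not code from the answer): for N < 10 the O(N) scan is only a handful of steps, and repeatedly requested elements drift to the front.

        def mtf_search(items, target):
            """Linear O(N) search that moves a found element to the front,
            so frequently requested elements become almost O(1) to find."""
            for i, item in enumerate(items):
                if item == target:
                    items.insert(0, items.pop(i))   # move-to-front heuristic
                    return 0
            return -1

        cache = [3, 7, 1, 9, 4]                     # tiny N: no tree or binary search needed
        mtf_search(cache, 9)
        print(cache)                                # [9, 3, 7, 1, 4] -- 9 is now found immediately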

    On the other hand, note that even if log N is not likely to exceed 50, a performance factor of 10 is really huge -- if you're compute-bound, a factor like that can easily make or break your application. If that's not enough for you, you'll frequently see factors of (log N)^2 or (log N)^3 in algorithms, so even if you think you can ignore one factor of (log N), that doesn't mean you can ignore more of them.

    Finally, note that the simplex algorithm for linear programming has a worst-case performance of O(2^n). However, for practical problems the worst case essentially never comes up; in practice, the simplex algorithm is fast, relatively simple, and consequently very popular.

    About 30 years ago, someone developed a polynomial-time algorithm for linear programming, but it was not initially practical because it was too slow.

    Nowadays there are practical alternative algorithms for linear programming (with polynomial-time worst case, for what that's worth) which can outperform the simplex method in practice. But, depending on the problem, the simplex method is still competitive.
