Question
I can think of three operations in C++ that can be described in some sense as having 'constant' complexity. I've seen some debate(*) over what this means, and it seems to me that we could just say "all these operations are constant, but some are more constant than others" :-)
(Edit 2: If you already think you know the answer, please read some of the debate at this question before rushing in too soon: What data structure, exactly, are deques in C++? Many people, with quite high scores, are arguing over what 'constant' means. My question isn't as obvious as you might think!)
(Edit: I don't need a primer in what 'complexity' means. I want a clear answer, perhaps with quotes from the C++ standard, that tells us exactly what is supposed to be constant. Processor ticks, real-world time, or something else? On other threads, some people have argued that time is totally irrelevant to what is required by the C++ standard.)
Do these complexity guarantees in the standard apply to the runtime of the operation? Or do they simply specify the (maximum) number of copies/moves that might happen of the contained objects? And what exactly does 'amortized' mean?
1. Given a (non-empty) vector<int> v, the following is clearly constant in runtime:
swap(v.front(), v.back());
(although a pedant might point out that it depends on whether the data is in the cache or swapped out or whatever!).
2. Given a list<int> l, doing a push_back is straightforward. Exactly one new item is allocated and a few pointers in the linked list are shuffled. Each push_back involves one allocation, always of the same amount of memory, so this is clearly fairly 'constant'. But, of course, the time to do the allocation could be quite variable. The memory manager might have to take a lot of time to find some suitable free memory. (A rough sketch of what such a push_back does appears just after this list.)
3. But doing a push_back on a vector<int> is even more unpredictable. Most of the time, it will be very quick, but every now and then it will have to reallocate space for all the data and copy every element to a new location. So it's less predictable in terms of runtime than a single list::push_back, but it's still referred to as constant (amortized). On average, each push_back in a long run of insertions takes time that is independent of how much data has already been added, and this is why it's called 'amortized constant' time. (Am I right?)
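To make case 2 above concrete, here is a rough sketch (not any real library's std::list, just an illustration of the idea) showing that a push_back on a doubly-linked list amounts to one fixed-size allocation plus a handful of pointer writes, regardless of how long the list already is:

    struct Node { int value; Node* prev; Node* next; };

    struct TinyList {
        Node* head = nullptr;
        Node* tail = nullptr;

        void push_back(int v) {
            Node* n = new Node{v, tail, nullptr}; // one allocation, same size every time
            if (tail) tail->next = n; else head = n;
            tail = n;                             // a fixed number of pointer updates,
        }                                         // independent of the list's length
    };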
And finally, I asked about int to avoid the complexities of having another type. For example, a vector< vector<int> > might be a little more complicated to reason about because each element of the vector (of vectors) might have a different size, and swapping two elements is not quite as constant as in case 1 above. But ideally, we could answer for all vector<T>, not just T=int.
(*) For an example of the debates, see the comments to these answers: What data structure, exactly, are deques in C++?
Answer 1:
Complexity is always stated relative to a specific variable or set of variables. So when the standard talks about constant time insertion, it is talking about constant time relative to the number of items in the list. That is, O(1) insertions means that the number of items currently in the list does not affect the overall complexity of insertions. The list could have 500 or 50000000 items in it, and the complexity of the insertion operation will be the same.
For example, std::list has O(1) insertions and deletions; the complexity of insertions is not affected by the number of elements in the list. However, the memory allocator's complexity may well depend on the number of things that have already been allocated. But since the O(1) is talking about number of items in the list, it doesn't cover that. Nor is it supposed to, because then we would be measuring the complexity of the memory allocator, not the data structure.
In short: that's a different measurement.
"It implies that we are free to implement our algorithm as badly as we like, including one where the time isn't really constant in any pragmatic sense, but where we have respected the number of 'operations' on the contained objects."
Complexity is not specified relative to implementations. It is specified relative to algorithms. It doesn't matter that a context switch could occur, because running time is not what complexity is for.
As above, you could implement std::list with a memory allocator that is O(log(n)) with respect to deletions (where n is the number of allocations). However, the deletion of an element in the list would still be O(1) with respect to the number of items in the list.
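One way to make "that's a different measurement" concrete is to count allocator calls separately from list operations. Below is a rough sketch using a hypothetical CountingAlloc (a name invented here purely for illustration; the minimal allocator interface it implements is the standard one, and the code assumes C++17 for the inline variable):

    #include <cstddef>
    #include <iostream>
    #include <list>
    #include <new>

    // Shared counter, so rebound instantiations (e.g. for list nodes) all count here.
    inline std::size_t g_alloc_calls = 0;

    template <class T>
    struct CountingAlloc {
        using value_type = T;
        CountingAlloc() = default;
        template <class U> CountingAlloc(const CountingAlloc<U>&) {}

        T* allocate(std::size_t n) {
            ++g_alloc_calls;   // the allocator's own cost/bookkeeping lives here...
            return static_cast<T*>(::operator new(n * sizeof(T)));
        }
        void deallocate(T* p, std::size_t) { ::operator delete(p); }
    };
    template <class T, class U>
    bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
    template <class T, class U>
    bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

    int main() {
        std::list<int, CountingAlloc<int>> l;
        for (int i = 0; i < 1000; ++i)
            l.push_back(i);    // ...but the list's O(1) per insertion counts list
                               // operations, not whatever the allocator does inside
        std::cout << g_alloc_calls << " allocations for 1000 push_backs\n";
    }

Each push_back is one constant-cost list operation however the allocator behaves; whether the allocation behind it is cheap or expensive is the allocator's business, not part of the container's complexity guarantee.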
Do not confuse complexity with overall performance. The purpose of complexity is to have a general metric for algorithms with respect to different variables. The purpose of a programmer, who wants their code to run fast, is to find a reasonable implementation of an algorithm that conforms to the complexity that they need to achieve that performance.
Complexity is a tool for rating the efficiency of an algorithm. Complexity does not mean you get to stop thinking.
And what exactly does 'amortized' mean?
Amortized means this:
If something is "amortized X time", then as you repeat the operation infinitely many times on the same data structure, the complexity converges to X in the limit.
So, std::vector has "amortized constant time" insertions at the rear. If we take the object and perform infinitely many insertions on it, the asymptotic limit of the complexity will be no different from that of a "constant time" insertion.
In layman's terms, it means that the operation may sometimes be non-constant, but the fraction of operations that are non-constant keeps shrinking. Over a long run of insertions, it's constant time.
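A quick way to see this with a real std::vector is to count how often the capacity changes during a long run of push_backs. The exact growth factor, and therefore the exact count, is an implementation detail and not guaranteed by the standard; the point is only that reallocations become rarer and rarer:

    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> v;
        std::size_t reallocations = 0;
        std::size_t last_capacity = v.capacity();

        for (int i = 0; i < 1000000; ++i) {
            v.push_back(i);
            if (v.capacity() != last_capacity) {  // capacity changed: a reallocation
                ++reallocations;                  // (and an O(size) copy) happened
                last_capacity = v.capacity();
            }
        }
        // With geometric growth this prints a few dozen at most, not a million.
        std::cout << reallocations << " reallocations for "
                  << v.size() << " insertions\n";
    }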
Answer 2:
If an operation is said to have "constant complexity", this usually refers to time complexity. It could refer to space complexity, but if so, that will normally be specified explicitly.
Now, complexity of an operation refers to how much the time to perform the operation will increase when the number of items being dealt with in the operation increases. For a constant complexity operation, the function will take the same amount of time whether there are zero items being dealt with or ten million items.
So:
swap() is constant complexity because it doesn't matter how many items are in the vector, the operation will take the same amount of time.
A push_back on the list is constant complexity because even though it may take some variable amount of time to allocate the new item, that allocation time doesn't increase just because the list has 10 million elements (at least not in an algorithmic sense; of course, if free memory becomes more and more depleted, allocation might take more time, but from an algorithmic point of view there's an infinite amount of memory).
push_back() on a vector is said to be 'amortized' constant because in the normal case, where a reallocation doesn't have to occur, the amount of time the operation takes is unrelated to how many elements are in the vector already; the same amount of time is taken to add a new element to a zero-length vector as to a 10-million-element vector. However, if the vector does need to be reallocated, a copy of the existing elements needs to occur, and that is not a constant operation; it's a linear operation. But the vector is supposed to be designed so that those reallocations happen infrequently enough that they can be amortized over the course of many push_back() operations.
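A toy sketch of where that linear work hides (this is not how any real std::vector is written, and the doubling policy is just one assumed growth strategy):

    #include <cstddef>
    #include <memory>
    #include <utility>

    struct TinyIntVector {
        std::unique_ptr<int[]> data;
        std::size_t size = 0;
        std::size_t capacity = 0;

        void push_back(int v) {
            if (size == capacity) {                     // rare branch: grow geometrically
                std::size_t new_cap = capacity ? capacity * 2 : 1;
                std::unique_ptr<int[]> bigger(new int[new_cap]);
                for (std::size_t i = 0; i < size; ++i)  // O(size) copy, but it only runs
                    bigger[i] = data[i];                // about log2(N) times in N calls
                data = std::move(bigger);
                capacity = new_cap;
            }
            data[size++] = v;                           // common branch: O(1)
        }
    };

Because the capacity doubles each time, the expensive branch runs so rarely that its total cost, summed over many push_back() calls, stays proportional to the number of calls.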
Answer 3:
As far as I understand, constant complexity means that the operation is O(1): you can tell in advance how many elementary operations (reads/writes, assembly instructions, whatever) the operation is going to take, and this bound is common to all possible states of the target object. There is a catch here: in a multithreaded environment you cannot predict thread switches, so you can only reason about the elapsed time of an operation on a real-time OS.
As for amortized constant complexity, it is even weaker. Summarizing the answers here, it's a guarantee that on average your operation is constant. This means that the number of elementary operations for N consecutive operations is O(N). Per operation that is roughly O(1), but some rare jumps are allowed. For example, adding an item to the vector's tail is usually constant, but sometimes an additional heavy operation is needed; having amortized constant time here means that the additional operation happens rarely enough, and takes a bounded enough amount of time, that the total time of N operations is still O(N). Of course, the same catch applies here as well.
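The arithmetic behind "the total time of N operations is still O(N)", assuming a doubling growth policy (the actual growth factor is an implementation detail), is a short geometric sum:

    element copies over N push_backs = 1 + 2 + 4 + ... + 2^k   (with 2^k < N)
                                     = 2^(k+1) - 1
                                     < 2N
    total work = N constant-time appends + fewer than 2N copies = O(N)
    amortized work per push_back     = O(N) / N = O(1)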
So, to answer your questions:
- The complexity guarantees of the standard really apply only to the number of operations needed to execute the operation (see the edit below for how the standard itself counts them), and don't imply that the runtime is somehow bounded. (Indeed, until recently C++ didn't even have a language-specified notion of a thread, so from the point of view of the C++ standard of that time, the program was executed on a dedicated C++ machine.)
- Amortized means "bounded by a constant on average", which usually happens when the operation time is almost always constant-bounded, with some quite rare deviations.
Edit:
You can look at, e.g., section 23.1 of the C++ standard:
All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects.
Answer 4:
A complexity of O(1), i.e. constant (time complexity, let's say), implies that the time to complete the algorithm is unrelated to the problem size.
So, looking a value up in a hashed structure is O(1) because the time required to do it is independent of how many values it contains. However, the same is not true of a linked list because we must scan through values (the number of which changes as we increase the number of elements) to find our value.
In case 3, when it copies every element it is not an O(1) operation but an O(N) one (most of the time it is O(1), though, which is why it is usually described as constant). Amortization takes this into account, noting that the algorithm usually completes in O(1) time and only rarely hits the O(N) case.
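For contrast, here is a minimal illustration of the two lookups described above, using std::unordered_set and std::list as concrete stand-ins (note that hash lookups are O(1) on average; a pathological set of keys can degrade them, which is a separate caveat):

    #include <algorithm>
    #include <list>
    #include <unordered_set>

    bool in_hash(const std::unordered_set<int>& s, int x) {
        return s.find(x) != s.end();             // average O(1): cost does not grow
    }                                            // with s.size()

    bool in_list(const std::list<int>& l, int x) {
        return std::find(l.begin(), l.end(), x)  // O(N): scans the elements one by one
               != l.end();
    }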
Answer 5:
Constant complexity just means that the time an operation takes to run is independent of the size of the input. It tells you nothing about runtime cost. Amortised constant complexity means that if you perform a sequence of operations, the average time for each individual operation will be independent of the size of the problem instance.
Answer 6:
One point not yet mentioned with regard to problem complexity is that there's generally an implicit assumption that the machine's word size is large enough to make a solution possible, but no larger. For example, if one is given an array of N numbers in the range 1..N, the machine's word size is assumed to be large enough to handle values in the range 1..N, but there is no guarantee that there will be any "spare" bits available. This is significant because, while the number of bits required to hold the array will of course be O(N lg N), there is a fundamental difference between algorithms which require O(N lg N) total space and those which require precisely O(1) + N*lg(N) [rounded up] space.
Source: https://stackoverflow.com/questions/8631531/what-does-constant-complexity-really-mean-time-count-of-copies-moves