treap
https://www.e-learn.cn/tag/treap
How does a treap help to update this ordered queue?
https://www.e-learn.cn/topic/2730353
<span>How does a treap help to update this ordered queue?</span>
<span><span lang="" about="https://www.e-learn.cn/user/112" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">回眸只為那壹抹淺笑</span></span>
<span>2019-12-21 20:10:16</span>
<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br /><p>I'm having trouble understanding this solution to a problem on HackerRank. Please see the solution code below, apparently by Kimiyuki Onaka.</p>
<p>The problem is: given a list of unique numbers, and <code>m</code> queries of the type, "<code>move the current ith to jth elements (l,r) to the beginning</code>", return the final arrangement of the numbers.</p>
<p>Onaka suggests that a treap data structure (a binary search tree that also keeps a heap order on per-node priorities) can help solve it in <code>O(m log n)</code>. Since I'm not versed in C++, I've tried but failed to conceptualize how a treap could be used. My understanding is that to solve the problem you need <code>log</code>-time access to the current <code>ith to jth</code> elements and a <code>log</code>-time update of the current first element(s) and overall order. But I can't see how to do it.</p>
<p>Ideally, I'd like an explanation in words of how it could be done. Alternatively, just an explanation of what Onaka's code is doing.</p>
<p>Thanks! </p>
<pre class="lang-cpp prettyprint-override"><code>#include <iostream>
#include <tuple>
#include <random>
#include <memory>
#define repeat(i,n) for (int i = 0; (i) < (n); ++(i))
using namespace std;
// Implicit treap: nodes are ordered by position (tracked via subtree sizes), not by value.
template <typename T>
struct treap {
    typedef T value_type;
    typedef double key_type;
    value_type v;            // stored element
    key_type k;              // random heap priority
    shared_ptr<treap> l, r;  // left and right children
    size_t m_size;           // number of nodes in this subtree
    treap(value_type v)
    : v(v)
    , k(generate())
    , l()
    , r()
    , m_size(1) {
    }
    // Recompute the subtree size after a child pointer has changed.
    static shared_ptr<treap> update(shared_ptr<treap> const & t) {
        if (t) {
            t->m_size = 1 + size(t->l) + size(t->r);
        }
        return t;
    }
    // Draw a random priority for a new node.
    static key_type generate() {
        static random_device device;
        static default_random_engine engine(device());
        static uniform_real_distribution<double> dist;
        return dist(engine);
    }
    static size_t size(shared_ptr<treap> const & t) {
        return t ? t->m_size : 0;
    }
    // Concatenate a and b (all of a comes before all of b); the higher priority becomes the root.
    static shared_ptr<treap> merge(shared_ptr<treap> const & a, shared_ptr<treap> const & b) { // destructive
        if (not a) return b;
        if (not b) return a;
        if (a->k > b->k) {
            a->r = merge(a->r, b);
            return update(a);
        } else {
            b->l = merge(a, b->l);
            return update(b);
        }
    }
    static pair<shared_ptr<treap>, shared_ptr<treap> > split(shared_ptr<treap> const & t, size_t i) { // [0, i) [i, n), destructive
        if (not t) return { shared_ptr<treap>(), shared_ptr<treap>() };
        if (i <= size(t->l)) {
            // The cut lies inside the left subtree; t and its right subtree go to the second part.
            shared_ptr<treap> u; tie(u, t->l) = split(t->l, i);
            return { u, update(t) };
        } else {
            // The cut lies inside the right subtree; skip the left subtree and t itself.
            shared_ptr<treap> u; tie(t->r, u) = split(t->r, i - size(t->l) - 1);
            return { update(t), u };
        }
    }
    static shared_ptr<treap> insert(shared_ptr<treap> const & t, size_t i, value_type v) { // destructive
        shared_ptr<treap> l, r; tie(l, r) = split(t, i);
        shared_ptr<treap> u = make_shared<treap>(v);
        return merge(merge(l, u), r);
    }
    static pair<shared_ptr<treap>,shared_ptr<treap> > erase(shared_ptr<treap> const & t, size_t i) { // (t \ t_i, t_i), destructive
        shared_ptr<treap> l, u, r;
        tie(l, r) = split(t, i+1);
        tie(l, u) = split(l, i);
        return { merge(l, r), u };
    }
};
typedef treap<int> T;
int main() {
    int n; cin >> n;
    shared_ptr<T> t;
    repeat (i,n) {
        int a; cin >> a;
        t = T::insert(t, i, a);
    }
    int m; cin >> m;
    while (m --) {
        int l, r; cin >> l >> r;
        -- l;
        // Cut out positions [l, r) as b, then put b in front of a and c.
        shared_ptr<T> a, b, c;
        tie(a, c) = T::split(t, r);
        tie(a, b) = T::split(a, l);
        t = T::merge(T::merge(b, a), c);
    }
    repeat (i,n) {
        if (i) cout << ' ';
        shared_ptr<T> u;
        tie(t, u) = T::erase(t, 0);
        cout << u->v;
    }
    cout << endl;
    return 0;
}
</code></pre>
<br /><h3>Answer 1:</h3><br /><p>Perhaps some pictures of the data structure as it processes the sample input would be helpful.</p>
<p>First, the six numbers "1 2 3 4 5 6" are inserted into the treap. Each one is associated with a randomly generated double, which determines whether it goes above or below other nodes. The treap is always ordered so that every node in a node's left subtree comes before it, and every node in its right subtree comes after.</p>
<p>Then we start moving intervals to the beginning. The treap is split into three parts—one with the first l-1 nodes, one with the nodes in the interval, and the last nodes. Then they are re-merged in a different order.</p>
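<p>To pin down what each query does to the sequence, independent of any tree, here is a minimal reference sketch that uses <code>std::rotate</code> on a plain vector. It costs O(n) per query, so it is not the treap solution, only a way to check expected results. The function name <code>move_to_front</code> is made up for this illustration.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Move the 1-based, inclusive range [l, r] of `seq` to the front.
// O(n) per call: a reference for the operation, not the treap approach.
void move_to_front(std::vector<int>& seq, int l, int r) {
    assert(1 <= l && l <= r && r <= static_cast<int>(seq.size()));
    // std::rotate makes the element at index l-1 the new first element,
    // followed by the rest of [l-1, r), then the old prefix [0, l-1).
    // Elements from index r onward are not touched.
    std::rotate(seq.begin(), seq.begin() + (l - 1), seq.begin() + r);
}
```

<p>Starting from 1 2 3 4 5 6 and applying the queries (4,5), (3,4), (2,3) in turn gives 4 5 1 2 3 6, then 1 2 4 5 3 6, then 2 4 1 5 3 6.</p>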
<p>First, the interval [4,5] is moved:
</p>
<p>Now the treap's order is 4, 5, 1, 2, 3, 6. (The root 4 comes first, because it has no left child; 3 is preceded by its left child 2, which is preceded by its own left child 5; then comes 5's right child 1; then 2, then 3, then 6.) The nodes keep track of the size of each subtree (<code>m_size</code>).</p>
<p>Given [3,4], we first call <code>split(t,4)</code>, which should return a pair: one treap with the first 4 elements, and another one with the rest.</p>
<p>The root node (4) has no left subtree, so the first 4 elements cannot all lie to its left; it recurses with <code>split(t->r, 3)</code> (the index drops by one to account for the root itself).
This node (3) has exactly 3 nodes in its left subtree, so it
calls <code>split(t->l, 3)</code>.
Now we are at node (2), whose left subtree holds only 2 nodes. It calls <code>split(t->r, 0)</code>,
but it does not have a right child, so this returns a pair of null pointers.
Thus from node (2) we return the unchanged subtree rooted at (2), and a nullptr.
Propagating up, node (3) sets its left child to null, and returns
the subtree rooted at (2), and the subtree at (3) itself (which now has just two elements, (3) and (6)).
Finally, at node (4) we set the right child to (2), and return the tree at (4) (which now has four elements, as required) and the two-element tree rooted at (3).</p>
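<p>The trace above can be replayed in code. The following is a sketch only: it uses a hypothetical raw-pointer <code>Node</code> without priorities (split never looks at them), and <code>build_example</code> hard-codes the tree shape described in this answer rather than producing it from random priorities.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal node: value, children, and subtree size.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    std::size_t size = 1;
    explicit Node(int v) : value(v) {}
};

std::size_t size_of(const Node* t) { return t ? t->size : 0; }

Node* update(Node* t) {
    if (t) t->size = 1 + size_of(t->left) + size_of(t->right);
    return t;
}

// Split t by in-order position into (first i nodes, remaining nodes).
std::pair<Node*, Node*> split(Node* t, std::size_t i) {
    if (!t) return {nullptr, nullptr};
    assert(i <= t->size);
    if (i <= size_of(t->left)) {
        auto parts = split(t->left, i);
        t->left = parts.second;            // t keeps the tail of its left subtree
        return {parts.first, update(t)};
    }
    auto parts = split(t->right, i - size_of(t->left) - 1);
    t->right = parts.first;                // t keeps the head of its right subtree
    return {update(t), parts.second};
}

void in_order(const Node* t, std::vector<int>& out) {
    if (!t) return;
    in_order(t->left, out);
    out.push_back(t->value);
    in_order(t->right, out);
}

// Tree from the walkthrough: 4 -> right 3; 3 -> left 2, right 6;
// 2 -> left 5; 5 -> right 1. Nodes live in `pool`, so nothing leaks.
Node* build_example(std::vector<Node>& pool) {
    pool = {Node(4), Node(3), Node(2), Node(6), Node(5), Node(1)};
    Node* n4 = &pool[0]; Node* n3 = &pool[1]; Node* n2 = &pool[2];
    Node* n6 = &pool[3]; Node* n5 = &pool[4]; Node* n1 = &pool[5];
    n5->right = n1; update(n5);
    n2->left = n5; update(n2);
    n3->left = n2; n3->right = n6; update(n3);
    n4->right = n3; update(n4);
    return n4;
}
```

<p>Calling <code>split(build_example(pool), 4)</code> yields one tree rooted at (4) whose in-order sequence is 4, 5, 1, 2 and a second tree rooted at (3) whose sequence is 3, 6, matching the description.</p>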
<p>Next a call is made to <code>split(a,2)</code>, where <code>a</code> is the first, four-element, tree from the last call.</p>
<p>Again, the root (4) has no left child, so we recurse with <code>split(t->r, 1)</code>.</p>
<p>The node (2) has a left subtree with size 2, so it calls <code>split(t->l, 1)</code>.</p>
<p>The node (5) has no left child, so it calls <code>split(t->r, 0)</code>.</p>
<p>At the leaf (1), <code>0 <= size(t->l)</code> is trivially true (both sides are 0): it gets a pair of null pointers from <code>split(t->l, 0)</code> and returns a pair(null, (1)).</p>
<p>Up at (5), we set the right child to null, and return a pair((5), (1)).</p>
<p>Up at (2), we set the left child to (1), and return a pair((5), (2)->(1)).</p>
<p>Finally, at (4), we set the right child to (5), and return a pair((4)->(5), (2)->(1)).</p>
<p>Last, the interval [2,3] (consisting of the elements 2 and 4) is moved:
</p>
<p>Finally the nodes are popped in order, yielding 2, 4, 1, 5, 3, 6.</p>
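<p>Putting split and merge together, the whole procedure can be sketched as below. This is an illustrative rewrite under stated assumptions, not Onaka's code: it uses raw pointers into a pre-reserved vector instead of <code>shared_ptr</code>, integer priorities from <code>std::mt19937</code> with an arbitrary fixed seed, and the hypothetical names <code>Item</code>, <code>pull</code>, and <code>run_queries</code>. The final order does not depend on the priorities; they only affect the tree shape.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical node type for this sketch.
struct Item {
    int value;
    unsigned priority;       // random heap key
    Item* left = nullptr;
    Item* right = nullptr;
    std::size_t size = 1;    // nodes in this subtree
    Item(int v, unsigned p) : value(v), priority(p) {}
};

std::size_t subtree_size(const Item* t) { return t ? t->size : 0; }

Item* pull(Item* t) {
    t->size = 1 + subtree_size(t->left) + subtree_size(t->right);
    return t;
}

// Concatenate sequences a and b; the higher priority stays on top.
Item* merge(Item* a, Item* b) {
    if (!a) return b;
    if (!b) return a;
    if (a->priority > b->priority) {
        a->right = merge(a->right, b);
        return pull(a);
    }
    b->left = merge(a, b->left);
    return pull(b);
}

// Split into positions [0, i) and [i, n).
std::pair<Item*, Item*> split(Item* t, std::size_t i) {
    if (!t) return {nullptr, nullptr};
    if (i <= subtree_size(t->left)) {
        auto parts = split(t->left, i);
        t->left = parts.second;
        return {parts.first, pull(t)};
    }
    auto parts = split(t->right, i - subtree_size(t->left) - 1);
    t->right = parts.first;
    return {pull(t), parts.second};
}

void collect(const Item* t, std::vector<int>& out) {
    if (!t) return;
    collect(t->left, out);
    out.push_back(t->value);
    collect(t->right, out);
}

// Apply 1-based inclusive (l, r) "move to front" queries; return the final order.
std::vector<int> run_queries(const std::vector<int>& values,
                             const std::vector<std::pair<int, int>>& queries) {
    std::mt19937 rng(2024);       // fixed seed: arbitrary, for repeatability
    std::vector<Item> pool;
    pool.reserve(values.size());  // no reallocation, so node pointers stay valid
    Item* root = nullptr;
    for (int v : values) {
        pool.emplace_back(v, static_cast<unsigned>(rng()));
        root = merge(root, &pool.back());  // append at the end
    }
    for (const auto& q : queries) {
        assert(1 <= q.first && q.first <= q.second &&
               static_cast<std::size_t>(q.second) <= values.size());
        auto outer = split(root, static_cast<std::size_t>(q.second));            // [0, r) | [r, n)
        auto inner = split(outer.first, static_cast<std::size_t>(q.first - 1));  // [0, l-1) | [l-1, r)
        root = merge(merge(inner.second, inner.first), outer.second);
    }
    std::vector<int> out;
    collect(root, out);
    return out;
}
```

<p>Each query performs two splits and two merges, each of expected O(log n) depth, which is where the overall <code>O(m log n)</code> bound comes from.</p>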
<p>Perhaps you'd like to see the tree states given different input. I put a copy of the treap code, "instrumented" to produce the pictures, on GitHub. When run, it produces a file trees.tex; then running <code>pdflatex trees</code> produces pictures like those above.
(Or if you like, I'd be happy to make pictures for different input: that would be easier than installing a whole TeX distribution if you don't have it.)</p>
<br /><br /><p>Source: <code>https://stackoverflow.com/questions/37685498/how-does-a-treap-help-to-update-this-ordered-queue</code></p></div>
<div class="field field--name-field-tags field--type-entity-reference field--label-above">
<div class="field--label">Tags</div>
<div class="field--items">
<div class="field--item"><a href="https://www.e-learn.cn/tag/c-0" hreflang="zh-hans">c++</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/algorithm" hreflang="zh-hans">algorithm</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/treap" hreflang="zh-hans">treap</a></div>
</div>
</div>
Sat, 21 Dec 2019 12:10:16 +0000 回眸只為那壹抹淺笑 2730353 at https://www.e-learn.cn
Why is insertion into my tree faster on sorted input than random input?
https://www.e-learn.cn/topic/1542011
<span>Why is insertion into my tree faster on sorted input than random input?</span>
<span><span lang="" about="https://www.e-learn.cn/user/239" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">旧城冷巷雨未停</span></span>
<span>2019-12-04 08:11:15</span>
<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br /><p>I've always heard binary search trees are faster to build from randomly selected data than ordered data, simply because ordered data requires explicit rebalancing to keep the tree height at a minimum.</p>
<p>Recently I implemented an immutable treap, a special kind of binary search tree which uses randomization to keep itself relatively balanced. In contrast to what I expected, I found I can consistently build a treap about 2x faster and generally better balanced from ordered data than unordered data -- and I have no idea why.</p>
<p>Here's my treap implementation:</p>
<ul><li>http://pastebin.com/VAfSJRwZ</li>
</ul><p>And here's a test program:</p>
<pre><code>using System;
using System.Collections.Generic;
using System.Linq;
using System.Diagnostics;
namespace ConsoleApplication1
{
    class Program
    {
        static Random rnd = new Random();
        const int ITERATION_COUNT = 20;
        static void Main(string[] args)
        {
            List<double> rndTimes = new List<double>();
            List<double> orderedTimes = new List<double>();
            rndTimes.Add(TimeIt(50, RandomInsert));
            rndTimes.Add(TimeIt(100, RandomInsert));
            rndTimes.Add(TimeIt(200, RandomInsert));
            rndTimes.Add(TimeIt(400, RandomInsert));
            rndTimes.Add(TimeIt(800, RandomInsert));
            rndTimes.Add(TimeIt(1000, RandomInsert));
            rndTimes.Add(TimeIt(2000, RandomInsert));
            rndTimes.Add(TimeIt(4000, RandomInsert));
            rndTimes.Add(TimeIt(8000, RandomInsert));
            rndTimes.Add(TimeIt(16000, RandomInsert));
            rndTimes.Add(TimeIt(32000, RandomInsert));
            rndTimes.Add(TimeIt(64000, RandomInsert));
            rndTimes.Add(TimeIt(128000, RandomInsert));
            string rndTimesAsString = string.Join("\n", rndTimes.Select(x => x.ToString()).ToArray());
            orderedTimes.Add(TimeIt(50, OrderedInsert));
            orderedTimes.Add(TimeIt(100, OrderedInsert));
            orderedTimes.Add(TimeIt(200, OrderedInsert));
            orderedTimes.Add(TimeIt(400, OrderedInsert));
            orderedTimes.Add(TimeIt(800, OrderedInsert));
            orderedTimes.Add(TimeIt(1000, OrderedInsert));
            orderedTimes.Add(TimeIt(2000, OrderedInsert));
            orderedTimes.Add(TimeIt(4000, OrderedInsert));
            orderedTimes.Add(TimeIt(8000, OrderedInsert));
            orderedTimes.Add(TimeIt(16000, OrderedInsert));
            orderedTimes.Add(TimeIt(32000, OrderedInsert));
            orderedTimes.Add(TimeIt(64000, OrderedInsert));
            orderedTimes.Add(TimeIt(128000, OrderedInsert));
            string orderedTimesAsString = string.Join("\n", orderedTimes.Select(x => x.ToString()).ToArray());
            Console.WriteLine("Done");
        }
        static double TimeIt(int insertCount, Action<int> f)
        {
            Console.WriteLine("TimeIt({0}, {1})", insertCount, f.Method.Name);
            List<double> times = new List<double>();
            for (int i = 0; i < ITERATION_COUNT; i++)
            {
                Stopwatch sw = Stopwatch.StartNew();
                f(insertCount);
                sw.Stop();
                times.Add(sw.Elapsed.TotalMilliseconds);
            }
            return times.Average();
        }
        static void RandomInsert(int insertCount)
        {
            Treap<double> tree = new Treap<double>((x, y) => x.CompareTo(y));
            for (int i = 0; i < insertCount; i++)
            {
                tree = tree.Insert(rnd.NextDouble());
            }
        }
        static void OrderedInsert(int insertCount)
        {
            Treap<double> tree = new Treap<double>((x, y) => x.CompareTo(y));
            for (int i = 0; i < insertCount; i++)
            {
                tree = tree.Insert(i + rnd.NextDouble());
            }
        }
    }
}
</code></pre>
<p>And here's a chart comparing random and ordered insertion times in milliseconds:</p>
<pre><code>Insertions  Random       Ordered      RandomTime / OrderedTime
50          1.031665     0.261585     3.94
100         0.544345     1.377155     0.4
200         1.268320     0.734570     1.73
400         2.765555     1.639150     1.69
800         6.089700     3.558350     1.71
1000        7.855150     4.704190     1.67
2000        17.852000    12.554065    1.42
4000        40.157340    22.474445    1.79
8000        88.375430    48.364265    1.83
16000       197.524000   109.082200   1.81
32000       459.277050   238.154405   1.93
64000       1055.508875  512.020310   2.06
128000      2481.694230  1107.980425  2.24
</code></pre>
<p>I don't see anything in the code which makes ordered input asymptotically faster than unordered input, so I'm at a loss to explain the difference.</p>
<p><strong>Why is it so much faster to build a treap from ordered input than random input?</strong></p>
<br /><h3>Answer 1:</h3><br /><p>Self-balancing trees exist to <em>fix</em> the problems associated with non-randomly-distributed data. By definition, they trade away a bit of the best-case performance to vastly improve the worst-case performance associated with non-balanced BSTs, specifically that of sorted input.</p>
<p>You're actually overthinking this problem, because slower insertion of random data vs. ordered data is a characteristic of <em>any</em> balanced tree. Try it on an AVL and you'll see the same results.</p>
<p>Cameron had the right idea, removing the priority check to force the worst case. If you do that and instrument your tree so you can see how many rebalances are happening for each insert, it actually becomes very obvious what's going on. When inserting sorted data, the tree always rotates left and the root's right child is always empty. Insertion always results in exactly one rebalance because the insertion node has no children and no recursion occurs. On the other hand, when you run it on the random data, almost immediately you start to see multiple rebalances happening on every insert, as many as 5 or 6 of them in the smallest case (50 inserts), because it's happening on subtrees as well.</p>
<p>With priority checking turned back on, not only are rebalances typically <em>less expensive</em> because more nodes are pushed into the left subtree (which they never leave, since no insertions happen there), but they are also <em>less likely</em> to occur. Why? Because in the treap, high-priority nodes float to the top, and the constant left-rotations (not accompanied by right-rotations) start to push all the high-priority nodes into the left subtree as well. The result is that rebalances happen less frequently due to the uneven distribution of probability.</p>
<p>If you instrument the rebalancing code you'll see that this is true; for both the sorted and random input, you end up with almost identical numbers of left-rotations, but the random input also gives the same number of right-rotations, which makes for twice as many in all. This shouldn't be surprising: randomly distributed input should produce a roughly symmetric distribution of rotations. You'll also see that there are only about 60-70% as many top-level rebalances for the sorted input, which perhaps <em>is</em> surprising, and again, that's due to the sorted input messing with the natural distribution of priorities.</p>
<p>You can also verify this by inspecting the full tree at the end of an insertion loop. With the random input, priorities tend to decrease fairly linearly by level; with the sorted input, priorities tend to stay very high until you get to one or two levels from the bottom.</p>
<p>Hopefully I've done a decent job explaining this... let me know if any of it is too vague.</p>
<br /><br /><br /><h3>Answer 2:</h3><br /><p>I ran your code, and I think it has to do with the number of rotations. During ordered input, the number of rotations is optimal, and the tree never has to rotate back.</p>
<p>During random input the tree will have to perform more rotations, because it may have to rotate back and forth.</p>
<p>To really find out, I would have to add counters for the numbers of left and right rotations for each run. You can probably do this yourself. </p>
<p>UPDATE:</p>
<p>I put breakpoints on rotateleft and rotateright. During ordered input rotateright is never used. During random input, both are hit, and it seems to me that they are used more frequently.</p>
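<p>The same experiment can be approximated with rotation counters instead of breakpoints. The sketch below is a mutable, rotation-based treap in C++, so it is an assumption-laden stand-in for the immutable C# implementation in the question; <code>CountingTreap</code>, <code>RotationCounts</code>, and <code>count_rotations</code> are names invented here. It follows the usual convention that a left rotation lifts a right child.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <random>
#include <utility>
#include <vector>

struct RotationCounts {
    std::size_t left = 0;
    std::size_t right = 0;
};

// Hypothetical max-heap treap that counts its rotations.
class CountingTreap {
public:
    void insert(double key, unsigned priority) {
        root_ = insert_at(std::move(root_), key, priority);
    }
    const RotationCounts& rotations() const { return counts_; }

private:
    struct Node {
        double key;
        unsigned priority;
        std::unique_ptr<Node> left, right;
        Node(double k, unsigned p) : key(k), priority(p) {}
    };
    using Ptr = std::unique_ptr<Node>;

    // The right child becomes the new root of this subtree.
    Ptr rotate_left(Ptr t) {
        assert(t && t->right);
        Ptr r = std::move(t->right);
        t->right = std::move(r->left);
        r->left = std::move(t);
        ++counts_.left;
        return r;
    }
    // The left child becomes the new root of this subtree.
    Ptr rotate_right(Ptr t) {
        assert(t && t->left);
        Ptr l = std::move(t->left);
        t->left = std::move(l->right);
        l->right = std::move(t);
        ++counts_.right;
        return l;
    }
    // Standard BST insert, then rotate the new node up while it outranks its parent.
    Ptr insert_at(Ptr t, double key, unsigned priority) {
        if (!t) return Ptr(new Node(key, priority));
        if (key < t->key) {
            t->left = insert_at(std::move(t->left), key, priority);
            if (t->left->priority > t->priority) t = rotate_right(std::move(t));
        } else {
            t->right = insert_at(std::move(t->right), key, priority);
            if (t->right->priority > t->priority) t = rotate_left(std::move(t));
        }
        return t;
    }

    Ptr root_;
    RotationCounts counts_;
};

// Insert `keys` in the given order with priorities drawn from a seeded generator.
RotationCounts count_rotations(const std::vector<double>& keys, unsigned seed) {
    std::mt19937 rng(seed);
    CountingTreap treap;
    for (double key : keys) treap.insert(key, static_cast<unsigned>(rng()));
    return treap.rotations();
}
```

<p>With ascending keys every new node is attached as a right child, so only <code>rotate_left</code> can ever fire and the right-rotation count stays at zero; with keys in random order both counters grow.</p>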
<p>UPDATE 2:</p>
<p>I added some output to the 50 item ordered run (substituting with integers for clarity), to learn more:</p>
<pre><code>TimeIt(50, OrderedInsert)
LastValue = 0, Top.Value = 0, Right.Count = 0, Left.Count = 0
RotateLeft @value=0
LastValue = 1, Top.Value = 1, Right.Count = 0, Left.Count = 1
LastValue = 2, Top.Value = 1, Right.Count = 1, Left.Count = 1
LastValue = 3, Top.Value = 1, Right.Count = 2, Left.Count = 1
RotateLeft @value=3
RotateLeft @value=2
RotateLeft @value=1
LastValue = 4, Top.Value = 4, Right.Count = 0, Left.Count = 4
LastValue = 5, Top.Value = 4, Right.Count = 1, Left.Count = 4
LastValue = 6, Top.Value = 4, Right.Count = 2, Left.Count = 4
RotateLeft @value=6
LastValue = 7, Top.Value = 4, Right.Count = 3, Left.Count = 4
LastValue = 8, Top.Value = 4, Right.Count = 4, Left.Count = 4
RotateLeft @value=8
RotateLeft @value=7
LastValue = 9, Top.Value = 4, Right.Count = 5, Left.Count = 4
LastValue = 10, Top.Value = 4, Right.Count = 6, Left.Count = 4
RotateLeft @value=10
RotateLeft @value=9
RotateLeft @value=5
RotateLeft @value=4
LastValue = 11, Top.Value = 11, Right.Count = 0, Left.Count = 11
LastValue = 12, Top.Value = 11, Right.Count = 1, Left.Count = 11
RotateLeft @value=12
LastValue = 13, Top.Value = 11, Right.Count = 2, Left.Count = 11
RotateLeft @value=13
LastValue = 14, Top.Value = 11, Right.Count = 3, Left.Count = 11
LastValue = 15, Top.Value = 11, Right.Count = 4, Left.Count = 11
RotateLeft @value=15
RotateLeft @value=14
LastValue = 16, Top.Value = 11, Right.Count = 5, Left.Count = 11
LastValue = 17, Top.Value = 11, Right.Count = 6, Left.Count = 11
RotateLeft @value=17
LastValue = 18, Top.Value = 11, Right.Count = 7, Left.Count = 11
LastValue = 19, Top.Value = 11, Right.Count = 8, Left.Count = 11
RotateLeft @value=19
LastValue = 20, Top.Value = 11, Right.Count = 9, Left.Count = 11
LastValue = 21, Top.Value = 11, Right.Count = 10, Left.Count = 11
RotateLeft @value=21
LastValue = 22, Top.Value = 11, Right.Count = 11, Left.Count = 11
RotateLeft @value=22
RotateLeft @value=20
RotateLeft @value=18
LastValue = 23, Top.Value = 11, Right.Count = 12, Left.Count = 11
LastValue = 24, Top.Value = 11, Right.Count = 13, Left.Count = 11
LastValue = 25, Top.Value = 11, Right.Count = 14, Left.Count = 11
RotateLeft @value=25
RotateLeft @value=24
LastValue = 26, Top.Value = 11, Right.Count = 15, Left.Count = 11
LastValue = 27, Top.Value = 11, Right.Count = 16, Left.Count = 11
RotateLeft @value=27
LastValue = 28, Top.Value = 11, Right.Count = 17, Left.Count = 11
RotateLeft @value=28
RotateLeft @value=26
RotateLeft @value=23
RotateLeft @value=16
RotateLeft @value=11
LastValue = 29, Top.Value = 29, Right.Count = 0, Left.Count = 29
LastValue = 30, Top.Value = 29, Right.Count = 1, Left.Count = 29
LastValue = 31, Top.Value = 29, Right.Count = 2, Left.Count = 29
LastValue = 32, Top.Value = 29, Right.Count = 3, Left.Count = 29
RotateLeft @value=32
RotateLeft @value=31
LastValue = 33, Top.Value = 29, Right.Count = 4, Left.Count = 29
RotateLeft @value=33
RotateLeft @value=30
LastValue = 34, Top.Value = 29, Right.Count = 5, Left.Count = 29
RotateLeft @value=34
LastValue = 35, Top.Value = 29, Right.Count = 6, Left.Count = 29
LastValue = 36, Top.Value = 29, Right.Count = 7, Left.Count = 29
LastValue = 37, Top.Value = 29, Right.Count = 8, Left.Count = 29
RotateLeft @value=37
LastValue = 38, Top.Value = 29, Right.Count = 9, Left.Count = 29
LastValue = 39, Top.Value = 29, Right.Count = 10, Left.Count = 29
RotateLeft @value=39
LastValue = 40, Top.Value = 29, Right.Count = 11, Left.Count = 29
RotateLeft @value=40
RotateLeft @value=38
RotateLeft @value=36
LastValue = 41, Top.Value = 29, Right.Count = 12, Left.Count = 29
LastValue = 42, Top.Value = 29, Right.Count = 13, Left.Count = 29
RotateLeft @value=42
LastValue = 43, Top.Value = 29, Right.Count = 14, Left.Count = 29
LastValue = 44, Top.Value = 29, Right.Count = 15, Left.Count = 29
RotateLeft @value=44
LastValue = 45, Top.Value = 29, Right.Count = 16, Left.Count = 29
LastValue = 46, Top.Value = 29, Right.Count = 17, Left.Count = 29
RotateLeft @value=46
RotateLeft @value=45
LastValue = 47, Top.Value = 29, Right.Count = 18, Left.Count = 29
LastValue = 48, Top.Value = 29, Right.Count = 19, Left.Count = 29
LastValue = 49, Top.Value = 29, Right.Count = 20, Left.Count = 29
</code></pre>
<p>The ordered items always get added to the right side of the tree, naturally. When the right side gets bigger than the left, a rotateleft happens. Rotateright never happens. A new top node is selected roughly every time the tree doubles. The randomness of the priority value jitters it a little, so it goes 0, 1, 4, 11, 29 in this run.</p>
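<p>The "roughly every time the tree doubles" observation is consistent with a short calculation. This assumes (the treap source is not shown here) that priorities are independent draws from a continuous distribution and that the root is always the node with the extreme priority, so the i-th inserted node becomes the root exactly when its priority beats all earlier ones:</p>

```latex
\Pr[\text{the } i\text{-th inserted node becomes the root}] = \frac{1}{i},
\qquad
\mathbb{E}[\text{number of distinct roots after } n \text{ inserts}]
  = \sum_{i=1}^{n} \frac{1}{i} = H_n \approx \ln n + 0.577 .
```

<p>For n = 50 this gives H<sub>50</sub> ≈ 4.5, close to the five roots (0, 1, 4, 11, 29) seen in the run above. Under these assumptions the count does not depend on the key order.</p>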
<p>A random run reveals something interesting:</p>
<pre><code>TimeIt(50, RandomInsert)
LastValue = 0,748661640914465, Top.Value = 0,748661640914465, Right.Count = 0, Left.Count = 0
LastValue = 0,669427539533669, Top.Value = 0,748661640914465, Right.Count = 0, Left.Count = 1
RotateRight @value=0,669427539533669
LastValue = 0,318363281115127, Top.Value = 0,748661640914465, Right.Count = 0, Left.Count = 2
RotateRight @value=0,669427539533669
LastValue = 0,33133987678743, Top.Value = 0,748661640914465, Right.Count = 0, Left.Count = 3
RotateLeft @value=0,748661640914465
LastValue = 0,955126694382693, Top.Value = 0,955126694382693, Right.Count = 0, Left.Count = 4
RotateRight @value=0,669427539533669
RotateLeft @value=0,33133987678743
RotateLeft @value=0,318363281115127
RotateRight @value=0,748661640914465
RotateRight @value=0,955126694382693
LastValue = 0,641024029180884, Top.Value = 0,641024029180884, Right.Count = 3, Left.Count = 2
LastValue = 0,20709771951991, Top.Value = 0,641024029180884, Right.Count = 3, Left.Count = 3
LastValue = 0,830862050331599, Top.Value = 0,641024029180884, Right.Count = 4, Left.Count = 3
RotateRight @value=0,20709771951991
RotateRight @value=0,318363281115127
LastValue = 0,203250563798123, Top.Value = 0,641024029180884, Right.Count = 4, Left.Count = 4
RotateLeft @value=0,669427539533669
RotateRight @value=0,748661640914465
RotateRight @value=0,955126694382693
LastValue = 0,701743399585478, Top.Value = 0,641024029180884, Right.Count = 5, Left.Count = 4
RotateLeft @value=0,669427539533669
RotateRight @value=0,701743399585478
RotateLeft @value=0,641024029180884
LastValue = 0,675667521858433, Top.Value = 0,675667521858433, Right.Count = 4, Left.Count = 6
RotateLeft @value=0,33133987678743
RotateLeft @value=0,318363281115127
RotateLeft @value=0,203250563798123
LastValue = 0,531275219531392, Top.Value = 0,675667521858433, Right.Count = 4, Left.Count = 7
RotateRight @value=0,748661640914465
RotateRight @value=0,955126694382693
RotateLeft @value=0,701743399585478
LastValue = 0,704049674190604, Top.Value = 0,675667521858433, Right.Count = 5, Left.Count = 7
RotateRight @value=0,203250563798123
RotateRight @value=0,531275219531392
RotateRight @value=0,641024029180884
RotateRight @value=0,675667521858433
LastValue = 0,161392807104342, Top.Value = 0,161392807104342, Right.Count = 13, Left.Count = 0
RotateRight @value=0,203250563798123
RotateRight @value=0,531275219531392
RotateRight @value=0,641024029180884
RotateRight @value=0,675667521858433
RotateLeft @value=0,161392807104342
LastValue = 0,167598206162266, Top.Value = 0,167598206162266, Right.Count = 13, Left.Count = 1
LastValue = 0,154996359793002, Top.Value = 0,167598206162266, Right.Count = 13, Left.Count = 2
RotateLeft @value=0,33133987678743
LastValue = 0,431767346538495, Top.Value = 0,167598206162266, Right.Count = 14, Left.Count = 2
RotateRight @value=0,203250563798123
RotateRight @value=0,531275219531392
RotateRight @value=0,641024029180884
RotateRight @value=0,675667521858433
RotateLeft @value=0,167598206162266
LastValue = 0,173774613614089, Top.Value = 0,173774613614089, Right.Count = 14, Left.Count = 3
RotateRight @value=0,830862050331599
LastValue = 0,76559642412029, Top.Value = 0,173774613614089, Right.Count = 15, Left.Count = 3
RotateRight @value=0,76559642412029
RotateLeft @value=0,748661640914465
RotateRight @value=0,955126694382693
RotateLeft @value=0,704049674190604
RotateLeft @value=0,675667521858433
LastValue = 0,75742144871383, Top.Value = 0,173774613614089, Right.Count = 16, Left.Count = 3
LastValue = 0,346844367844446, Top.Value = 0,173774613614089, Right.Count = 17, Left.Count = 3
RotateRight @value=0,830862050331599
LastValue = 0,787565814232251, Top.Value = 0,173774613614089, Right.Count = 18, Left.Count = 3
LastValue = 0,734950566540915, Top.Value = 0,173774613614089, Right.Count = 19, Left.Count = 3
RotateLeft @value=0,20709771951991
RotateRight @value=0,318363281115127
RotateLeft @value=0,203250563798123
RotateRight @value=0,531275219531392
RotateRight @value=0,641024029180884
RotateRight @value=0,675667521858433
RotateRight @value=0,75742144871383
RotateLeft @value=0,173774613614089
LastValue = 0,236504829598826, Top.Value = 0,236504829598826, Right.Count = 17, Left.Count = 6
RotateLeft @value=0,830862050331599
RotateLeft @value=0,787565814232251
RotateLeft @value=0,76559642412029
RotateRight @value=0,955126694382693
LastValue = 0,895606500048007, Top.Value = 0,236504829598826, Right.Count = 18, Left.Count = 6
LastValue = 0,599106418713511, Top.Value = 0,236504829598826, Right.Count = 19, Left.Count = 6
LastValue = 0,8182332901369, Top.Value = 0,236504829598826, Right.Count = 20, Left.Count = 6
RotateRight @value=0,734950566540915
LastValue = 0,704216948572647, Top.Value = 0,236504829598826, Right.Count = 21, Left.Count = 6
RotateLeft @value=0,346844367844446
RotateLeft @value=0,33133987678743
RotateRight @value=0,431767346538495
RotateLeft @value=0,318363281115127
RotateRight @value=0,531275219531392
RotateRight @value=0,641024029180884
RotateRight @value=0,675667521858433
RotateRight @value=0,75742144871383
LastValue = 0,379157059536854, Top.Value = 0,236504829598826, Right.Count = 22, Left.Count = 6
RotateLeft @value=0,431767346538495
LastValue = 0,46832062046431, Top.Value = 0,236504829598826, Right.Count = 23, Left.Count = 6
RotateRight @value=0,154996359793002
LastValue = 0,0999000217299443, Top.Value = 0,236504829598826, Right.Count = 23, Left.Count = 7
RotateLeft @value=0,20709771951991
LastValue = 0,229543754006524, Top.Value = 0,236504829598826, Right.Count = 23, Left.Count = 8
RotateRight @value=0,8182332901369
LastValue = 0,80358425984326, Top.Value = 0,236504829598826, Right.Count = 24, Left.Count = 8
RotateRight @value=0,318363281115127
LastValue = 0,259324726769386, Top.Value = 0,236504829598826, Right.Count = 25, Left.Count = 8
RotateRight @value=0,318363281115127
LastValue = 0,307835293145774, Top.Value = 0,236504829598826, Right.Count = 26, Left.Count = 8
RotateLeft @value=0,431767346538495
LastValue = 0,453910283024381, Top.Value = 0,236504829598826, Right.Count = 27, Left.Count = 8
RotateLeft @value=0,830862050331599
LastValue = 0,868997387527021, Top.Value = 0,236504829598826, Right.Count = 28, Left.Count = 8
RotateLeft @value=0,20709771951991
RotateRight @value=0,229543754006524
RotateLeft @value=0,203250563798123
LastValue = 0,218358597354199, Top.Value = 0,236504829598826, Right.Count = 28, Left.Count = 9
RotateRight @value=0,0999000217299443
RotateRight @value=0,161392807104342
LastValue = 0,0642934488431986, Top.Value = 0,236504829598826, Right.Count = 28, Left.Count = 10
RotateRight @value=0,154996359793002
RotateLeft @value=0,0999000217299443
LastValue = 0,148295871982489, Top.Value = 0,236504829598826, Right.Count = 28, Left.Count = 11
LastValue = 0,217621828065078, Top.Value = 0,236504829598826, Right.Count = 28, Left.Count = 12
RotateRight @value=0,599106418713511
LastValue = 0,553135806020878, Top.Value = 0,236504829598826, Right.Count = 29, Left.Count = 12
LastValue = 0,982277666210326, Top.Value = 0,236504829598826, Right.Count = 30, Left.Count = 12
RotateRight @value=0,8182332901369
LastValue = 0,803671114520948, Top.Value = 0,236504829598826, Right.Count = 31, Left.Count = 12
RotateRight @value=0,203250563798123
RotateRight @value=0,218358597354199
LastValue = 0,19310415405459, Top.Value = 0,236504829598826, Right.Count = 31, Left.Count = 13
LastValue = 0,0133136604043253, Top.Value = 0,236504829598826, Right.Count = 31, Left.Count = 14
RotateLeft @value=0,46832062046431
RotateRight @value=0,531275219531392
RotateRight @value=0,641024029180884
RotateRight @value=0,675667521858433
RotateRight @value=0,75742144871383
LastValue = 0,483394719419719, Top.Value = 0,236504829598826, Right.Count = 32, Left.Count = 14
RotateLeft @value=0,431767346538495
RotateRight @value=0,453910283024381
LastValue = 0,453370328738061, Top.Value = 0,236504829598826, Right.Count = 33, Left.Count = 14
LastValue = 0,762330518459124, Top.Value = 0,236504829598826, Right.Count = 34, Left.Count = 14
LastValue = 0,699010426969738, Top.Value = 0,236504829598826, Right.Count = 35, Left.Count = 14
</code></pre>
<p>Rotations happen not so much because the tree is unbalanced, but because of the priorities, which are randomly selected. For example, we get 4 rotations at the 13th insertion: the tree was balanced at 5/7 (which is fine), but ends up at 13/0!
It would seem that the use of random priorities deserves further investigation. Anyhow, it is plain to see that the random inserts cause a lot more rotations than the ordered inserts.</p>
<br /><br /><br /><h3>Answer 3:</h3><br /><p>I added calculation of the standard deviation, and changed your test to run at the highest priority (to reduce noise as much as possible). These are the results:</p>
<pre><code>Random                     Ordered
0,2835 (stddev 0,9946)     0,0891 (stddev 0,2372)
0,1230 (stddev 0,0086)     0,0780 (stddev 0,0031)
0,2498 (stddev 0,0662)     0,1694 (stddev 0,0145)
0,5136 (stddev 0,0441)     0,3550 (stddev 0,0658)
1,1704 (stddev 0,1072)     0,6632 (stddev 0,0856)
1,4672 (stddev 0,1090)     0,8343 (stddev 0,1047)
3,3330 (stddev 0,2041)     1,9272 (stddev 0,3456)
7,9822 (stddev 0,3906)     3,7871 (stddev 0,1459)
18,4300 (stddev 0,6112)    10,3233 (stddev 2,0247)
44,9500 (stddev 2,2935)    22,3870 (stddev 1,7157)
110,5275 (stddev 3,7129)   49,4085 (stddev 2,9595)
275,4345 (stddev 10,7154)  107,8442 (stddev 8,6200)
667,7310 (stddev 20,0729)  242,9779 (stddev 14,4033)
</code></pre>
<p>I ran a sampling profiler and here are the results (how often the program was found in each method):</p>
<pre><code>Method          Random  Ordered
HeapifyRight()  1.95    5.33
get_IsEmpty()   3.16    5.49
Make()          3.28    4.92
Insert()        16.01   14.45
HeapifyLeft()   2.20    0.00
</code></pre>
<p>Conclusion: the random input has a fairly even split between left and right rotations, while the ordered input never rotates left.</p>
<p>Here is my improved "benchmark" program:</p>
<pre><code>static void Main(string[] args)
{
    // Raise scheduling priority to reduce timing noise.
    Thread.CurrentThread.Priority = ThreadPriority.Highest;
    Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.RealTime;
    List<String> rndTimes = new List<String>();
    List<String> orderedTimes = new List<String>();
    rndTimes.Add(TimeIt(50, RandomInsert));
    rndTimes.Add(TimeIt(100, RandomInsert));
    rndTimes.Add(TimeIt(200, RandomInsert));
    rndTimes.Add(TimeIt(400, RandomInsert));
    rndTimes.Add(TimeIt(800, RandomInsert));
    rndTimes.Add(TimeIt(1000, RandomInsert));
    rndTimes.Add(TimeIt(2000, RandomInsert));
    rndTimes.Add(TimeIt(4000, RandomInsert));
    rndTimes.Add(TimeIt(8000, RandomInsert));
    rndTimes.Add(TimeIt(16000, RandomInsert));
    rndTimes.Add(TimeIt(32000, RandomInsert));
    rndTimes.Add(TimeIt(64000, RandomInsert));
    rndTimes.Add(TimeIt(128000, RandomInsert));
    orderedTimes.Add(TimeIt(50, OrderedInsert));
    orderedTimes.Add(TimeIt(100, OrderedInsert));
    orderedTimes.Add(TimeIt(200, OrderedInsert));
    orderedTimes.Add(TimeIt(400, OrderedInsert));
    orderedTimes.Add(TimeIt(800, OrderedInsert));
    orderedTimes.Add(TimeIt(1000, OrderedInsert));
    orderedTimes.Add(TimeIt(2000, OrderedInsert));
    orderedTimes.Add(TimeIt(4000, OrderedInsert));
    orderedTimes.Add(TimeIt(8000, OrderedInsert));
    orderedTimes.Add(TimeIt(16000, OrderedInsert));
    orderedTimes.Add(TimeIt(32000, OrderedInsert));
    orderedTimes.Add(TimeIt(64000, OrderedInsert));
    orderedTimes.Add(TimeIt(128000, OrderedInsert));
    // Pair the i-th random result with the i-th ordered result, one row per size.
    var result = string.Join("\n", (from s in rndTimes
                                    join s2 in orderedTimes
                                        on rndTimes.IndexOf(s) equals orderedTimes.IndexOf(s2)
                                    select String.Format("{0} \t\t {1}", s, s2)).ToArray());
    Console.WriteLine(result);
    Console.WriteLine("Done");
    Console.ReadLine();
}
// Population standard deviation, computed as sqrt(mean of squares - square of mean).
static double StandardDeviation(List<double> doubleList)
{
    double average = doubleList.Average();
    double sumOfSquares = 0;
    foreach (double value in doubleList)
    {
        sumOfSquares += (value) * (value);
    }
    double meanOfSquares = sumOfSquares / doubleList.Count;
    return Math.Sqrt(meanOfSquares - (average * average));
}
static String TimeIt(int insertCount, Action<int> f)
{
    Console.WriteLine("TimeIt({0}, {1})", insertCount, f.Method.Name);
    List<double> times = new List<double>();
    for (int i = 0; i < ITERATION_COUNT; i++)
    {
        Stopwatch sw = Stopwatch.StartNew();
        f(insertCount);
        sw.Stop();
        times.Add(sw.Elapsed.TotalMilliseconds);
    }
    return String.Format("{0:f4} (stddev {1:f4})", times.Average(), StandardDeviation(times));
}
</code></pre>
<br /><br /><br /><h3>Answer 4:</h3><br /><p>Yes, it's the number of rotations that is causing the extra time. Here's what I did:</p>
<ul><li>Removed the lines checking priority in <code>HeapifyLeft</code> and <code>HeapifyRight</code> so rotations are always done.</li>
<li>Added a <code>Console.WriteLine</code> after the <code>if</code> in <code>RotateLeft</code> and <code>RotateRight</code>.</li>
<li>Added a <code>Console.WriteLine</code> in the <code>IsEmpty</code> part of the <code>Insert</code> method to see what was being inserted.</li>
<li>Ran the test once with 5 values each.</li>
</ul><p>Output:</p>
<pre><code>TimeIt(5, RandomInsert)
Inserting 0.593302943554382
Inserting 0.348900582338171
RotateRight
Inserting 0.75496212381635
RotateLeft
RotateLeft
Inserting 0.438848891499848
RotateRight
RotateLeft
RotateRight
Inserting 0.357057290783644
RotateLeft
RotateRight
TimeIt(5, OrderedInsert)
Inserting 0.150707998383189
Inserting 1.58281302712057
RotateLeft
Inserting 2.23192588297274
RotateLeft
Inserting 3.30518679009061
RotateLeft
Inserting 4.32788012657682
RotateLeft
</code></pre>
<p>Result: twice as many rotations on random data (8 versus 4).</p>
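<p>A minimal sketch of a similar experiment, in Python rather than the question's C#. It assumes a conventional mutable treap with a min-heap on random priorities; the names <code>Node</code>, <code>insert</code>, <code>count_rotations</code> and <code>is_treap</code> are illustrative and do not come from the original code. Unlike the modified test above, it keeps the priority checks, so it counts only the rotations the treap actually needs:</p>

```python
import random

class Node:
    def __init__(self, key, priority):
        self.key, self.priority = key, priority
        self.left = self.right = None

def rotate_right(n):
    l = n.left
    n.left, l.right = l.right, n
    return l

def rotate_left(n):
    r = n.right
    n.right, r.left = r.left, n
    return r

def insert(node, key, rng, stats):
    """Insert key into the treap and count rotations by direction in stats."""
    if node is None:
        return Node(key, rng.random())
    if key < node.key:
        node.left = insert(node.left, key, rng, stats)
        if node.left.priority < node.priority:    # heap order violated on the left
            stats["right"] += 1
            node = rotate_right(node)
    else:
        node.right = insert(node.right, key, rng, stats)
        if node.right.priority < node.priority:   # heap order violated on the right
            stats["left"] += 1
            node = rotate_left(node)
    return node

def count_rotations(keys, seed=0):
    rng = random.Random(seed)
    stats = {"left": 0, "right": 0}
    root = None
    for k in keys:
        root = insert(root, k, rng, stats)
    return root, stats

def is_treap(n, lo=float("-inf"), hi=float("inf")):
    """Check the search-tree order on keys and the min-heap order on priorities."""
    if n is None:
        return True
    if not (lo <= n.key <= hi):
        return False
    for child in (n.left, n.right):
        if child is not None and child.priority < n.priority:
            return False
    return is_treap(n.left, lo, n.key) and is_treap(n.right, n.key, hi)

n = 2000
shuffled = random.Random(1).sample(range(n), n)
random_root, random_stats = count_rotations(shuffled)
ordered_root, ordered_stats = count_rotations(range(n))
print("random :", random_stats)
print("ordered:", ordered_stats)
```

<p>In this sketch sorted input can only trigger left rotations, because every new key descends along the right spine. The totals depend on the random priorities, so treat the printed numbers as an illustration rather than a benchmark.</p>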
<br /><br /><br /><h3>Answer 5:</h3><br /><p>You're only seeing a difference of about 2x. Unless you've tuned the daylights out of this code, that's basically in the noise. Most well-written programs, especially those involving data structures, can easily have more room for improvement than that. Here's an example.</p>
<p>I just ran your code and took a few stackshots. Here's what I saw:</p>
<p>Random Insert:</p>
<pre><code>1 Insert:64 -> HeapifyLeft:81 -> RotateRight:150
1 Insert:64 -> Make:43 -> Treap:35
1 Insert:68 -> Make:43
</code></pre>
<p>Ordered Insert:</p>
<pre><code>1 Insert:61
1 OrderedInsert:224
1 Insert:68 -> Make:43
1 Insert:68 -> HeapifyRight:90 -> RotateLeft:107
1 Insert:68
1 Insert:68 -> Insert:55 -> IsEmpty.get:51
</code></pre>
<p>This is a pretty small number of samples, but it suggests in the case of random input that Make (line 43) is consuming a higher fraction of time. That is this code:</p>
<pre><code>private Treap<T> Make(Treap<T> left, T value, Treap<T> right, int priority)
{
    return new Treap<T>(Comparer, left, value, right, priority);
}
</code></pre>
<p>I then took 20 stackshots of the Random Insert code to get a better idea of what it was doing:</p>
<pre><code>1 Insert:61
4 Insert:64
3 Insert:68
2 Insert:68 -> Make:43
1 Insert:64 -> Make:43
1 Insert:68 -> Insert:57 -> Make:48 -> Make:43
2 Insert:68 -> Insert:55
1 Insert:64 -> Insert:55
1 Insert:64 -> HeapifyLeft:81 -> RotateRight:150
1 Insert:64 -> Make:43 -> Treap:35
1 Insert:68 -> HeapifyRight:90 -> RotateLeft:107 -> IsEmpty.get:51
1 Insert:68 -> HeapifyRight:88
1 Insert:61 -> AnonymousMethod:214
</code></pre>
<p>This reveals some information.<br />
25% of time is spent in line Make:43 or its callees.<br />
15% of time is spent in that line, not in a recognized routine, in other words, in <code>new</code> making a new node.<br />
90% of time is spent in lines Insert:64 and 68 (which call Make and Heapify).<br />
10% of time is spent in RotateLeft and RotateRight.<br />
15% of time is spent in Heapify or its callees.</p>
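<p>The inclusive percentages can be reproduced from the 20 stackshots with a few lines of Python. This is only a sketch of the arithmetic: the <code>samples</code> list is transcribed from the listing above, and <code>inclusive_percent</code> is a hypothetical helper that counts a sample once if any frame on its stack matches.</p>

```python
# The 20 stackshots listed above: (count, call stack from outermost to innermost frame).
samples = [
    (1, ["Insert:61"]),
    (4, ["Insert:64"]),
    (3, ["Insert:68"]),
    (2, ["Insert:68", "Make:43"]),
    (1, ["Insert:64", "Make:43"]),
    (1, ["Insert:68", "Insert:57", "Make:48", "Make:43"]),
    (2, ["Insert:68", "Insert:55"]),
    (1, ["Insert:64", "Insert:55"]),
    (1, ["Insert:64", "HeapifyLeft:81", "RotateRight:150"]),
    (1, ["Insert:64", "Make:43", "Treap:35"]),
    (1, ["Insert:68", "HeapifyRight:90", "RotateLeft:107", "IsEmpty.get:51"]),
    (1, ["Insert:68", "HeapifyRight:88"]),
    (1, ["Insert:61", "AnonymousMethod:214"]),
]

def inclusive_percent(samples, *prefixes):
    """Percentage of samples whose stack has a frame starting with any of the prefixes."""
    total = sum(count for count, _ in samples)
    hits = sum(
        count
        for count, stack in samples
        if any(frame.startswith(p) for frame in stack for p in prefixes)
    )
    return 100 * hits / total

print(inclusive_percent(samples, "Make:43"))                 # 25.0
print(inclusive_percent(samples, "Insert:64", "Insert:68"))  # 90.0
print(inclusive_percent(samples, "Rotate"))                  # 10.0
print(inclusive_percent(samples, "Heapify"))                 # 15.0
```

<p>With only 20 samples each figure moves in steps of 5%, so these are rough fractions rather than precise measurements.</p>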
<p>I also did a fair amount of single-stepping (at the source level), and came to the suspicion that, since the tree is immutable, it spends a lot of time making new nodes because it doesn't want to change old ones. Then the old ones are garbage collected because nobody refers to them anymore.</p>
<p>This has got to be inefficient.</p>
<p>I'm still not answering your question of why inserting ordered numbers is faster than randomly generated numbers, but it doesn't really surprise me, because the tree is immutable.</p>
<p>I don't think you can expect any performance reasoning about tree algorithms to carry over easily to immutable trees, because the slightest change deep in the tree causes it to be rebuilt on the way back out, at a high cost in <code>new</code>-ing and garbage collection.</p>
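<p>To make the rebuilding cost concrete, here is a small sketch in Python of insertion into an immutable binary search tree. It is not the question's treap (there are no priorities or rotations), and <code>make</code>, <code>insert</code> and the <code>allocations</code> counter are illustrative names. It only shows that one insertion allocates a new node for every ancestor on the insertion path, while untouched subtrees are shared with the old tree:</p>

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

allocations = 0

def make(key, left=None, right=None):
    """Allocate a node and count the allocation."""
    global allocations
    allocations += 1
    return Node(key, left, right)

def insert(node, key):
    """Return a new tree that contains key; the input tree is never modified."""
    if node is None:
        return make(key)
    if key < node.key:
        return make(node.key, insert(node.left, key), node.right)
    return make(node.key, node.left, insert(node.right, key))

def to_list(node):
    return [] if node is None else to_list(node.left) + [node.key] + to_list(node.right)

old = None
for k in [50, 30, 70, 20, 40]:
    old = insert(old, k)

before = allocations
new = insert(old, 35)            # insertion path: 50 -> 30 -> 40 -> new leaf
print(allocations - before)      # 4: three copied ancestors plus the new leaf
print(new.right is old.right)    # True: the untouched subtree is shared
```

<p>The copied ancestors of the old version become garbage as soon as nothing refers to the old root, which matches the allocation and collection cost described above.</p>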
<br /><br /><br /><h3>Answer 6:</h3><br /><p>@Guge is right.
However, there is a little more to it.
I am not saying that it is the biggest factor in this case, but it is there and it is hard to do anything about.</p>
<p>For sorted input, lookups are likely to touch nodes that are already hot in the cache.
(This is true in general for balanced trees like AVL trees, red-black trees, B-trees, etc.)</p>
<p>Since inserts start with a lookup, this affects insert/delete performance as well.</p>
<p>Again, I am not claiming that it is the most significant factor in each and every case.
It is there, however, and will likely result in sorted inputs always being faster than random ones in these data structures.</p>
<br /><br /><br /><h3>Answer 7:</h3><br /><p>Aaronaught has done a really decent job explaining this.</p>
<p>For these two special cases, I find it easier to grasp in terms of insertion path lengths.</p>
<p>For random input, your insertion path goes down to one of the leaves, and the length of the path, and thus the number of rotations, is bounded by the height of the tree.</p>
<p>In the sorted case, you walk down the right spine of the treap, and the bound is the length of the spine, which is less than or equal to the height.</p>
<p>Since you rotate nodes along the insertion path, and your insertion path is the spine in this case, these rotations will always shorten the spine (which results in a shorter insertion path at the next insertion, since that path is again just the spine, and so on).</p>
<p>Edit: for the random case the insertion path is 1.75x longer.</p>
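<p>The path-length argument can be checked with a short Python sketch. It assumes a plain mutable treap (min-heap on random priorities) instead of the question's immutable C# one, and <code>insert</code>, <code>right_spine</code>, <code>height</code> and <code>average_path</code> are illustrative names. <code>insert</code> returns how many nodes it visited on the way down, so the two input orders can be compared directly:</p>

```python
import random

class Node:
    __slots__ = ("key", "priority", "left", "right")

    def __init__(self, key, priority):
        self.key, self.priority = key, priority
        self.left = self.right = None

def insert(node, key, rng):
    """Treap insert; returns (new subtree root, number of nodes visited on the way down)."""
    if node is None:
        return Node(key, rng.random()), 0
    if key < node.key:
        node.left, depth = insert(node.left, key, rng)
        if node.left.priority < node.priority:    # rotate right
            child = node.left
            node.left, child.right = child.right, node
            node = child
    else:
        node.right, depth = insert(node.right, key, rng)
        if node.right.priority < node.priority:   # rotate left
            child = node.right
            node.right, child.left = child.left, node
            node = child
    return node, depth + 1

def height(n):
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

def right_spine(n):
    length = 0
    while n is not None:
        length, n = length + 1, n.right
    return length

def average_path(keys, seed=0):
    rng, root, total = random.Random(seed), None, 0
    for k in keys:
        root, depth = insert(root, k, rng)
        total += depth
    return root, total / len(keys)

n = 5000
_, ordered_avg = average_path(list(range(n)))
_, random_avg = average_path(random.Random(1).sample(range(n), n))
print("average insertion path, ordered:", ordered_avg)
print("average insertion path, random :", random_avg)
```

<p>For sorted input the number of visited nodes equals the length of the right spine before the insertion, which is the bound described above. The ratio between the two averages depends on <code>n</code> and on the random priorities, so this sketch does not attempt to reproduce the 1.75x figure.</p>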
<br /><br /><br /><h3>Answer 8:</h3><br /><p>Try this: a database built on a treap.</p>
<p>http://code.google.com/p/treapdb/</p>
<br /><br /><p>Source: <code>https://stackoverflow.com/questions/2437733/why-is-insertion-into-my-tree-faster-on-sorted-input-than-random-input</code></p></div>
<div class="field field--name-field-tags field--type-entity-reference field--label-above">
<div class="field--label">Tags</div>
<div class="field--items">
<div class="field--item"><a href="https://www.e-learn.cn/tag/c" hreflang="zh-hans">c#</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/performance" hreflang="zh-hans">performance</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/data-structures" hreflang="zh-hans">data-structures</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/treap" hreflang="zh-hans">treap</a></div>
</div>
</div>
Wed, 04 Dec 2019 00:11:15 +0000 旧城冷巷雨未停 1542011 at https://www.e-learn.cn
Treap with implicit keys
https://www.e-learn.cn/topic/336238
<span>Treap with implicit keys</span>
<span><span lang="" about="https://www.e-learn.cn/user/165" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">自作多情</span></span>
<span>2019-11-27 14:57:22</span>
<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br /><p>There's a data structure called a treap: a randomized binary search tree that is also a heap on randomly generated "priorities".</p>
<p>There's a variation of this structure where the keys are implicit: they aren't stored in the tree; instead, a node's position in the in-order traversal serves as its key. Each node stores the size of its subtree instead of a key. This technique lets us treat the treap as a kind of array that supports many operations in O(log N) time: insertion, deletion, reversal of a subarray, updates on an interval, and so on.</p>
<p>I know a bit about this structure, but not much. I tried to google it, but I found only lots of articles about the treap itself and nothing about this "implicit treap" / "indexed list". I don't even know its name, because my native language isn't English and the lecture I attended used the native term for the structure, not the original English one. The native term can be translated directly into English as "Treap on the implicit keys" or "Cartesian tree on the implicit keys".</p>
<p>Can anybody point me to an article about this structure or tell me its original name? Thank you.</p>
<p>P.S. Sorry if my English wasn't understandable enough.</p>
<p><strong>UPD:</strong> Some extra explanation of the structure I'm looking for.</p>
<p>Consider a usual treap with randomly chosen priorities and with keys that are the actual user data stored in the tree. Now imagine that every node also stores some other user info, so the keys are nothing but search keys. The next step is to calculate and maintain the subtree size in each node: we have to update it after every Merge/Split/Add/Remove, but it lets us find, for example, the Kth element of the tree in O(log N) time.</p>
<p>Once we have subtree sizes in each node, we can throw the keys away and imagine that the treap represents an array of user data in in-order traversal. The array index of each element can easily be calculated from the subtree sizes. Now we can add or remove an element in the middle of the array, or split the array, all in O(log N) time.</p>
<p>We can also perform "multiple" operations, for example adding a constant value to all elements of our "array". To implement this, we make the operation delayed: every node gets a parameter holding a delayed constant that has to be added "later" to all the elements of this node's subarray, and we "push" the changes from top to bottom as necessary. Adding a constant to a subarray, or painting (marking) a subarray, can be delayed in this way, as can reversing a subarray (here the delayed info in the node is the bit "subarray has to be reversed"), and so on.</p>
<p><strong>UPD2:</strong> Here's a code snippet, one piece of the small amount of information I've found. Never mind the Cyrillic :) The words "с неявным ключом" translate directly as "with implicit key".</p>
<br /><h3>Answer 1:</h3><br /><p>You can find this data structure in the paper by Kaplan and Verbin on sorting signed permutations by reversals (page 7, section 3.1): http://www.math.jussieu.fr/~fouquet/ENSEIGNEMENT/PROJETS/kaplan.pdf.</p>
<br /><br /><br /><h3>Answer 2:</h3><br /><p>The key idea (no pun intended!) in treaps is to use keys, which are randomized. If you remove the keys, I don't see how you can have a treap, so perhaps I misunderstood your question. Or perhaps you are referring to the alternative to treaps, the <em>randomized binary search tree</em>. Both data structures use the same idea: you can attain average-case complexity by making sure your tree looks like an average tree (instead of a pathological case).</p>
<p>With the treaps, you do this using random priorities and balancing.</p>
<p>With randomized binary search trees, the randomness is introduced solely during construction: when you insert a node into tree T, it has probability 1/(size(T) + 1) of ending up at the root, where size(T) is the number of nodes of T; and of course if the node is not inserted at the root, you continue recursively until it is added. (See the articles by C. Martínez for a detailed study of these trees.)</p>
<p>This data structure behaves exactly like a treap, but uses a different mechanism that does not require keys.</p>
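<p>A minimal Python sketch of that insertion rule, assuming the usual approach of inserting at the root by splitting the subtree around the new key. The names <code>split</code>, <code>insert</code> and <code>inorder</code> are illustrative, and this is written from the description above rather than transcribed from the cited articles:</p>

```python
import random

class Node:
    def __init__(self, key):
        self.key, self.size = key, 1
        self.left = self.right = None

def size(t):
    return t.size if t else 0

def update(t):
    t.size = 1 + size(t.left) + size(t.right)
    return t

def split(t, key):
    """Split t into (keys < key, keys >= key)."""
    if t is None:
        return None, None
    if t.key < key:
        t.right, high = split(t.right, key)
        return update(t), high
    low, t.left = split(t.left, key)
    return low, update(t)

def insert(t, key, rng):
    # The new key becomes the root of this subtree with probability 1/(size(t) + 1).
    # For an empty subtree randrange(1) is always 0, so the recursion ends there.
    if rng.randrange(size(t) + 1) == 0:
        node = Node(key)
        node.left, node.right = split(t, key)
        return update(node)
    if key < t.key:
        t.left = insert(t.left, key, rng)
    else:
        t.right = insert(t.right, key, rng)
    return update(t)

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

def height(t):
    return 0 if t is None else 1 + max(height(t.left), height(t.right))

rng = random.Random(0)
root = None
for k in range(1000):            # sorted input, the worst case for a plain BST
    root = insert(root, k, rng)
print("size:", root.size, "height:", height(root))
```

<p>Only subtree sizes are stored; no per-node priority is needed, because the random choice is made afresh at each step of the descent.</p>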
<p>If this is not what you were looking for, perhaps you could share some additional information: did your lecturer mention anybody who might have worked on this structure, where did you hear this lecture, and what is his/your nationality? It might not seem like it, but knowing your native tongue could be an important clue, as algorithms and data structures can often be traced to the country where they originated.</p>
<br /><br /><br /><h3>Answer 3:</h3><br /><p>Maybe you are looking for a Rope (a complex form of string), modified to your needs for delayed operations. Interestingly, there is an open question regarding ropes right here right now.</p>
<br /><br /><br /><h3>Answer 4:</h3><br /><p>I don't think there is a name for that data structure, since it is simply a combination of two orthogonal concepts. You could use implicit keys like this with just about any self-balancing tree data structure.</p>
<p>You might want to take a look at Scapegoat trees, since they use the subtree size already for rebalancing and do not require any per-node overhead. </p>
<br /><br /><p>Source: <code>https://stackoverflow.com/questions/3497875/treap-with-implicit-keys</code></p></div>
<div class="field field--name-field-tags field--type-entity-reference field--label-above">
<div class="field--label">Tags</div>
<div class="field--items">
<div class="field--item"><a href="https://www.e-learn.cn/tag/algorithm" hreflang="zh-hans">algorithm</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/data-structures" hreflang="zh-hans">data-structures</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/binary-tree" hreflang="zh-hans">binary-tree</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/key" hreflang="zh-hans">key</a></div>
<div class="field--item"><a href="https://www.e-learn.cn/tag/treap" hreflang="zh-hans">treap</a></div>
</div>
</div>
Wed, 27 Nov 2019 06:57:22 +0000 自作多情 336238 at https://www.e-learn.cn