b-tree

In what order should you insert a set of known keys into a B-Tree to get minimal height?

一曲冷凌霜 posted on 2019-11-28 17:52:25
Question: Given a fixed number of keys or values (stored in an array or some other data structure) and the order of a B-tree, can we determine the insertion sequence that would produce a space-efficient B-tree? To illustrate, consider a B-tree of order 3 and the keys {1,2,3,4,5,6,7}. Inserting the elements in the following order, for(int i=1; i<8; ++i) { tree.push(i); } would create a tree like this: 4 / 2 6 / 1 3 5 7 (root 4; second level 2 and 6; leaves 1, 3, 5 and 7), see http://en.wikipedia.org/wiki/B-tree But inserting elements in this way flag
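As a rough yardstick for "space efficient", the smallest possible height can be computed before picking any insertion order. The sketch below assumes "order" means the maximum number of children per node (so each node holds at most order - 1 keys); the function name min_btree_height is made up for this illustration.

```python
def min_btree_height(num_keys: int, order: int) -> int:
    """Return the smallest number of levels that can hold num_keys keys.

    Assumption: `order` is the maximum number of children per node, so a
    completely full tree with h levels stores order**h - 1 keys.
    """
    if num_keys < 0 or order < 2:
        raise ValueError("num_keys must be >= 0 and order must be >= 2")
    height = 0
    capacity = 0  # keys that fit into a full tree with `height` levels
    while capacity < num_keys:
        height += 1
        capacity = order ** height - 1
    return height


if __name__ == "__main__":
    # Keys 1..7 in a B-tree of order 3, as in the question.
    print(min_btree_height(7, 3))
```

For the seven keys of the question this gives two levels, one fewer than the three-level tree produced by inserting 1 through 7 in sorted order.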

What is a good open source B-tree implementation in C? [closed]

久未见 posted on 2019-11-28 17:00:54
I am looking for a lean and well-constructed open source implementation of a B-tree library written in C. It needs to be under a non-GPL license so that it can be used in a commercial application. Ideally, this library supports storing and manipulating the B-tree index as a disk file, so that large trees can be built using a configurable (i.e. minimal) RAM footprint. Note: Since there seemed to be some confusion, a Binary Tree and a B-Tree are not the same thing. Paul Check out QDBM: http://fallabs.com/qdbm/ . It's LGPL (so it can be used in a commercial app), implements a disk backed hash and/or B+

Are POSIX' read() and write() system calls atomic?

我怕爱的太早我们不能终老 posted on 2019-11-28 11:43:35
I am trying to implement a database index based on the data structure (B-link tree) and algorithms suggested by Lehman and Yao in this paper. On page 2, the authors state that: The disk is partitioned in sections of fixed size (physical pages; in this paper, these correspond to the nodes of the tree). These are the only units that can be read or written by a process. [emphasis mine] (...) (...) a process is allowed to lock and unlock a disk page. This lock gives that process exclusive modification rights to that page; also, a process must have a page locked in order to modify that page. (...)
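To make the page-as-unit model from the quoted passage concrete, here is a minimal sketch of page-granular I/O with advisory byte-range locks on a Unix system. It does not settle whether read() and write() are atomic; it only shows one way to lock a page before touching it. PAGE_SIZE and the helper names read_page and write_page are assumptions made for this example, and POSIX record locks of this kind coordinate separate processes, not threads inside one process.

```python
import fcntl
import os

PAGE_SIZE = 4096  # assumed page size; the quoted passage does not fix one


def write_page(fd: int, page_no: int, data: bytes) -> None:
    """Write one full page while holding an exclusive lock on its byte range."""
    if len(data) != PAGE_SIZE:
        raise ValueError("a page must be exactly PAGE_SIZE bytes")
    offset = page_no * PAGE_SIZE
    fcntl.lockf(fd, fcntl.LOCK_EX, PAGE_SIZE, offset, os.SEEK_SET)
    try:
        written = 0
        while written < PAGE_SIZE:  # pwrite may write fewer bytes than asked
            written += os.pwrite(fd, data[written:], offset + written)
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, PAGE_SIZE, offset, os.SEEK_SET)


def read_page(fd: int, page_no: int) -> bytes:
    """Read one page under a shared lock; returns b"" past the end of file."""
    offset = page_no * PAGE_SIZE
    fcntl.lockf(fd, fcntl.LOCK_SH, PAGE_SIZE, offset, os.SEEK_SET)
    try:
        chunks = []
        done = 0
        while done < PAGE_SIZE:  # pread may also return short reads
            chunk = os.pread(fd, PAGE_SIZE - done, offset + done)
            if not chunk:
                break
            chunks.append(chunk)
            done += len(chunk)
        return b"".join(chunks)
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN, PAGE_SIZE, offset, os.SEEK_SET)
```

The file descriptor is expected to come from os.open with os.O_RDWR, since an exclusive lockf lock needs a descriptor opened for writing.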

B Tree

泪湿孤枕 posted on 2019-11-28 07:59:06
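Most database systems and file systems today use the B-Tree or its variant, the B+Tree, as their index structure. The next section of this article draws on how memory works and how computers access storage to discuss why B-Trees and B+Trees are so widely used for indexing; this section describes them purely as data structures. B-Tree: To describe a B-Tree, first define a data record as a pair [key, data], where key is the record's key value, distinct for every record, and data is everything in the record other than the key. A B-Tree is then a data structure that satisfies the following conditions: d is a positive integer greater than 1, called the degree of the B-Tree. h is a positive integer, called the height of the B-Tree. Each non-leaf node consists of n-1 keys and n pointers, where d <= n <= 2d. Each leaf node contains at least one key and two pointers and at most 2d-1 keys and 2d pointers; all pointers of a leaf node are null. All leaf nodes have the same depth, equal to the tree height h. Keys and pointers alternate, with a pointer at each end of the node. The keys in a node are arranged in non-decreasing order from left to right. All nodes together form a tree. Each pointer is either null or points to another node. If a pointer is the leftmost one in a node and is not null, all keys of the node it points to are less than v(key1), where v(key1) is the value of the node's first key. If a pointer is the rightmost one in a node and is not null, all keys of the node it points to are greater than v(keym), where v(keym) is the value of the node's last key.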
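The conditions listed above can be turned into a small checker. This is only a sketch: the names Node and is_valid_btree are invented here, a leaf is modelled as a node with an empty child list (all pointers null), only the leftmost and rightmost pointer rules quoted in the excerpt are checked, and the root is allowed as few as two pointers, which is a common convention that the excerpt itself does not state.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    keys: List[int]
    # An empty list stands for a leaf, whose pointers are all null.
    children: List["Node"] = field(default_factory=list)


def leaf_depths(node: Node, depth: int = 1) -> Set[int]:
    """Collect the depths at which leaves occur below `node`."""
    if not node.children:
        return {depth}
    depths: Set[int] = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def node_ok(node: Node, d: int, is_root: bool) -> bool:
    keys = node.keys
    # Keys are non-decreasing from left to right.
    if any(a > b for a, b in zip(keys, keys[1:])):
        return False
    if not node.children:
        # Leaf: between 1 and 2d-1 keys.
        return 1 <= len(keys) <= 2 * d - 1
    n = len(node.children)
    # Non-leaf: n-1 keys and n pointers.
    if len(keys) != n - 1:
        return False
    min_pointers = 2 if is_root else d  # assumption: the root is exempt from d <= n
    if not min_pointers <= n <= 2 * d:
        return False
    # Leftmost pointer: keys of the pointed-to node are less than the first key.
    if any(k >= keys[0] for k in node.children[0].keys):
        return False
    # Rightmost pointer: keys of the pointed-to node are greater than the last key.
    if any(k <= keys[-1] for k in node.children[-1].keys):
        return False
    return all(node_ok(child, d, False) for child in node.children)


def is_valid_btree(root: Node, d: int) -> bool:
    """Check the quoted conditions, including equal depth for all leaves."""
    return d > 1 and len(leaf_depths(root)) == 1 and node_ok(root, d, True)
```

For example, with d = 2 a root holding key 4 over the leaves [1, 2, 3] and [5, 6, 7] passes, while a leaf holding four keys fails the 2d-1 limit.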

Best data structure for crossword puzzle search

倖福魔咒の posted on 2019-11-28 07:47:28
I have a large database for solving crossword puzzles, consisting of a word and a description. My application allows searching for words of a specific length with characters at specific positions (this is done the hard way ... go through all words and check each), plus a search by description (if necessary). For instance, find the word _ _ A _ _ B (a 6-letter word, third character A and last B). I would like to index the words in such a way that searching would be really fast. My first idea was to use a balanced tree structure; any other suggestions? Okay, I am going to propose something weird, but
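One common way to avoid scanning every word, not necessarily what the truncated answer goes on to propose, is an inverted index keyed by (word length, position, letter), with a pattern answered by intersecting the matching sets. The class name PatternIndex and the use of "_" for an unknown letter are choices made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class PatternIndex:
    """Index words by (length, position, letter) for fixed-length pattern search."""

    def __init__(self, words: Iterable[str]) -> None:
        self._by_length: Dict[int, Set[str]] = defaultdict(set)
        self._by_slot: Dict[Tuple[int, int, str], Set[str]] = defaultdict(set)
        for word in words:
            word = word.upper()
            self._by_length[len(word)].add(word)
            for position, letter in enumerate(word):
                self._by_slot[(len(word), position, letter)].add(word)

    def search(self, pattern: str) -> List[str]:
        """Return sorted matches; "_" in the pattern means any letter."""
        pattern = pattern.upper()
        length = len(pattern)
        candidate_sets = [
            self._by_slot.get((length, position, letter), set())
            for position, letter in enumerate(pattern)
            if letter != "_"
        ]
        if not candidate_sets:
            # No fixed letters: every word of that length matches.
            return sorted(self._by_length.get(length, set()))
        candidate_sets.sort(key=len)  # start from the smallest set
        result = set(candidate_sets[0])
        for other in candidate_sets[1:]:
            result &= other
        return sorted(result)


if __name__ == "__main__":
    index = PatternIndex(["scarab", "planet", "throb", "cherub"])
    print(index.search("__A__B"))
```

The index costs one set entry per letter of every word, which is the price paid for skipping the full scan.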

The B-tree Structure of Indexes

醉酒当歌 posted on 2019-11-28 07:21:26
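Index data structures. The data structures commonly used for indexes are: 1. Hash 2. B-tree. These are the two data structures MySQL uses to store indexes, and B-tree is the one MySQL uses more often. The B-tree structure is as follows: (figure omitted) A B-tree is made up of nodes and edges; one node stores several keys, and each key also maps to a record address. The node structure is as follows: (figure omitted) When querying by surname and given name, the surname is looked up first; if that surname lies in a child node, the search descends into that child. Because of the leftmost-prefix rule, a query on the given name alone cannot use the index: if the given name is "Youwei" (有为), the matches could include Kang Youwei (康有为), Li Youwei (李有为), Jiang Youwei (将有为) and many others, so the given name by itself cannot serve as the index. Source: https://blog.csdn.net/shanjairui/article/details/100023460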
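The surname and given name argument can be imitated with a sorted list of (surname, given name) pairs standing in for the index. This is only an analogy under that assumption, not MySQL code, and the function names are made up: a surname lookup can binary-search to the right place, while a lookup by given name has to visit every entry.

```python
import bisect
from typing import List, Tuple

Entry = Tuple[str, str]  # (surname, given name), sorted like a composite index


def find_by_surname(index: List[Entry], surname: str) -> List[Entry]:
    """Binary search to the first entry with this surname, then read forward."""
    start = bisect.bisect_left(index, (surname,))
    matches = []
    for entry in index[start:]:
        if entry[0] != surname:
            break
        matches.append(entry)
    return matches


def find_by_given_name(index: List[Entry], given_name: str) -> List[Entry]:
    """No ordering on the second column alone, so every entry is checked."""
    return [entry for entry in index if entry[1] == given_name]


if __name__ == "__main__":
    index = sorted([
        ("Kang", "Youwei"),
        ("Li", "Youwei"),
        ("Jiang", "Youwei"),
        ("Li", "Bai"),
    ])
    print(find_by_surname(index, "Li"))
    print(find_by_given_name(index, "Youwei"))
```

Entries sharing a given name end up scattered across the sorted order, which is why only the leading column narrows the search.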

Looking for a disk-based B+ tree implementation in C++ or C [closed]

点点圈 posted on 2019-11-28 04:32:29
I am looking for a lightweight open source paging B+ tree implementation that uses a disk file for storing the tree. So far I have found only memory-based implementations, or something that has a dependency on Qt (?!) and does not even compile. Modern C++ is preferred, but C will do too. I prefer to avoid a full embeddable DBMS solution, because: 1) for my needs a bare-bones index that can use the simplest possible disk file organization is enough, with no need for concurrency, atomicity and everything else. 2) I am using this to prototype my own index, and most likely will change some of the algorithms

Algorithm to find k-th key in a B-tree?

大兔子大兔子 posted on 2019-11-28 02:24:38
Question: I'm trying to understand how I should think about getting the k-th key/element in a B-tree. Even if it's steps instead of code, it will still help a lot. Thanks. Edit: To clear this up, I'm asking for the k-th smallest key in the B-tree. Answer 1: There's no efficient way to do it using a standard B-tree. Broadly speaking, I see 2 options: Convert the B-tree to an order statistic tree to allow for this operation in O(log n). That is, for each node, keep a variable representing the size (number of
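The order statistic idea from the answer can be sketched as follows, assuming each node caches the number of keys in its subtree. The Node class, its size field and kth_smallest are illustrative names; a real B-tree would also have to update the sizes on every insert and delete, which is left out here.

```python
from typing import List, Optional


class Node:
    """B-tree node augmented with the number of keys in its subtree."""

    def __init__(self, keys: List[int], children: Optional[List["Node"]] = None) -> None:
        self.keys = keys
        self.children = children or []  # empty for a leaf
        self.size = len(keys) + sum(child.size for child in self.children)


def kth_smallest(node: Node, k: int) -> int:
    """Return the k-th smallest key (1-based) below `node`."""
    if not 1 <= k <= node.size:
        raise IndexError("k is out of range")
    if not node.children:
        return node.keys[k - 1]
    for i, key in enumerate(node.keys):
        left_size = node.children[i].size
        if k <= left_size:
            # The answer lies inside the subtree to the left of this key.
            return kth_smallest(node.children[i], k)
        k -= left_size
        if k == 1:
            return key
        k -= 1  # skip the key itself
    return kth_smallest(node.children[-1], k)


if __name__ == "__main__":
    root = Node([3, 6], [Node([1, 2]), Node([4, 5]), Node([7, 8])])
    print([kth_smallest(root, k) for k in range(1, 9)])
```

Each level inspects the sizes of at most one node's children, so the search follows a single root-to-leaf path instead of walking k keys in order.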

Are there any tools to estimate index size in MongoDB?

只谈情不闲聊 posted on 2019-11-27 19:32:37
I'm looking for a tool to get a decent estimate of how large a MongoDB index will be based on a few signals like: how many documents are in my collection; the size of the indexed field(s); the size of the _id I'm using if not ObjectId; Geo/Non-geo. Has anyone stumbled across something like this? I can imagine it would be extremely useful given Mongo's performance degradation once it hits the memory wall and documents start getting paged out to disk. If I have a functioning database and want to add another index, the only way I'll know if it will be too big is to actually add it. It wouldn't need to be

11. Index Principles in MySQL - A Systematic Walk Through MySQL

心已入冬 posted on 2019-11-27 17:39:57
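Index types supported by MySQL. MySQL supports several index types, and each storage engine supports them to a different degree. MySQL supports the following four index types; the table below shows the support in each engine:

Index      MyISAM         InnoDB         Memory
B-Tree     supported      supported      supported
HASH       not supported  not supported  supported
R-Tree     supported      not supported  not supported
Full-Text  supported      not supported  not supported

B-Tree is the index type used most often and is supported by most engines. Hash is currently supported only by Memory; it is efficient for equality lookups but does not support range queries. R-Tree is a spatial index, implemented in MyISAM and used for geographic location queries. Full-Text is a full-text index; it used to be supported only in MyISAM, and InnoDB supports it as well from MySQL 5.6 onward. Supported index rules: 1. Leftmost-prefix rule: when using a composite index on a, b, c, the SQL query has to use the columns in left-to-right order. It may use only some of the columns (for example, a lookup on a alone can still use the index), but the order cannot be reversed. 2. Query only the columns that are in the index, for example: -- A simple login query; assume username has a unique index and id is the primary key SELECT * FROM user WHERE username = 'fuckphp' AND password = 'fuckphp'; -- The query above is less efficient than the one below SELECT id FROM user WHERE username =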
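Both rules can be observed with a query planner. The sketch below uses SQLite through Python's built-in sqlite3 module purely because it needs no server; it is not MySQL, and the exact plan wording differs between engines and versions. The table t, the index idx_abc and the helper query_plan are made up for the example.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INT, b INT, c INT, d TEXT)")
conn.execute("CREATE INDEX idx_abc ON t (a, b, c)")  # composite index on a, b, c

# Rule 1: a leading-column lookup can use the composite index ...
plan_leading = query_plan(conn, "SELECT * FROM t WHERE a = 1")
# ... while a lookup that skips the leading column cannot.
plan_skipped = query_plan(conn, "SELECT * FROM t WHERE b = 2")
# Rule 2: selecting only indexed columns lets the index answer the query alone.
plan_covering = query_plan(conn, "SELECT id FROM t WHERE a = 1")

if __name__ == "__main__":
    for plan in (plan_leading, plan_skipped, plan_covering):
        print(plan)
```

In MySQL the equivalent check would be running EXPLAIN on the query and reading which key, if any, the optimizer chose.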