Looking for a disk-based B+ tree implementation in C++ or C [closed]

Submitted by 点点圈 on 2019-11-28 04:32:29

http://people.csail.mit.edu/jaffer/WB

You can also consider reusing the B-tree implementation from an open source embeddable database (BDB, SQLite, etc.).

FairCom's C-Tree Plus has been available commercially for over 20 years. (I don't work for them, etc.)

There is also Berkeley DB, which was bought by Oracle but is still free from their site.

My own implementation is under http://www.die-schoens.de/prg (the license is Apache). It's disk-based and maps to shared memory, where it can also do locking (i.e. multiuser); the file format protects against crashes, etc. All of the above can easily be switched off (at compile time or at runtime, if you like). So the bare-bones version would be almost ANSI C, basically caching in one's own memory and not locking at all. A test program is included. Currently, it only deals with fixed-size fields, but I am working on that...

I second the suggestion for Berkeley DB. I used it before it was bought by Oracle. It is not a full relational database; it just stores key-value pairs. We switched to it after writing our own paging B-tree implementation. It was a good learning experience, but we kept adding features until it was just a (poorly) implemented version of BDB.

If you want to do it yourself, here is an outline of what we did. We used mmap to map pages into memory. The structure of each page was index-based, so with the page start address you could access any element on the page. Then we mapped and unmapped pages as necessary. We were indexing multi-GB text files, back when 1 GB of main memory was considered a lot.
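As a rough illustration of that outline, here is a minimal sketch using POSIX mmap. It is not the code we actually wrote: the PageHeader layout, the slot limit, and every function name below are hypothetical, and it assumes one B-tree page per OS page and a file that starts out zero-filled.

```cpp
#include <sys/mman.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical page layout: a header with a slot array of offsets, then raw element bytes.
constexpr std::size_t kMaxSlots = 64;  // assumed limit, chosen only for the sketch

struct PageHeader {
    std::uint32_t count;               // number of used slots
    std::uint32_t free_off;            // offset of first free byte; 0 = fresh (zeroed) page
    std::uint32_t offsets[kMaxSlots];  // byte offset of each element from the page start
};

// Assumption: one tree page per OS page, so file offsets are always page-aligned.
inline std::size_t os_page_size() {
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// Map page number page_no of the file read/write; returns nullptr on failure.
inline void* map_page(int fd, std::uint64_t page_no, std::size_t page_bytes) {
    void* p = mmap(nullptr, page_bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                   static_cast<off_t>(page_no * page_bytes));
    return p == MAP_FAILED ? nullptr : p;
}

inline void unmap_page(void* page, std::size_t page_bytes) {
    munmap(page, page_bytes);
}

// Copy len bytes into the page and record their offset; returns the slot index or -1 if full.
inline int append_element(void* page, const void* data, std::uint32_t len,
                          std::size_t page_bytes) {
    auto* hdr = static_cast<PageHeader*>(page);
    if (hdr->free_off == 0) hdr->free_off = sizeof(PageHeader);
    if (hdr->count >= kMaxSlots || hdr->free_off + len > page_bytes) return -1;
    hdr->offsets[hdr->count] = hdr->free_off;
    std::memcpy(static_cast<char*>(page) + hdr->free_off, data, len);
    hdr->free_off += len;
    return static_cast<int>(hdr->count++);
}

// Resolve slot i to an address using only the page start address.
inline char* element_at(void* page, std::size_t i) {
    auto* hdr = static_cast<PageHeader*>(page);
    assert(i < hdr->count);
    return static_cast<char*>(page) + hdr->offsets[i];
}
```

Because every reference inside a page is an offset rather than a pointer, a page stays valid no matter which address mmap returns the next time it is mapped.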

I'm pretty sure it's not the solution you're looking for, but why don't you store the tree in a file yourself? All you need is an approach for serialization and an if/ofstream.

Basically, you could serialize it like this: go to the root and write '0' to your file, then a divider like '|', then the number of elements in the root, and then all the root elements. Repeat with '1' for level 1, and so on. As long as the level doesn't change, keep the same level index. Empty leaves could look like 2|0.
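A minimal sketch of that format, assuming integer keys and a hypothetical in-memory Node type (neither is specified above). It writes one line per node in breadth-first order, so all nodes of a level share the same leading index:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <queue>
#include <sstream>
#include <utility>
#include <vector>

// Hypothetical in-memory node; a real B+ tree would also carry values or record pointers.
struct Node {
    std::vector<int> keys;
    std::vector<std::unique_ptr<Node>> children;  // empty for leaves
};

// Write one line per node, breadth-first: "<level>|<count> <key> <key> ...".
inline void serialize(const Node& root, std::ostream& out) {
    std::queue<std::pair<const Node*, int>> pending;
    pending.push(std::make_pair(&root, 0));
    while (!pending.empty()) {
        const Node* node = pending.front().first;
        const int level = pending.front().second;
        pending.pop();
        // Assumed invariant: an internal node with n keys has n + 1 children.
        assert(node->children.empty() ||
               node->children.size() == node->keys.size() + 1);
        out << level << '|' << node->keys.size();
        for (std::size_t i = 0; i < node->keys.size(); ++i) out << ' ' << node->keys[i];
        out << '\n';
        for (std::size_t i = 0; i < node->children.size(); ++i)
            pending.push(std::make_pair(node->children[i].get(), level + 1));
    }
}
```

Pass a std::ofstream to write to disk, or a std::ostringstream to inspect the result. The format stores no parent links, so a reader would have to rebuild the tree from the breadth-first order and the key counts.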

You could look at Berkeley DB. It's supported by Oracle, but it is open source and can be found here.

Rogue Wave, the software company, has a nice implementation of BTreeOnDisk as part of their Tools++ product. I've been using it since the late '90s. The nice thing about it is that you can have multiple trees in one file. But you do need a commercial license.

In their code they reference a book by a guy called 'Ammeraal' (see http://home.planet.nl/~ammeraal/algds.html , Ammeraal, L. (1996) Algorithms and Data Structures in C++). He seems to have an implementation of a B-tree on disk, and the source code seems to be accessible online. I have never used it, though.

I currently work on projects for which I'd like to distribute the source code, so I need to find an open source replacement for the Rogue Wave classes. Unfortunately, I don't want to rely on GPL-type licenses; otherwise a solution would simply be to use 'libdb' or equivalent. I need a BSD-type license, and for a long time I couldn't find anything suitable. But I will have a look at some of the links in earlier posts.
