In what order should you insert a set of known keys into a B-Tree to get minimal height?

-上瘾入骨i 2020-12-23 17:31

Given a fixed number of keys or values (stored either in an array or in some data structure) and the order of the B-Tree, can we determine the sequence of inserting keys that would gene

5 answers

孤独总比滥情好 (original poster)

2020-12-23 18:15

So is there a particular way to determine sequence of insertion which would reduce space consumption?

Edit note: since the question was quite interesting, I have tried to improve my answer with a bit of Haskell.

Let k be the Knuth order of the B-Tree and let list be a list of keys.

The minimization of space consumption has a trivial solution:

-- not written in point-free style, to keep it readable for Haskell newcomers
trivial k list = concat $ reverse $ chunksOf (k-1) $ sort list

Such an algorithm is cheap to run, but the B-Tree it produces is slow to search: it is unbalanced on the left, although its space consumption is minimal.

Many non-trivial solutions exist that are more expensive to compute but give better lookup performance (lower height/depth). As you know, it's all about trade-offs!

A simple algorithm that minimizes both the B-Tree depth and the space consumption (but does not give the best lookup performance!) is the following:
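
As a quick check of what the trivial ordering looks like, here is a minimal self-contained sketch. It restates `trivial` and swaps in a local helper, `chunksOf'` (a hypothetical name, standing in for `chunksOf` from `Data.List.Split`), so that it needs nothing beyond `base`. It assumes k >= 2 so that the chunk size is positive.

```haskell
import Data.List (sort)

-- Local stand-in for Data.List.Split.chunksOf (assumption: n > 0,
-- otherwise the recursion would never consume the list).
chunksOf' :: Int -> [a] -> [[a]]
chunksOf' _ [] = []
chunksOf' n xs = take n xs : chunksOf' n (drop n xs)

-- Same definition as in the answer, using the local helper.
trivial :: Ord a => Int -> [a] -> [a]
trivial k list = concat $ reverse $ chunksOf' (k - 1) $ sort list

-- The sample keys used later in this answer, with Knuth order 3.
example :: [Int]
example = trivial 3 [21, 18, 16, 9, 12, 7, 6, 5, 1, 2]
-- example == [18, 21, 12, 16, 7, 9, 5, 6, 1, 2]
```

The sorted keys are cut into groups of k - 1 (one full node each) and the groups are emitted from largest to smallest, so every group is inserted to the left of everything already present.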

-- Sort the list in increasing order and call sortByBTreeSpaceConsumption 
-- with the result
smart k list = sortByBTreeSpaceConsumption k $ sort list

-- Sort list so that inserting in a B-Tree with Knuth order = k
-- will produce a B-Tree with minimal space consumption and minimal depth
-- (but not best performance)
sortByBTreeSpaceConsumption :: Ord a => Int -> [a] -> [a]
sortByBTreeSpaceConsumption _ [] = []
sortByBTreeSpaceConsumption k list
    | k - 1 >= numOfItems = list  -- this will be a leaf
    | otherwise = heads ++ tails ++ sortByBTreeSpaceConsumption k remainder
    where requiredLayers = minNumberOfLayersToArrange k list
          numOfItems = length list
          capacityOfInnerLayers = capacityOfBTree k $ requiredLayers - 1
          blockSize = capacityOfInnerLayers + 1 
          blocks = chunksOf blockSize balanced
          heads = map last blocks
          tails = concat $ map (sortByBTreeSpaceConsumption k . init) blocks
          balanced = take (numOfItems - (mod numOfItems blockSize)) list
          remainder = drop (numOfItems - (mod numOfItems blockSize)) list

-- Capacity of a layer n in a B-Tree with Knuth order = k
layerCapacity k 0 = k - 1
layerCapacity k n = k * layerCapacity k (n - 1)

-- Infinite list of capacities of layers in a B-Tree with Knuth order = k
capacitiesOfLayers k = map (layerCapacity k) [0..]

-- Capacity of a B-Tree with Knuth order = k and l layers
capacityOfBTree k l = sum $ take l $ capacitiesOfLayers k

-- Infinite list of capacities of B-Trees with Knuth order = k 
-- as the number of layers increases
capacitiesOfBTree k = map (capacityOfBTree k) [1..]

-- compute the minimum number of layers in a B-Tree of Knuth order k 
-- required to store the items in list
minNumberOfLayersToArrange k list = 1 + f k
    where numOfItems = length list
          f = length . takeWhile (< numOfItems) . capacitiesOfBTree
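
The capacity helpers above reduce to a geometric series, which may make the layer count easier to check by hand. The following is a worked restatement of those definitions only, with N standing for the number of keys (a symbol introduced here for illustration):

```latex
\begin{aligned}
\text{layerCapacity}(k, n) &= (k - 1)\,k^{n} \\
\text{capacityOfBTree}(k, l) &= \sum_{n=0}^{l-1} (k - 1)\,k^{n} = k^{l} - 1 \\
\text{minNumberOfLayersToArrange}(k, N) &= \min\{\, l \ge 1 : k^{l} - 1 \ge N \,\}
  = \lceil \log_k (N + 1) \rceil \qquad (N \ge 1)
\end{aligned}
```

For k = 3 the capacities are 2, 8, 26, ..., so 10 keys need 3 layers, while 8 keys still fit in 2.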

With this smart function, given list = [21, 18, 16, 9, 12, 7, 6, 5, 1, 2] and a B-Tree with Knuth order = 3, we should obtain [18, 5, 9, 1, 2, 6, 7, 12, 16, 21] with a resulting B-Tree like

              [18, 21]
             /
      [5 , 9]
     /   |   \
 [1,2] [6,7] [12, 16]

Obviously this is suboptimal from a performance point of view, but should be acceptable, since obtaining a better one (like the following) would be far more expensive (computationally and economically):

          [7 , 16]
         /   |   \
     [5,6] [9,12] [18, 21]
    /
[1,2]

If you want to run it, put the previous code in a Main.hs file and compile it with ghc after prepending the following (Data.List.Split is provided by the split package):

import Data.List (sort)
import Data.List.Split
import System.Environment (getArgs)

main = do
    args <- getArgs
    let knuthOrder = read $ head args
    let keys = (map read $ tail args) :: [Int]
    putStr "smart: "
    putStrLn $ show $ smart knuthOrder keys
    putStr "trivial: "
    putStrLn $ show $ trivial knuthOrder keys
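
To see what height and node count a given insertion order actually yields, one option is to simulate the insertions. The sketch below is not part of the answer's code: it assumes the textbook insertion procedure (insert into a leaf, split a node once it exceeds k - 1 keys, promote the median), distinct keys, and k >= 3. All names in it (`BTree`, `insertB`, `fromInsertions`, `height`, `nodeCount`, `inorder`) are hypothetical, and an implementation that splits differently may produce different shapes.

```haskell
import Data.List (foldl')

-- A B-Tree node: its keys plus its children (no children means a leaf).
data BTree a = Node [a] [BTree a]

-- Outcome of inserting into a subtree: it either still fits, or it
-- overflowed and was split around a median key that must move up.
data Result a = Fit (BTree a) | Split (BTree a) a (BTree a)

emptyTree :: BTree a
emptyTree = Node [] []

-- Insert x into a B-Tree of Knuth order k (at most k - 1 keys per node).
insertB :: Ord a => Int -> a -> BTree a -> BTree a
insertB k x t = case go t of
    Fit t'      -> t'
    Split l m r -> Node [m] [l, r]  -- root split: the tree grows one level
  where
    go (Node keys []) =
        let (smaller, larger) = span (< x) keys
        in check (Node (smaller ++ x : larger) [])
    go (Node keys children) =
        let i = length (takeWhile (< x) keys)  -- child to descend into
            (before, child : after) = splitAt i children
        in case go child of
             Fit c       -> Fit (Node keys (before ++ c : after))
             Split l m r ->
                 let (kb, ka) = splitAt i keys
                 in check (Node (kb ++ m : ka) (before ++ l : r : after))
    -- Split a node that holds more than k - 1 keys around its median.
    check n@(Node keys children)
        | length keys <= k - 1 = Fit n
        | otherwise =
            let mid = length keys `div` 2
                (lk, m : rk) = splitAt mid keys
                (lc, rc) = splitAt (mid + 1) children
            in Split (Node lk lc) m (Node rk rc)

-- Build a tree by inserting the keys in the given order.
fromInsertions :: Ord a => Int -> [a] -> BTree a
fromInsertions k = foldl' (\t x -> insertB k x t) emptyTree

-- Number of levels (all leaves of a B-Tree are at the same depth).
height :: BTree a -> Int
height (Node _ [])      = 1
height (Node _ (c : _)) = 1 + height c

-- Number of nodes, a rough proxy for space consumption.
nodeCount :: BTree a -> Int
nodeCount (Node _ cs) = 1 + sum (map nodeCount cs)

-- Keys in ascending order, useful as a sanity check.
inorder :: BTree a -> [a]
inorder (Node keys []) = keys
inorder (Node keys cs) =
    concat (zipWith (++) (map inorder cs) (map (: []) keys ++ [[]]))
```

For instance, `height (fromInsertions 3 keys)` and `nodeCount (fromInsertions 3 keys)` can be compared across the orderings returned by `smart` and `trivial` for the same keys.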
