Suppose you have a large file made up of a bunch of fixed size blocks. Each of these blocks contains some number of variable sized records. Each record must fit completely within a single block.
If there is no ordering to these records, I'd simply fill the blocks from the front with records extracted from the last block(s). This minimizes data movement, is fairly simple, and should do a decent job of packing data tightly.
E.g.:
// records should be sorted by size in memory (probably in a balanced BST)
records = read last N blocks on disk;

foreach (block in blocks) // iterate front to back, reading from disk into memory
{
    if (block.hasBeenReadFrom())
    {
        // we already read from this block into records, so all
        // remaining records are in memory; write them out and stop
        writeAllToNewBlocks(records);
        // this will leave some empty blocks on the disk that can either
        // be eliminated programmatically or left alone and filled during
        // normal operation
        foreach (record in records)
        {
            record.eraseFromOriginalLocation();
        }
        break;
    }
    moveRecords = new Array; // list of records we've moved into this block
    while (!block.full())
    {
        size = block.availableSpace();
        record = records.extractBestFit(size);
        if (record == null)
        {
            break; // nothing left in memory fits the remaining space
        }
        moveRecords.add(record);
        block.add(record);
        if (records.gettingLow())
        {
            records.readMoreFromDisk();
        }
    }
    if (moveRecords.size() > 0)
    {
        block.writeBackToDisk();
        foreach (record in moveRecords)
        {
            record.eraseFromOriginalLocation();
        }
    }
}
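To make the idea concrete, here is a minimal, runnable sketch in Python that models blocks as plain lists of record sizes. The block size, function names, and the two-tail-block choice are all illustrative assumptions; a sorted list plus `bisect` stands in for the size-sorted BST, and `clear()` stands in for erasing records from their original location.

```python
import bisect

BLOCK_SIZE = 16  # assumed fixed block capacity

def compact(blocks, n_tail):
    """Pack records from the last n_tail blocks into the earlier blocks."""
    # Sorted pool of record sizes pulled from the tail blocks
    # (plays the role of the size-sorted BST in the pseudocode).
    pool = sorted(r for b in blocks[-n_tail:] for r in b)
    for b in blocks[-n_tail:]:
        b.clear()  # "eraseFromOriginalLocation" for the whole tail
    for block in blocks[:-n_tail]:
        while pool:
            free = BLOCK_SIZE - sum(block)
            # Best fit: the largest pooled record no bigger than free.
            i = bisect.bisect_right(pool, free) - 1
            if i < 0:
                break  # nothing in the pool fits this block's free space
            block.append(pool.pop(i))
    # Records that found no home go back into a block with room
    # (the now-empty tail blocks), largest first.
    for r in sorted(pool, reverse=True):
        home = next((b for b in blocks if sum(b) + r <= BLOCK_SIZE), None)
        if home is None:
            home = []
            blocks.append(home)
        home.append(r)
    # Drop blocks that ended up empty.
    return [b for b in blocks if b]

blocks = [[10, 3], [8, 2], [7], [5, 4, 2], [6, 1]]
print(compact(blocks, 2))  # → [[10, 3, 2, 1], [8, 2, 6], [7, 5, 4]]
```

Here the five original blocks compact down to three, with the two tail blocks emptied exactly as the pseudocode intends; in a real implementation the reads and writes would of course go through the disk layer rather than list mutation.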
Update: I neglected to maintain the no-blocks-only-in-memory rule. I've updated the pseudocode to fix this. Also fixed a glitch in my loop condition.