Database Format

From FreeM Wiki
Revision as of 10:40, 29 March 2025 by Smw (talk | contribs)
Jump to navigation Jump to search

From David Wicksell

The first block in my global is a bottom level pointer. Maybe that's what a ROOT block always is? But there is no data in any other block, as it's a brand new global. Confusing.
Oh, I take that back. There is another block!
So with a global with even just one small record in it, there will be more than one block.
I suspect bottom pointer is called that because it's abstracted to be an upside down tree, and it's the root block for the global, which is the "bottom" of the global tree.
I still don't quite understand this format though.
That's block 0 of ^dlw. The 1, 3, and 1 on the top line appear to be pointers to another block, block 1.
[4:33 PM]
The end of that block has the metadata about the type of block it is, but the front of this block has at least one pointer to block number 1.
Yep, that is definitive now. Block 1, or the second block, contains my first key, the only key, in the global, so far.
[4:37 PM]
I will figure this out.
Ok, I see where key compression is done.
[4:46 PM]
I almost have the first block structure figured out.
[4:51 PM]
Ok, if the key you are going to insert isn't in the data block where it sorts to, then FreeM will lseek back in to the ROOT, and add the pointer first. So it has to traverse the path twice.
Ok, that's strange. The key format for canonical numbers is odd. It turns a subscript of (2) in to a 5 in the key.
[4:58 PM]
I believe it also turns a (0) in to a 3 and a (1) in to a 4. I guess it can't use ASCII 0, 1, or 2 for that. (edited)
[5:01 PM]
I can definitelly see ways to speed up globals in FreeM! For one thing, it recalculates the file path to the global over and over, even when it has the file descriptor stored already.
Ok, so I have the ROOT block, which is a bottom pointer block figured out, format wise.
[5:21 PM]
[5:24 PM]
Starting at the first byte of the block, the 1 is the length of the first key in the global. Then it skips a byte, and the 3 is the first key. Then, because the block is a pointer block, all keys are hard-coded with a 3 byte pointer to their data block. So the next three bytes, 00 00 01 are a pointer to the block containing key 3. Now key 3 is actually ^dlw(1), but that 1 is encoded in to a compact key form, and ends up as 3. I'm not 100% sure why yet. So that points to block 1. I'm inserting a new pointer for ^dlw(2) right now, so in a few minutes, this first block will have another key up on the top line, and I'll show that to you once it's there.
[5:25 PM]
FreeM also maintains a path through the blocks in a global, just like RSM does and just like GT.M/YottaDB do.
[5:28 PM]
In RSM, that path is recorded in an array of pointers to global buffer descriptors in shared memory, called blk, it is 12 pointers long. In FreeM, the path seems to be in 2 arrays of longs, the first traceblk stores the block numbers in the path, and the second, traceadr also stores numbers, but shorts instead of unsigned longs. So far, I haven't figured out what it stores semantically. Both hold 32 numbers, so that would support the idea that it can handle a B-tree of 32 levels, which is gigantic
dlw — Yesterday at 6:56 PM
Ok, before inserting the second key, here is what block 0, the ROOT, a BOTTOM pointer for ^dlw looks like:
"\001\000\003\000\000\001", '\000' <repeats 1010 times>, "\001\000\000\000\000\006\000\006"
You can see stuff at the beginning and the end. Starting at the beginning, is 001 then 000, which added together is 1 (because of key compression), the length of the first key. Then the key is 003 (which is the compact form of (1) I guess). Then, since it's a BOTTOM block, which is a pointer, and pointers are always 3 bytes, the next 3 bytes are 000 000 001, which when calculated is a pointer to block 1.

Then, looking at the end of the block, the third to last byte is 006, which is the block type, BOTTOM. Then the last two bytes are 000 006, which, when calculated is an offset in to the block itself, of 6, meaning that it's the beginning of the free space in the block, offset from the beginning of the block. Then, to the left of the 006 type, is 3 bytes of 000 000 000, which would be the right pointer in a data block (maybe a pointer block too), but in the ROOT block, it's the number of free blocks in the global, which is 0, since there are only 2 blocks right now, and neither are empty. Then, one byte to the left of that is 000, which is the collation for the block, and a 0 is just M collation. Then, the three bytes before that are 000 000 001, and if this were a data block (maybe a pointer block too), it would be the left pointer, but since this is the ROOT block, it's the number of total blocks in the global, in this case 1. Since there is the ROOT block and a data block, the total block count must not include the ROOT block, but is how many more blocks.

Next we'll look at the second block, a data block.
dlw — Yesterday at 7:04 PM
Ok, before inserting the second key, here is what block 1, the DATA block, for ^dlw looks like:
"\001\000\003\0011", '\000' <repeats 1016 times>, "\b\000\005"

You can see stuff at the beginning and the end. Starting at the beginning, is 001 then 000, which added together is 1 (because of key compression), the length of the first key. Then the key is 003 (which is the compact form of (1) I guess). So far, identical to the ROOT, but once more keys are there, enough to need more blocks, then that will likely change. Then, since it's a DATA block, which is not a pointer, the next byte is the length of the data record at the key, which is a length of 1. Then, the next byte is 1 (ASCII 49, since the compact form is only in keys, and this is data), the data at ^dlw(1).

Then, looking at the end of the block, the third to last byte is \b (which is 008), which is the block type, DATA. Then the lsat two bytes are 000 005, which, when calculated is an offset in to the block itself, of 5, meaning that it's the beginning of the free space in the block, offset from the beginning of the block. Then, to the left of the 008 type, is 3 bytes of 000 000 000, which would be the number of free blocks in the ROOT, but since this is DATA, is the right pointer. There are no blocks to the right, so it's 0. The, one byte to the left of that is 000, which in a DATA block is a reserved byte, and always 0. You could do something with that in the future. Then, the three bytes before that are 000 000 000, which would be the number of total blocks in the ROOT, but since this is DATA, is the left pointer. There are not blocks to the left, so it's 0.
[7:06 PM]
So that's what I've worked out so far. One observation, currently a key length is stored in two bytes, a byte for the length, and one for how much is compressed. So a key in a global in FreeM is restricted to 255 bytes without changing the format. The more important observatoin is that the length of a data record, or node, is stored in a single byte, also limiting it to 255 bytes without changing the format. Since I know you want to raise your string limits, you'll have to look at that first. I can help you figure out where to go and what to do if you like.

dlw — Yesterday at 7:11 PM
By the way, I'm doing this for two reasons. First, I love this sort of code, so I'm enjoying it. Second, and more importantly, I know you hate it, so I'm trying to help you out, because I want you to succeed at increasing the string length in FreeM.

dlw — Yesterday at 7:55 PM
Ok, I understand the FreeM block structure much better now.
Ok, some weird things happen, they may or may not matter. I'll have to look in to that. But no block joins happen at all. Rather, most of a block that has all of its keys removed stays there empty.
[12:37 AM]
I had a global with 1 pointer block and 3 data blocks, and I killed all the records in the first and third data block, and for the most part, they are just sitting there all empty, while the second block is still full. The weirdness is that they both have one record still in them, and I'm not sure why. It's the last key, and it appears to have been moved to the front of the block. So maybe a FOR loop in FreeM is not inclusive in its END condtion, but exclusive. Or maybe something else is going on. Now I'm going to remove keys from the middle of the second block to see what happens.
[12:41 AM]
Blocks in FreeM do not contain their block number.
[12:41 AM]
I guess I just assumed they would.
Ok, so that weirdness seems to just be an artifact, and I know why.

dlw — 1:06 AM
I was removing records in a loop, and when it got to the last record in a block, it didn't bother to remove it, instead it left it there, and just marked the block as empty. So it definitely does not do joins, it will remove records, and each time it does, it will re-key the block so that the keys after the one that was removed, are moved up to fill in the empty spot. Once it gets to that last record, it just marks the block free and leaves the record there, as the first record in the block.

One thing I don't quite understand is that I created records to fill 3 data blocks, so there was 1 bottom pointer and 3 data blocks, but after all of that, what was left was block 0 was the bottom pointer, it has one pointer to block 2. Block 1 is marked empty and has one record in it, the original last record, as according to my last paragraph. Block 2 which I removed like 50 records from the middle of the block, and the records after it were moved up to fill the hole. Block 3 is marked empty and has one record in it at the beginning, the original last record, as according to my last paragraph. It also has a bunch of records at the end, but it's marked empty, so that seems fine. But why did leave them there, and why don't they match what I'd expect? They have records that were originally in block 3, and some of them are still there in block 3. So that part is a mystery, based on the code I ran, and the state of the blocks at each point. Block 5 is really weird. It was never used for data. But it has a few bits of things in it, and is marked as something I have no clue about, an FBLK. So not an EMPTY, but an FBLK. It is used though, as it has an offset and something before that offset and has that FBLK marker. So I'll have to figure out what an FBLK is and what it's used for. It's the one block type I don't know yet.
[1:07 AM]
Long story short, FreeM will always fill in holes during record deletion, by moving later records up to fill them, and it does not do block joins, it will just mark the block as empty. Both are things RSM does not do.
[1:10 AM]
Ohhhh!
[1:10 AM]
I know what it is!
[1:10 AM]
Clever!
[1:12 AM]
It's the free block list! So once FreeM has freed a block that had data in it, it creates a new block called the free block (FBLK), which is just a list of pointers to free blocks in order. So in my testing data, it just has a pointer to block 1 and block 3, which I told you above are both free. Does it always maintain that block at the end of the global somehow, or does it have another way of knowing where to find it? I'll have to figure that out.

dlw — 1:15 AM
Ooo, that's how!
[1:17 AM]
Remember earlier I told you that the ROOT block, always a BOTTOM pointer, didn't have a left and right pointer, since it has no siblings, so instead it uses those fields to store the number of blocks other than the ROOT that are in the global, and the number of free blocks? Yeah, it's not the number of free blocks, it's the block number that stores the FBLK, or free block list! So it can always find that block when it needs a new block, before it would add a new block to the global. So instead of a map block using a bitmap, like RSM uses, it has a pointer from the ROOT block to a FBLK, which maintains the list of free blocks!
[1:18 AM]
So there you go, other than a few details on internal key formats, I have taught you pretty much the entire globals module tonight! I have not dug in to block splits, but I'll do that soon, and let you know any details.
[1:21 AM]
So, remember that because all of the block pointers are 3 bytes, that means a global in FreeM can be at most, 16 MiB blocks, and at 1 KiB per block, that means no larger than 16 GiB.
[1:22 AM]
Just an FYI or a reminder.