Question
I am trying to understand BSON via http://bsonspec.org/#/specification, but some questions still remain.
Let's take an example from the website above:
{"hello": "world"} → "\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
Question 1
In the above example, the double quotes around the encoded bytes are not actually part of the result, right?
Question 2
I understand that the first 4 bytes, \x16\x00\x00\x00, are the size of the whole BSON doc, and that the size is in little-endian format. But why? Why not big endian?
Question 3
How does the size of the example doc come out to \x16, i.e. 22?
Question 4
Normally, if I want to encode the doc myself, how do I calculate its size? I think my main trouble is deciding the size of a UTF-8 string.
Let's take another example:
{"BSON": ["awesome", 5.05, 1986]}
→
"\x31\x00\x00\x00\x04BSON\x00\x26\x00\x00\x00\x020\x00\x08\x00\x00\x00awesome\x00\x011\x00\x33\x33\x33\x33\x33\x33\x14\x40\x102\x00\xc2\x07\x00\x00\x00\x00"
Question 5
In this example, there is an array. According to the specification, an array is actually a list of {key, value} pairs, where the keys are 0, 1, etc. My question is: the 0 and 1 here are strings too, right?
Answer 1:
Question 1
In the above example, the double quotes around the encoded bytes are not actually part of the result, right?
The quotes are not part of the strings; they only mark where a string starts and ends, as in JSON.
Question 2
And it is little endian format. But why? Why not take big endian?
Choice of endianness is largely a matter of preference. One advantage of little endian is that commonly used platforms are little endian, and thus don't need to reverse the bytes.
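To make the byte order concrete, here is a minimal Python sketch using the standard struct module. It only illustrates how the int32 length prefix is laid out; the variable names are mine, not from the spec.

```python
import struct

# "<i" means little-endian signed 32-bit: least significant byte first.
little = struct.pack("<i", 22)
# ">i" is the big-endian layout BSON does NOT use, shown for comparison.
big = struct.pack(">i", 22)

print(little)  # b'\x16\x00\x00\x00'
print(big)     # b'\x00\x00\x00\x16'

# Reading the length prefix back from the example document.
doc = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
(size,) = struct.unpack_from("<i", doc, 0)
print(size)  # 22
```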
Question 3
How does the size of the example doc come out to \x16, i.e. 22?
There are 22 bytes in total, counting the 4-byte length prefix itself and the trailing \x00 that ends the document.
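The count can be checked by splitting the example into its pieces. The sketch below is just arithmetic over the bytes quoted in the question; the labels are my own reading of the spec's grammar.

```python
# Pieces of the encoded {"hello": "world"} document, in order.
parts = [
    ("document length (int32)", b"\x16\x00\x00\x00"),
    ("element type 0x02 (string)", b"\x02"),
    ("key as cstring", b"hello\x00"),
    ("string length (int32)", b"\x06\x00\x00\x00"),
    ("string bytes plus trailing NUL", b"world\x00"),
    ("document terminator", b"\x00"),
]

for label, chunk in parts:
    print(f"{len(chunk):2d}  {label}")

total = sum(len(chunk) for _, chunk in parts)
print(total)  # 4 + 1 + 6 + 4 + 6 + 1 = 22
```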
Question 4
Normally, if I want to encode the doc myself, how do I calculate its size? I think my main trouble is deciding the size of a UTF-8 string.
First write out the document, and then go back to fill in the length. For a UTF-8 string, the length is the number of encoded bytes (not the number of characters) plus one for the trailing \x00.
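A minimal sketch of "write the body first, then fill in the length" in Python, assuming a flat document whose values are all strings. The helper names encode_cstring, encode_string_element and encode_document are hypothetical, not part of any BSON library.

```python
import struct

def encode_cstring(text: str) -> bytes:
    # cstring: UTF-8 bytes followed by a NUL, with no length prefix.
    return text.encode("utf-8") + b"\x00"

def encode_string_element(key: str, value: str) -> bytes:
    # string: int32 byte count (including the trailing NUL), then the bytes.
    data = value.encode("utf-8") + b"\x00"
    return b"\x02" + encode_cstring(key) + struct.pack("<i", len(data)) + data

def encode_document(doc: dict) -> bytes:
    # Build the body first; the total size is only known afterwards.
    body = b"".join(encode_string_element(k, v) for k, v in doc.items())
    total = 4 + len(body) + 1  # length prefix + body + terminator
    return struct.pack("<i", total) + body + b"\x00"

encoded = encode_document({"hello": "world"})
print(encoded)

# Sizes are counted in bytes, so a non-ASCII character may take more than one.
print(len("héllo"), len("héllo".encode("utf-8")))  # 5 6
```

Building the body before the prefix avoids having to predict the size up front; a streaming encoder could instead reserve 4 bytes and patch them once the document is complete.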
Question 5
In this example, there is an array. According to the specification, an array is actually a list of {key, value} pairs, where the keys are 0, 1, etc. My question is: the 0 and 1 here are strings too, right?
Yes. Zero-terminated strings without a length prefix, to be exact (called cstring in the spec). Apart from the keys, an array is encoded just like an embedded document.
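To illustrate, here is a small Python sketch that encodes the array example by turning the list into a document keyed by "0", "1", "2". It assumes only the four value types in the example (string, double, int32, array) and that integers fit in 32 bits; the function names are hypothetical.

```python
import struct

def cstring(text: str) -> bytes:
    return text.encode("utf-8") + b"\x00"

def element(key: str, value) -> bytes:
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        return b"\x02" + cstring(key) + struct.pack("<i", len(data)) + data
    if isinstance(value, float):
        return b"\x01" + cstring(key) + struct.pack("<d", value)
    if isinstance(value, int):
        # Assumption: the value fits in a signed 32-bit integer.
        return b"\x10" + cstring(key) + struct.pack("<i", value)
    if isinstance(value, list):
        # An array is a document whose keys are the indexes as strings.
        as_doc = {str(i): item for i, item in enumerate(value)}
        return b"\x04" + cstring(key) + document(as_doc)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")

def document(doc: dict) -> bytes:
    body = b"".join(element(k, v) for k, v in doc.items())
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

encoded = document({"BSON": ["awesome", 5.05, 1986]})
print(len(encoded))  # 49, i.e. \x31
```

The inner array comes out as 38 bytes (\x26), matching the second length prefix in the example.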
Source: https://stackoverflow.com/questions/16169879/can-i-get-more-explanations-for-bson