What is the storage capacity of a Mnesia database?

隐瞒了意图╮ 2020-12-07 15:56

Some places state 2 GB, period. Others state that it depends on the number of nodes.

3 answers
  •  一个人的身影
    2020-12-07 16:28

    TL;DR: the storage capacity of a Mnesia database is limited only* by available RAM.

    * Assuming you use table types ram_copies or disc_copies. Also, if you store a lot of data in a disc_copies table, it needs to be read from disk at startup, which might increase startup time beyond what's acceptable.
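
    To check how much of that RAM a given table currently occupies, a minimal sketch along these lines may help. The module name mnesia_size and the function table_bytes/1 are made up for this illustration; mnesia:table_info/2 and erlang:system_info/1 are standard calls. It assumes Mnesia is already running and the table exists. According to the Mnesia reference manual, the memory item is reported in words for ram_copies and disc_copies tables and in bytes for disc_only_copies tables.

```erlang
-module(mnesia_size).
-export([table_bytes/1]).

%% Hypothetical helper: approximate size of a Mnesia table in bytes.
%% Assumes Mnesia is started and Tab exists on this node.
table_bytes(Tab) ->
    Size = mnesia:table_info(Tab, memory),
    case mnesia:table_info(Tab, storage_type) of
        %% Dets-backed tables: 'memory' is already a byte count (on disk).
        disc_only_copies ->
            Size;
        %% ETS-backed tables: 'memory' is a word count, so multiply by
        %% the VM word size (8 bytes on a 64-bit VM).
        Type when Type =:= ram_copies; Type =:= disc_copies ->
            Size * erlang:system_info(wordsize)
    end.
```

    Calling mnesia_size:table_bytes(my_table) from the shell (my_table being a placeholder name) and comparing the result with free RAM gives a rough idea of the headroom left.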


    This answer contradicts the two existing answers when it comes to tables of type disc_copies. Let me first get a few general points out of the way:

    • A mnesia table of type ram_copies is limited only by available RAM (on a 32-bit VM, the address space caps this at a theoretical 4 GB). Data is stored in an ETS table.
    • A mnesia table of type disc_only_copies is stored in a Dets table. Dets tables are limited to 2 GB, because of limits in the file format.
    • The obvious way to circumvent that limit is to create more tables, possibly through table fragmentation.
    • The schema is also stored in a Dets table, so the information describing all existing tables is also limited to 2 GB. You are likely to run into other limits before you hit that one, though.
    • A mnesia table of type disc_copies is stored both in RAM and on disk, so it is limited by available RAM - and perhaps something else?
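
    The storage types listed above are chosen per table at creation time. The following is a minimal sketch, not a definitive setup: the module name, the kv record, and the table names are hypothetical, and it assumes a single node with a writable Mnesia directory. It creates one table of each type, and separately shows a fragmented disc_only_copies table, where each fragment is its own Mnesia table and therefore its own Dets file.

```erlang
-module(mnesia_storage_demo).
-export([run/0, create_fragmented/2]).

%% Hypothetical record used only for this illustration.
-record(kv, {key, value}).

run() ->
    %% disc_copies and disc_only_copies need a disk-resident schema, which
    %% must be created before Mnesia starts. An {error, _} result (for
    %% example, schema already exists) is deliberately ignored here.
    _ = mnesia:create_schema([node()]),
    ok = mnesia:start(),
    Types = [{kv_ram, ram_copies},
             {kv_disc, disc_copies},
             {kv_disc_only, disc_only_copies}],
    lists:foreach(fun({Tab, Type}) -> ok = create(Tab, Type) end, Types),
    ok = mnesia:wait_for_tables([Tab || {Tab, _} <- Types], 5000),
    %% Report the storage type Mnesia actually uses for each table.
    [{Tab, mnesia:table_info(Tab, storage_type)} || {Tab, _} <- Types].

%% Creates Tab with the given storage type on the local node.
%% Any abort reason other than already_exists crashes with case_clause.
create(Tab, Type) ->
    Opts = [{Type, [node()]},
            {record_name, kv},
            {attributes, record_info(fields, kv)}],
    case mnesia:create_table(Tab, Opts) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, Tab}} -> ok
    end.

%% Spreads a disc_only_copies table over N fragments on the local node.
%% Assumes the disk schema from run/0 (or equivalent) is already in place.
create_fragmented(Tab, N) ->
    FragProps = [{node_pool, [node()]},
                 {n_fragments, N},
                 {n_disc_only_copies, 1}],
    mnesia:create_table(Tab, [{frag_properties, FragProps},
                              {attributes, [key, value]}]).
```

    Note that a fragmented table has to be accessed through the mnesia_frag access module, for example mnesia:activity(transaction, Fun, [], mnesia_frag), so that keys are routed to the right fragment.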

    I'm going to try to show below that there is no specific limit imposed by Mnesia on the size of a disc_copies table. Note however that many Erlang programmers believe that disc_copies tables are limited to 2 GB. That is stated in the accepted answer to this question, which at the time of writing outscores this answer by a factor of 7.


    disc_copies moved from dets to disk_log in 2001

    It is commonly believed that disc_copies tables are backed by Dets tables. As far as I can tell, this was the case until Erlang/OTP R7B-4 (released on 30th September 2001). From the README:

      -- mnesia -----------------------------------------------------------------
    
            OTP-3712 - Speed/load improvements disc_copies tables are not 
                       implemented with dets anymore.
    

    Look at the diff for more details, in particular mnesia_lib.erl and mnesia_loader.erl.


    Sources supporting dets and a 2 / 4 GB limit

    archelaus's answer draws from http://erlang.org/~hakan/mnesia_consumption.txt, which explains that disc_copies tables reside in ets and dets tables. However, looking at the index for the directory, we see that this document is dated 1999:

    [TXT] mnesia_consumption.txt  26-Oct-1999 10:57    10k  
    

    It makes sense that it would say this, as it was written two years before the change.

    Ray Boosen's answer draws from the Erlang FAQ:

    11.5 How much data can be stored in Mnesia?

    Dets uses 32 bit integers for file offsets, so the largest possible mnesia table (for now) is 4Gb.

    In practice your machine will slow to a crawl way before you reach this limit.

    The FAQ has been saying that since at least January 2001 (see the earliest copy in the Wayback Machine). That means that this FAQ entry dates from before the switch to disk_log, and hasn't been updated for a long time. (Anyway, the Dets table size limit is 2 GB, not 4 GB.) I submitted a pull request for the FAQ.


    Sources supporting higher limits

    The Learn You Some Erlang chapter on Mnesia says:

    ram_copies
    This option makes it so all data is stored exclusively in ETS, so memory only. Memory should be limited to a theoretical 4GB (and practically around 3GB) for virtual machines compiled on 32 bits, but this limit is pushed further away on 64 bits virtual machines, assuming there is more than 4GB of memory available.

    disc_only_copies
    This option means that the data is stored only in DETS. Disc only, and as such the storage is limited to DETS' 2GB limit.

    disc_copies
    This option means that the data is stored both in ETS and on disk, so both memory and the hard disk. disc_copies tables are not limited by DETS limits, as Mnesia uses a complex system of transaction logs and checkpoints that allow to create a disk-based backup of the table in memory.

    I'm not sure when this was written, but the text above exists in the earliest Wayback Machine copy, dated April 2012.

    In a post on erlang-questions titled "beating mnesia to death (was RE: Using 4Gb of ram with Erlang VM)", dated 7th November 2005, Ulf Wiger writes:

    On a 16 GB machine, you can:

    • run 6 million simultaneous processes (through use of erlang:hibernate, I was actually able to run 20 million - spawn time: 6.3 us, message passing time: 5.3 us, and I had 1.8 GB to spare.)

    • populate mnesia with at least 12 GB of data, but think through how you want to represent it, since the 64-bit word size blows things up a bit.

    • keep a 10 GB+ disc_copy table in mnesia. The load times and log dump cost seem acceptable (10 minutes to load, dumping takes a while but runs in the background quite nicely.)

    Conclusions

    The confusion seems to stem from missing or outdated information in official sources:

    • The Mnesia documentation doesn't mention any table size limits
    • The Erlang FAQ says that Mnesia is subject to a 4 GB Dets size limit, but that FAQ entry was written before the dets to disk_log change
    • The only other relevant document on the erlang.org domain is Håkan Mattsson's mnesia_consumption.txt, which also dates from before the dets to disk_log change

    LYSE seems to be the first "authoritative" source that mentions disc_copies tables not being subject to the Dets table size limit.
