Is it “bad” to store XML in a database?

后端 未结 7 1571
感动是毒
感动是毒 2021-02-02 11:06

I have heard from several sources that storing XML in a database is \"bad\", but I have never seen/heard an actual explanation of why that is. Is it true? If it is true, can y

7条回答
  •  暗喜
    暗喜 (楼主)
    2021-02-02 12:11

    There are some really stupid answers here - just because a database supports a data type does not mean you should be using it. These things are invariably added in as features because the competition have them, not because they are the right thing to do. Global variables? Triggers? Would anyone like to defend them too just because you can use them and they're there?

    If you have multiple attributes, the best way to handle them in a relational database is with a one to many relationship. Parse out your useful data from the XML overhead. You then just store the ID (primary key) of the parent record with each of the rows stored in a second table, one row per attribute. You can have any number of attributes per parent record. It's database design 101, nothing clever. Storing it as unstructured XML just to store a variable number of attributes is not the way to go, it's a sledgehammer to crack a peanut. A one to many relationship between two tables is simpler, easier to understand, much faster to query, much less effort coding, and less storage (which means faster queries). Everyone wins, apart from the storage vendors.

    XML is a data transfer protocol; as GolezTrol rightly said, "It is a way to export (and import) data" - i.e.: it is simply an overhead used to facilitate the communication of the structure of the data between different systems. Once received, the tags should be stripped out and the data (and only the data) stored in your database engine of choice, whatever that might be. Not the XML itself. The overhead for XML is ~10x that of the data it's describing. Want to tell your boss why that 100GB of data is occupying 1TB of space on your hyper expensive SAN? Or taking all night to back up over a saturated network link? Or causing performance problems in production? If you don't parse out the data from the now pointless tags, you will just push the problem and ongoing, daily support costs onto operational support for the next ten years. Sloppy, sloppy, sloppy. This keeps vendors like EMC in business.

    XML is metadata. Nothing clever, just a schema descriptor. Once it's transferred and parsed it's lost its usefulness and is just clutter that clogs up whatever database you use. Get rid of it, unless you're compulsively addicted to hording yesterday's pointless crappy description metadata, stored many times over. Wake up. It's typical "Emperor's New Clothes" syndrome, stopped being conned by something simple and disposable. It's only metadata and it should not be stored or worshipped, it's junk once it's parsed. And what's better? To parse it once, or to uselessly parse it every time you need data from it? The answer's pretty darned obvious to me.

提交回复
热议问题