What is the benefit of having varbinary field in a separate 1-1 table?

ε祈祈猫儿з 提交于 2019-11-30 16:35:53

There is no performance nor operational advantage. Since SQL 2005 the LOB types are already stored for you by the engine in a separate allocation unit, a separate b-tree. If you study the Table and Index Organization of SQL Server you'll see that every partition has up to 3 allocation units: data, LOB and row-overflow:

A LOB field (varchar(max), nvarchar(max), varbinary(max), XML, CLR UDTs as well as the deprecated types text, ntext and image) will have in the data record itself, in the clustered index, only a very small footprint: a pointer into the LOB allocation unit, see Anatomy of a Record.

By storing a LOB explicitly in a separate table you gain absolutely nothing. You just add unneeded complexity as former atomic updates have to distribute themselves now into two separate tables, complicating the application and the application transaction structure.

If the LOB content is an entire file then perhaps you should consider upgrade to SQL 2008 and using FILESTREAM.

There is no real logical advantage to this two-tables design, since the relationship is 1-1, you might have all the info bundled in the FileInfo table. However, there are serious operational and performance advantages, in particular if your binary data is more than a few hundred bytes in size, on average.

EDIT: As pointed out by Remus Rusanu, on some DBMS implementations such as SQL2005, the large object types are transparently stored to a separate table, effectively alleviating the practical drawback of having big records. The introduction of this feature implicitly confirms the the [true] single table approach's weakness.

I merely scanned the SO posting referenced in this question. I generally thing that while that other posting makes a few valid points, such as intrinsic data integrity (since all CRUD actions on a given item are atomic), but on the whole, and unless of relatively atypical use cases (such as using the item table as a repository mostly queried for single items at a time), the performance advantage is with the two tables approach (whereby indexes on "header" table will be more efficient, queries that do not require the binary data will return much more quickly etc. etc.)

And the two tables approach has further benefits in case the design evolves to supply different types of binary objects in differnt context. For example, say these items are images (GIFs, JPGs etc.). At a later date you want to also provide a small preview version of these images (and/or a hi-resolution version), the choice of this being driven by the context (user preference, low band-width clients, subscriber vs. visitor etc.). In such a case not only are the operational issues associated with the single table approach made more acute, the model becomes more versatile.

It can help to separate IMAGE, (N)TEXT, (N)VARCHAR(max) and VARBINARY(max) columns out of wider tables purely for some restrictions of SQL Server.

For example before 2012 it was not possible to online rebuild a clustered table if it contained LOBs. On the other hand you might not care about those restrictions, so setting up the table like your data is related is the better thing to do.

In case you physically want to keep the LOB data out of the table allocation unit you still can set the "large value types out of row" table option.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!