I'm a bit old school when it comes to database design, so I'm totally for using the correct data sizes in columns. However, when reviewing a database for a friend, I noticed that VARCHAR(MAX) had been used for nearly every text column.
The difference is this:

VARCHAR(X) can be indexed and is stored in-row in the MDF/NDF data file.

VARCHAR(MAX) can't be used as an index key column, because it can reach a very large size; values over 8,000 bytes are then stored off-row on separate LOB pages rather than in the row itself.
They should NOT be used unless you expect large amounts of data, and here is the reason why (directly from Books Online):

"Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index."
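For instance, here is a minimal sketch of that restriction in action (the table and column names are hypothetical): the (MAX) column is rejected as an index key but accepted in the INCLUDE list.

    CREATE TABLE dbo.Documents
    (
        Id    INT IDENTITY PRIMARY KEY,
        Title VARCHAR(200) NOT NULL,
        Body  VARCHAR(MAX) NULL
    );

    -- Fails with Msg 1919: a LOB type is invalid for use as a key column
    CREATE INDEX IX_Documents_Body ON dbo.Documents (Body);

    -- Works: the LOB column rides along as included, non-key data
    CREATE INDEX IX_Documents_Title ON dbo.Documents (Title) INCLUDE (Body);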
If you want to cripple performance, use nvarchar for everything.
My answer to this isn't about the usage of MAX so much as it is about the reasons for VARCHAR(MAX) vs. TEXT.
In my book, first of all: unless you can be absolutely certain that you'll never encode anything but English text and that people won't refer to the names of foreign locations, you should use NVARCHAR or NTEXT.
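A minimal sketch of why (the variable names are made up): under a typical Latin collation, VARCHAR silently best-fits characters it cannot represent, while NVARCHAR keeps them intact.

    DECLARE @city_v  VARCHAR(50)  = N'Łódź';
    DECLARE @city_nv NVARCHAR(50) = N'Łódź';
    -- @city_v typically comes back as 'Lódz' (lossy best-fit mapping);
    -- @city_nv comes back as 'Łódź', unchanged
    SELECT @city_v AS VarcharValue, @city_nv AS NvarcharValue;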
Secondly, it comes down to what the fields allow you to do.
TEXT is hard to update in comparison to VARCHAR, but you get the advantage of Full Text Indexing and lots of clever things.
On the other hand, VARCHAR(MAX) has some ambiguity: if the value in a cell is under 8,000 characters, it is treated as row data; if it is larger, it is treated as a LOB for storage purposes. Because you can't know which without inspecting the data row by row (RBAR), this can have implications for optimization strategies in places where you need to be sure about your data and how many reads it costs.
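One way to check is DATALENGTH. A minimal sketch, assuming a hypothetical dbo.Articles table with a VARCHAR(MAX) Body column (whether a value is actually pushed off-row also depends on total row size and table options, so treat the threshold as an approximation):

    SELECT Id,
           DATALENGTH(Body) AS BodyBytes,
           CASE WHEN DATALENGTH(Body) > 8000
                THEN 'off-row (LOB pages)'
                ELSE 'in-row (most likely)'
           END AS LikelyStorage
    FROM dbo.Articles;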
Otherwise, if your usage is relatively mundane and you don't expect problems with the size of the data (i.e., you're using .NET and therefore don't have to be concerned about the size of your string/char* objects), then using VARCHAR(MAX) is fine.
I don't know how SQL Server handles large (declared) VARCHAR fields from a performance, memory, and storage perspective, but assuming it does so as efficiently as smaller declared VARCHAR fields, there's still the benefit of integrity constraints.
The application sitting on the database is supposed to limit the input, but the database can properly report an error if the application has a bug in this respect.
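A minimal sketch of that safety net (the table and the size limit are hypothetical): the declared length itself acts as an integrity constraint that the engine enforces no matter who the caller is.

    CREATE TABLE dbo.Customers
    (
        Id    INT IDENTITY PRIMARY KEY,
        Email VARCHAR(254) NOT NULL  -- a practical upper bound for an address
    );

    -- A buggy caller sending an oversized value gets a hard error
    -- ("String or binary data would be truncated") instead of silent bloat
    INSERT INTO dbo.Customers (Email)
    VALUES (REPLICATE('x', 300));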
It is somewhat old-fashioned to believe that the application will only ever pass short strings to the database and that this makes everything okay.
In modern times, you HAVE to anticipate that while the database will be accessed primarily by the current application, there may be a future version of it (will the developer of that version know to keep strings below a certain length?).
You MUST anticipate that web services, ETL processes, LINQ to SQL, and any number of other technologies, already existing or not yet existing, will be used to access your database.
Generally speaking, I try not to go over varchar(4000), because it's four thousand characters, after all. If I exceed that, then I look to other data types to store whatever it is I'm trying to store. Brent Ozar has written some pretty great stuff on this.
All that said, it is important to evaluate the current design's approach to your current requirements when you are working on a project. Have an idea of how the various parts work, understand the trade-offs of the various approaches, and solve the problem at hand. Reciting some grand axiom can lead to blind adherence, which might turn you into a lemming.
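As part of that evaluation, a quick metadata query shows where the (MAX) declarations actually live. A minimal sketch using only the standard catalog views (max_length = -1 is how SQL Server marks the MAX types):

    SELECT t.name  AS TableName,
           c.name  AS ColumnName,
           ty.name AS TypeName
    FROM sys.columns c
    JOIN sys.tables  t  ON t.object_id     = c.object_id
    JOIN sys.types   ty ON ty.user_type_id = c.user_type_id
    WHERE c.max_length = -1;  -- varchar(max), nvarchar(max), varbinary(max), xml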
Redgate wrote a great article about this.
https://www.red-gate.com/simple-talk/sql/database-administration/whats-the-point-of-using-varcharn-anymore/
Conclusions