I do not know of any good learning resource about how to design XML document models (schemas are just a formal way of specifying document models).
In my opinion, one crucial insight into XML is that it is not a language: it is a syntax. And each document model is a separate language.
Different cultures will each use XML in their own special way. Even within W3C specifications you can smell Lisp in the dash-separated-names of XSLT, and Java in the camelCaseNames of XML Schema. Similarly, different application domains will call for different XML idioms.
Narrative document models such as HTML or DocBook tend to put printable text in text nodes and metadata in element names and attributes.
More object-oriented document models such as SVG make little or no use of text nodes and instead only use elements and attributes.
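To make the contrast concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The two fragments are made-up, simplified stand-ins for HTML and SVG (no namespaces, not valid documents in either vocabulary), so treat them as illustrations only.

```python
import xml.etree.ElementTree as ET

# Narrative style (HTML-like, simplified): the printable text lives in
# text nodes, and the markup wrapped around it carries the metadata.
narrative = ET.fromstring(
    '<p class="note">Press <kbd>Ctrl</kbd> to <em>continue</em>.</p>'
)

# Object-oriented style (SVG-like, simplified, namespace omitted):
# no printable text at all, only elements and attributes.
drawing = ET.fromstring(
    '<svg><circle cx="10" cy="20" r="5"/>'
    '<rect x="0" y="0" width="4" height="3"/></svg>'
)

# Concatenating the text nodes recovers the readable sentence...
narrative_text = "".join(narrative.itertext())

# ...whereas the drawing has no text nodes to speak of.
drawing_text = "".join(drawing.itertext()).strip()

# All of the drawing's data sits in attributes.
circle_attrs = dict(drawing.find("circle").attrib)

print(repr(narrative_text))
print(repr(drawing_text))
print(circle_attrs)
```

In the first fragment, stripping the tags still leaves a readable sentence. In the second, stripping the tags leaves nothing, because the elements and attributes are the data.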
My personal rules of thumb for document model design go something like this:
- If it is the sort of free-form tag soup that requires mixed content, use HTML and DocBook as sources of inspiration. The other rules are only relevant otherwise.
- If a value is going to be composite or hierarchical, use elements. XML data should require no further parsing, except for established idioms such as IDREFS, which are simple space-separated sequences.
- If a value may need to occur more than once, use elements.
- If a value may need to be refined further, or enriched later, use elements.
- If a value is clearly atomic (boolean, number, date, identifier, simple label), and may occur at most once, then use an attribute.
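As a rough illustration of these rules, here is a sketch in Python with the standard `xml.etree.ElementTree`. The `order` vocabulary, its element and attribute names, and all values are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document following the rules of thumb:
# - id, date, paid: atomic and at most once, so attributes
# - address: composite, so elements
# - item: may occur more than once, so elements
order = ET.fromstring("""
<order id="o-1001" date="2024-05-17" paid="true">
  <customer ref="c-42">
    <address>
      <street>1 Main St</street>
      <city>Springfield</city>
    </address>
  </customer>
  <item sku="A-1" quantity="2"/>
  <item sku="B-7" quantity="1"/>
</order>
""")

# Every value is reachable through the XML API alone.
skus = [item.get("sku") for item in order.findall("item")]
city = order.findtext("customer/address/city")
order_date = order.get("date")

# Counter-example: a composite, repeating value packed into one attribute.
# The consumer now needs a second, ad hoc parser for the "items" string.
packed = ET.fromstring('<order id="o-1001" items="A-1:2,B-7:1"/>')
packed_skus = [pair.split(":")[0] for pair in packed.get("items").split(",")]

print(skus, city, order_date)
print(packed_skus)
```

Both versions yield the same SKUs, but the second one hides a private mini-format inside an attribute, which is the "further parsing" the rules try to avoid.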
Another way to say it could be:
- If it's narrative, it's not object oriented.
- If it's object oriented, model objects as elements and atomic attributes as attributes.
EDIT: Some people seem to like to forgo attributes entirely. There's nothing wrong with that, but I dislike it as it bloats documents and makes them unnecessarily hard to read and write by hand.
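For a sense of the difference, here is a tiny sketch (Python standard library, with a hypothetical `point` element) encoding the same three atomic values both ways:

```python
import xml.etree.ElementTree as ET

# The same hypothetical record, once with attributes, once elements-only.
with_attributes = '<point x="3" y="4" label="start"/>'
elements_only = "<point><x>3</x><y>4</y><label>start</label></point>"

a = ET.fromstring(with_attributes)
e = ET.fromstring(elements_only)

# Both carry exactly the same name/value pairs...
from_attributes = dict(a.attrib)
from_elements = {child.tag: child.text for child in e}

# ...but the elements-only form takes more characters to say it.
print(from_attributes == from_elements)
print(len(with_attributes), len(elements_only))
```

The information content is identical. The elements-only form simply repeats each name in an opening and a closing tag.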