I was assigned to write a set of XML schemas to integrate my company's systems with those of our clients. I designed a dozen schemas more than 10 years ago and saw that many of the extension features in the specification didn't work well in practice. Before designing the new ones, I searched for current best practices (and arrived here!).
Some of the tips above are useful, but I didn't like most of the references. The best design recommendations I found came from Microsoft.
The best reference is XML Schema Design Patterns: Avoiding Complexity. Here you will find this sane advice:
"it seems that many schema authors would be best served by
understanding and utilizing an effective subset of the features
provided by W3C XML Schema instead of attempting to comprehend all of
the esoteric and minutiae of the language."
The article then gives detailed explanations of the following guidelines:
- Why you should use global and local element declarations
- Why you should use global and local attribute declarations
- Why you should understand how XML namespaces affect the W3C XML Schema
- Why you should always set elementFormDefault to "qualified"
- Why you should use attribute groups
- Why you should use model groups
- Why you should use the built-in simple types
- Why you should use complex types
- Why you should not use notation declarations
- Why you should use substitution groups carefully
- Why you should favor key/keyref/unique over ID/IDREF for identity constraints
- Why you should use chameleon schemas carefully
- Why you should not use default or fixed values especially for types of xs:QName
- Why you should use restriction and extension of simple types
- Why you should use extension of complex types
- Why you should use restriction of complex types carefully
- Why you should use abstract types carefully
- Do use wildcards to provide well-defined points of extensibility
- Do not use group or type redefinition
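To make a few of these guidelines concrete, here is a minimal sketch of a schema that stays inside that "effective subset": elementFormDefault set to "qualified", built-in simple types, named complex types, key/keyref instead of ID/IDREF, and a wildcard as the extension point. The namespace and every element, type and constraint name are invented for illustration; none of them come from the article.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: namespace, element and type names are invented. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:o="http://example.com/orders"
           targetNamespace="http://example.com/orders"
           elementFormDefault="qualified">

  <!-- Global element: the intended document root. -->
  <xs:element name="orders" type="o:OrdersType">
    <!-- key/keyref rather than ID/IDREF: the constraint is scoped to
         this element and the key value keeps its declared type. -->
    <xs:key name="customerKey">
      <xs:selector xpath="o:customer"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <xs:keyref name="orderCustomerRef" refer="o:customerKey">
      <xs:selector xpath="o:order"/>
      <xs:field xpath="@customer"/>
    </xs:keyref>
  </xs:element>

  <!-- Named complex types with local element declarations. -->
  <xs:complexType name="OrdersType">
    <xs:sequence>
      <xs:element name="customer" type="o:CustomerType" maxOccurs="unbounded"/>
      <xs:element name="order" type="o:OrderType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="CustomerType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <!-- Wildcard: a well-defined extension point that accepts
           elements from any other namespace. -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
  </xs:complexType>

  <xs:complexType name="OrderType">
    <xs:sequence>
      <!-- Built-in simple types instead of hand-rolled ones. -->
      <xs:element name="total" type="xs:decimal"/>
      <xs:element name="placedOn" type="xs:date"/>
    </xs:sequence>
    <xs:attribute name="customer" type="xs:positiveInteger" use="required"/>
  </xs:complexType>
</xs:schema>
```

Because elementFormDefault is "qualified", the local elements are in the target namespace, which is why the selector XPaths need the o: prefix while the attributes (unqualified by default) do not.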
My advice about their advice is that when they say "use carefully", you should simply avoid the feature. My impression is that the Schema specification was not written by software developers. They tried to borrow some object-oriented concepts but made a mess of it. Many of the extension mechanisms are useless or extremely verbose. I really don't understand how someone could have invented restriction of complex types.
Two more good articles on the same site are:
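As a rough illustration of the verbosity complaint, the sketch below derives two types from the same base. All names and the namespace are hypothetical. Extension only states what is added, whereas restriction has to repeat the entire content model of the base type just to drop one optional element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: namespace and type names are invented. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:a="http://example.com/address"
           targetNamespace="http://example.com/address"
           elementFormDefault="qualified">

  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="region" type="xs:string" minOccurs="0"/>
      <xs:element name="country" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Extension: only the additions are written; they are appended
       after the base type's content. -->
  <xs:complexType name="PostalAddressType">
    <xs:complexContent>
      <xs:extension base="a:AddressType">
        <xs:sequence>
          <xs:element name="postalCode" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Restriction: the whole content model is restated, minus the
       optional region element. -->
  <xs:complexType name="AddressWithoutRegionType">
    <xs:complexContent>
      <xs:restriction base="a:AddressType">
        <xs:sequence>
          <xs:element name="street" type="xs:string"/>
          <xs:element name="city" type="xs:string"/>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

Every later change to AddressType has to be mirrored by hand in the restricted type, which is one practical reason to avoid the feature.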
- Schema Design Patterns: Dealing With Change
- XML Schema Design Patterns: Is Complex Type Derivation Unnecessary?
One pervasive tip is to write your schemas in something other than the official specification. RELAX NG seems to be the most favored alternative. Unfortunately, by doing so you lose one of the best features of the official specification: its standardization.
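For a feel of the difference, here is a small sketch in RELAX NG's XML syntax of a customer element with a required id attribute and a name child. The element and attribute names are hypothetical; only the RELAX NG namespace and the XML Schema datatype library URI are the real ones.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical example: element and attribute names are invented. -->
<element name="customer"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- Attributes are required unless wrapped in <optional>. -->
  <attribute name="id">
    <data type="positiveInteger"/>
  </attribute>
  <element name="name">
    <text/>
  </element>
</element>
```

The pattern reads much like the document it describes, and it still reuses the XML Schema built-in datatypes.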