I\'m new to database design and I have been reading quite a bit about normalization. If I had three tables: Accommodation, Train Stations and Airports. Would I have address
For addresses, I would almost always create a separate address table. Not only for normalization but also for consistency in fields stored.
As for such a thing as over-normalization, absolutely there is! It's hard to give you guidance on what is and isn't over-normalization as I think it mostly comes from experience. However, follow the books on each level of normalization and then once it starts to get difficult to see where things are you've probably gone too far.
Look at all the sample/example databases you can as well. They will give you a good indication on when you should be splitting out data and when you shouldn't.
Also, be well aware of the type and amount of data you're storing, along with the speed of access, etc. A lot of modern web software is going fully de-normalized for many performance and scalability reason. It's worth looking into those for reason why and when you should and shouldn't de-normalize.
If you are using Oracle 9i, you could store address objects in your tables. That would remove the (justified) concerns about address formats.
Database Normalization is all about constructing relations (tables) that maintain certain functional dependencies among the facts (columns) within the relation (table) and among the various relations (tables) making up the schema (database). Bit of a mouth-full, but that is what it is all about.
A Simple Guide to Five Normal Forms in Relational Database Theory is the classic reference for normal forms. This paper defines in simple terms what the essence of each normal form is and its significance with respect to database table design. This is a very good "touch-stone" reference.
To answer your specific question properly requires additional information. Some critical questions you have to ask are:
All this to say, the answer to your question is not as straight forward as one might hope for!
Is there such a thing as "over normalization"? Maybe. This depends on whether the functional dependencies you have identified and used to build your tables are of significance to your application domain.
For example, suppose it was determined that an address was composed of multiple attributes; one of which is postal code. Technically a postal code is a composite item too (at least Canadian Postal Codes are). Further normalizing your database to recognize these facts would probably be an over-normalization. This is because the components of a postal code are irrelevant to your application and therefore factoring them into the database design would be an over-normalization.
If you have a project/piece of functionality that is very performance sensitive, it may be smart to denormalize the database in some cases. However, this can lead to maintenance issues for various reasons. You may instead want to duplicate the data with cache tables but there are drawbacks to this as well. It's really a case by case basis but in normal practice, database normalization is a good thing. 99% of the non-normalized databases I've seen are not by design, but rather by a misunderstanding/mistake by the developer.
I can only add one more constructive note to the answers already posted here. However you choose to normalize your database, that very process becomes almost trivial when the addresses are standardized (look the same). This is because as you endeavor to prevent duplicates, all the addresses that are actually the same do look the same.
Now, standardizing addresses is not trivial. There are CASS services which do this for you (for US addresses) which have been certified by the USPS. I actually work for SmartyStreets where this is our expertise, so I'd suggest you start your search there. You can either perform batch processing or use the API to standardize the addresses as you receive them.
Without something like this, your database may be normalized, but duplicate address data (whether correct or incomplete and invalid, etc) will still seep in because of the many, many forms they can take. If you have any further questions about this, I'll personally assisty you.
This isn't really what I understand by normalisation. You don't seem to be talking about removing redundancy, just how to partition the storage or data model. I'm assuming that the example of addresses for Accommodation, Train Stations and Airports will all be disjoint?
As far as I know it would only be normalisation if you started thinking along the lines. Postcode is functionally dependent upon street address so should be factored out into its own table.
In which case this could be ever desirable or undesirable dependent upon context. Perhaps desirable if you administer the records and can ensure correctness, and less desirable if users can update their own records.
A related question is Is normalizing a person’s name going too far?