After this comment to one of my questions, I'm wondering whether it is better to use one database with X schemas, or vice versa.
My situation: I'm developing a web application ...
A PostgreSQL "schema" is roughly the same as a MySQL "database". Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble. So you definitely want to go with one database and multiple schemas within that database.
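A minimal sketch of that layout (one database, one schema per customer; the connection parameters and customer names are placeholders, and CREATE SCHEMA IF NOT EXISTS needs PostgreSQL 9.3 or later):

import pg
# Placeholder connection settings
conn = pg.connect(dbname='db_name', host='localhost', user='db_user', passwd='db_pass')
# One schema per customer, all inside the same database
for customer in ('customer1', 'customer2'):
    conn.query('CREATE SCHEMA IF NOT EXISTS ' + customer)
# Point the session at one customer's schema, so unqualified table names
# resolve to that customer's objects
conn.query('SET search_path TO customer1')
conn.close()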
Definitely, I'll go for the one-db-many-schemas approach. This allows me to dump the whole database but easily restore just one schema (for example with pg_dump -n <schema>, as in the script below).
Also, from googling around I've seen that there is no built-in procedure to duplicate a schema (using one as a template), but many suggest this workaround: rename the template schema to the new name, dump it, rename it back, then restore the dump to create the copy under the new name.
I've written a few lines of Python to do that; I hope they can help someone (written in two seconds, don't use it in production):
import os
import sys
import pg
# Take the new schema name from the first command-line argument (argv[0] is the script name)
newSchema = sys.argv[1]
# Temporary folder for the dumps
dumpFile = '/test/dumps/' + str(newSchema) + '.sql'
# Settings
db_name = 'db_name'
db_user = 'db_user'
db_pass = 'db_pass'
schema_as_template = 'schema_name'
# Connection
pgConnect = pg.connect(dbname=db_name, host='localhost', user=db_user, passwd=db_pass)
# Rename schema with the new name
pgConnect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + str(newSchema))
# Dump it
command = 'export PGPASSWORD="' + db_pass + '" && pg_dump -U ' + db_user + ' -n ' + str(newSchema) + ' ' + db_name + ' > ' + dumpFile
os.system(command)
# Rename back with its default name
pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template)
# Restore the previous dump to create the new schema
restore = 'export PGPASSWORD="' + db_pass + '" && psql -U ' + db_user + ' -d ' + db_name + ' < ' + dumpFile
os.system(restore)
# Want to delete the dump file?
os.remove(dumpFile)
# Close connection
pgConnect.close()
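To create a new schema named, say, customer2 from the template, the call would be something like python clone_schema.py customer2 (assuming the script is saved as clone_schema.py and the settings above are filled in).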
Multiple schemas should be more lightweight than multiple databases, although I cannot find a reference that confirms this.
But if you really want to keep things very separate (instead of refactoring the web application so that a "customer" column is added to your tables), you may still want to use separate databases: that way you can restore a particular customer's database without disturbing the other customers.
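A sketch of what such a per-customer restore could look like, assuming one database per customer (the names and the custom-format dump are my own choices for the example, not part of the answer above):

import os
db_user = 'db_user'
customer_db = 'customer_acme'  # hypothetical per-customer database
dump_file = '/test/dumps/' + customer_db + '.dump'
# Dump only this customer's database, in pg_dump's custom format
os.system('pg_dump -U ' + db_user + ' -Fc ' + customer_db + ' > ' + dump_file)
# Later: restore it, dropping and recreating its objects first,
# without touching any other customer's database
os.system('pg_restore -U ' + db_user + ' -d ' + customer_db + ' --clean ' + dump_file)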
I would say, go with multiple databases AND multiple schemas :)
Schemas in PostgreSQL are a lot like packages in Oracle, in case you are familiar with those. Databases are meant to separate entire sets of data, while schemas are groupings of related objects within a database.
For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on. "UserManagement" would then contain the "User" table, as well as all stored procedures, triggers, sequences, etc. that are needed for the user management.
Databases are entire programs, schemas are components.
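For example, something along these lines (a sketch only; the schema and table names come from the paragraph above, and the connection parameters are placeholders):

import pg
conn = pg.connect(dbname='app_db', host='localhost', user='db_user', passwd='db_pass')
# One database for the whole application, with one schema per component
conn.query('CREATE SCHEMA IF NOT EXISTS "UserManagement"')
conn.query('CREATE SCHEMA IF NOT EXISTS "LongTermStorage"')
# The user-related objects live inside their own schema
conn.query('CREATE TABLE IF NOT EXISTS "UserManagement"."User" ('
           'id serial PRIMARY KEY, '
           'login text NOT NULL UNIQUE)')
conn.close()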
In a PostgreSQL context I recommend using one database with multiple schemas, because you can (for example) UNION ALL across schemas but not across databases. For that reason, one database is completely insulated from another, while schemas are not insulated from the other schemas in the same database.
If you, for some reason, have to consolidate data across schemas in the future, it will be easy to do with a single query. With multiple databases you would need multiple connections and would have to collect and merge the data from each database "manually" in application logic.
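A minimal sketch of such a cross-schema query (the schema and table names are invented for the example; it uses the same PyGreSQL pg module as the script above):

import pg
conn = pg.connect(dbname='db_name', host='localhost', user='db_user', passwd='db_pass')
# One query collects rows from two customer schemas in the same database;
# the equivalent across two databases would need two connections and a merge in code
result = conn.query(
    "SELECT 'customer1' AS source, id, name FROM customer1.users "
    "UNION ALL "
    "SELECT 'customer2' AS source, id, name FROM customer2.users")
print(result.getresult())
conn.close()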
Multiple databases have advantages in some cases, but for the most part I think the one-database-multiple-schemas approach is more useful.