I am working on the design of a database that will be used to store data that originates from a number of different sources. The instances I am storing are assigned unique I
I personally find composite primary keys to be painful. For every table that you wish to join to your "sources" table you will need to add both the source_id and id_on_source fields.
I would create a standard auto-incrementing primary key on your sources table and add a unique index on the source_id and id_on_source columns.
This then allows you to add just the id of the sources table as a foreign key on other tables.
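To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sources table and the source_id / id_on_source columns come from the question; the notes table, its columns, and the function names are hypothetical, invented only to show a referencing table. Your own database's syntax for auto-increment columns will differ.

```python
import sqlite3


def build_surrogate_schema(conn):
    """Create a sources table with a surrogate key plus a unique natural key."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE sources (
            id           INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            source_id    INTEGER NOT NULL,
            id_on_source TEXT    NOT NULL,
            UNIQUE (source_id, id_on_source)                 -- natural key stays unique
        );

        -- Hypothetical referencing table: it carries one column, not two.
        CREATE TABLE notes (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            sources_id INTEGER NOT NULL REFERENCES sources (id),
            body       TEXT    NOT NULL
        );
        """
    )


def demo_surrogate_key():
    """Insert one source row, reference it, then try to insert a duplicate."""
    conn = sqlite3.connect(":memory:")
    build_surrogate_schema(conn)

    cur = conn.execute(
        "INSERT INTO sources (source_id, id_on_source) VALUES (?, ?)",
        (1, "A-100"),
    )
    row_id = cur.lastrowid
    conn.execute(
        "INSERT INTO notes (sources_id, body) VALUES (?, ?)",
        (row_id, "first note"),
    )

    # The unique index still rejects a repeated (source_id, id_on_source) pair.
    try:
        conn.execute(
            "INSERT INTO sources (source_id, id_on_source) VALUES (?, ?)",
            (1, "A-100"),
        )
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    return row_id, duplicate_rejected
```

The point of the sketch is that notes needs only sources_id, while the unique index keeps doing the job the composite primary key would have done.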
In general, I have also found support for composite primary keys in many frameworks and tools to be "patchy" at best, and non-existent in others.
Composite keys are tough to manage and slow to join. Since you're building a summary table, use a surrogate key (i.e., an autoincrement/identity column). Leave your natural key columns in the table.
This has a lot of other benefits, too. Primarily, if you merge with a company and they have one of the same sources, but reused keys, you're going to get into trouble if you aren't using a surrogate key.
This is the widely acknowledged best practice in data warehousing (a much larger undertaking than what you're doing, but still relevant), and for good reason. Surrogates provide data integrity and quick joins. You can get burned very quickly with natural keys, so stay away from them as an identifier, and use them only in the import process.
I believe that composite keys create a very natural and descriptive data model. My experience comes from Oracle and I don't think there are any technical issues when creating a composite PK. In fact, anyone analysing the data dictionary would immediately understand something about the table. In your case it would be obvious that each source_id must have a unique id_on_source.
The use of natural keys often sparks heated debate, but the people I work with like natural keys from the perspective of good data modelling.
Pretty much the only time I use a composite primary key is when the high-order part of the key is the key to another table. For example, I might create an OrderLineItem table with a primary key of OrderId + LineNumber. As many accesses against the OrderLineItem table will be "order join orderlineitem using (orderid)" or some variation of that, this is often handy. It also makes it easy when looking at database dumps to figure out what line items are connected to what order.
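A rough sketch of that OrderLineItem layout, again with Python's sqlite3 module. OrderId and LineNumber are the columns named above; the parent table is called Orders here (an assumption, chosen because ORDER is a reserved word), and the Customer and Product columns, the sample rows, and the function names are made up for illustration.

```python
import sqlite3


def make_order_db():
    """Build an in-memory database with a composite-key line item table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Orders (
            OrderId  INTEGER PRIMARY KEY,
            Customer TEXT NOT NULL
        );

        -- The high-order part of the key (OrderId) is the key of Orders.
        CREATE TABLE OrderLineItem (
            OrderId    INTEGER NOT NULL REFERENCES Orders (OrderId),
            LineNumber INTEGER NOT NULL,
            Product    TEXT    NOT NULL,
            PRIMARY KEY (OrderId, LineNumber)
        );

        INSERT INTO Orders VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO OrderLineItem VALUES
            (1, 1, 'widget'),
            (1, 2, 'gadget'),
            (2, 1, 'widget');
        """
    )
    return conn


def line_items_for_order(conn, order_id):
    """The typical access path: join on the shared leading key column."""
    return conn.execute(
        """
        SELECT OrderId, LineNumber, Product
        FROM Orders
        JOIN OrderLineItem USING (OrderId)
        WHERE OrderId = ?
        ORDER BY LineNumber
        """,
        (order_id,),
    ).fetchall()
```

Because the join only ever needs OrderId, the second key column costs nothing at query time, and a dump of OrderLineItem shows at a glance which order each row belongs to.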
As others have noted, composite keys are a pain in most other circumstances because your joins have to involve all the pieces. It's more to type which means more potential for mistakes, queries are slower, etc.
Two-part keys aren't bad; I do those fairly often. I'm reluctant to use a three-part key. More than three parts, I'd say forget it.
In your example, I suspect there's little to be gained by using the composite key. Just invent a new sequence number and let the source and source key be ordinary attributes.
Some people recommend you use a Globally Unique ID (GUID): merge replication and transactional replication with updating subscriptions use uniqueidentifier columns to guarantee that rows are uniquely identified across multiple copies of the table. If the value is globally unique when it's created, then you don't need to add the source_id to make it unique.
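As a small illustration of the idea (not of SQL Server replication itself), the sketch below uses Python's uuid and sqlite3 modules. It assumes the GUID is generated at the moment the row is created; the instances table, its columns, and add_instance are hypothetical names.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE instances (
        guid      TEXT PRIMARY KEY,   -- unique on its own, no source_id needed
        source_id INTEGER NOT NULL,   -- kept as an ordinary attribute
        payload   TEXT NOT NULL
    )
    """
)


def add_instance(source_id, payload):
    """Store a row under a freshly generated random GUID and return the GUID."""
    guid = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO instances (guid, source_id, payload) VALUES (?, ?, ?)",
        (guid, source_id, payload),
    )
    return guid
```

Two sources can store identical payloads and still end up with distinct keys, because the key no longer depends on which source the row came from.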
Although a uniqueid is a good primary key, I agree that it's usually better to use a different, natural (not necessarily unique) key as your clustered index. For example, if a uniqueid is the PK which identifies employees, you might want the clustered index to be the department (if your select statements usually retrieve all employees within a given department). If you do want to use a uniqueid as the clustered index, see the NEWSEQUENTIALID() function: this creates sequential uniqueid values, which (being sequential) have better clustering performance.
Adding an extra ID column will leave you having to enforce TWO uniqueness constraints instead of one.
Using that extra ID column as the foreign key in other referencing tables, instead of the key that presents itself naturally, will cause you to have to do MORE joins, namely in all the cases where you need the original source_id plus id_on_source along with data from the referencing table.
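Here is a side-by-side sketch of both points, using Python's sqlite3 module. The table names with _a and _b suffixes, the notes tables, and the sample row are all hypothetical; only source_id and id_on_source come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Variant A: extra ID column. Two uniqueness constraints to enforce.
    CREATE TABLE sources_a (
        id           INTEGER PRIMARY KEY,
        source_id    INTEGER NOT NULL,
        id_on_source TEXT    NOT NULL,
        UNIQUE (source_id, id_on_source)
    );
    CREATE TABLE notes_a (
        note_id    INTEGER PRIMARY KEY,
        sources_id INTEGER NOT NULL REFERENCES sources_a (id),
        body       TEXT    NOT NULL
    );

    -- Variant B: the natural composite key. One uniqueness constraint.
    CREATE TABLE sources_b (
        source_id    INTEGER NOT NULL,
        id_on_source TEXT    NOT NULL,
        PRIMARY KEY (source_id, id_on_source)
    );
    CREATE TABLE notes_b (
        note_id      INTEGER PRIMARY KEY,
        source_id    INTEGER NOT NULL,
        id_on_source TEXT    NOT NULL,
        body         TEXT    NOT NULL,
        FOREIGN KEY (source_id, id_on_source)
            REFERENCES sources_b (source_id, id_on_source)
    );

    INSERT INTO sources_a VALUES (1, 1, 'A-100');
    INSERT INTO notes_a   VALUES (1, 1, 'first note');
    INSERT INTO sources_b VALUES (1, 'A-100');
    INSERT INTO notes_b   VALUES (1, 1, 'A-100', 'first note');
    """
)

# Variant A has to join back to sources_a to recover the natural key.
QUERY_SURROGATE = """
    SELECT s.source_id, s.id_on_source, n.body
    FROM notes_a AS n
    JOIN sources_a AS s ON s.id = n.sources_id
"""

# Variant B already has the natural key in the referencing table.
QUERY_COMPOSITE = "SELECT source_id, id_on_source, body FROM notes_b"
```

Both queries return the same row; the difference is that the surrogate variant needs the join to get there, which is the extra cost described above.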