Enforce invariants spanning multiple aggregates (set validation) in Domain-driven Design

Question

To illustrate the problem we use a simple case: there are two aggregates, Lamp and Socket. The following business rule must always be enforced: neither a Lamp nor a Socket can be connected more than once at the same time. To provide an appropriate command we conceive a Connector service with a Connect(Lamp, Socket) method to plug them together.

Because we want to comply with the rule that one transaction should involve only one aggregate, it's not advisable to set the association on both aggregates in the Connect transaction. So we need an intermediate aggregate which represents the Connection itself, and the Connect transaction would just create a new Connection with the given components. Unfortunately, this is where the trouble begins: how can we ensure the consistency of the connection state? Many simultaneous users may want to plug the same components at the exact same time, and each request would pass our "consistency check" because none of them sees the others' uncommitted Connection. New Connection aggregates would be stored, because we only lock at aggregate level. The system would be inconsistent without even knowing it.

But how should we set the boundary of our aggregates to enforce our business rule? We could conceive a Connections aggregate which gathers all active connections (as Connection entities), thereby enabling our locking algorithm, which would properly reject duplicate Connect requests. On the other hand, this approach is inefficient and does not scale; furthermore, it is counter-intuitive in terms of domain language.

Do you know what I'm missing?

Edit: To sum up the problem, imagine an aggregate User. Since the definition of an aggregate is to be a transaction-based unit we are able to enforce invariants by locking this unit per transaction. All is fine. But now a business rule arises: the username must be unique. Therefore we must somehow reconcile our aggregate boundaries with this new requirement. Assuming millions of users registering at the same time, it becomes a problem. We try to ensure this invariant in a non-locked state since multiple users means multiple aggregates.

According to the book "Domain-driven Design" by Eric Evans, one should apply eventual consistency as soon as multiple aggregates are involved in a single transaction. But is this really the case here, and does it make sense?

Applying eventual consistency here would entail registering the User and checking the username invariant afterwards. If two Users actually chose the same username, the system would undo the second registration and notify that User. Thinking about this scenario disconcerts me because it disrupts the whole registration process. Sending the confirmation e-mail, for example, would have to be delayed, and so forth.

I think I'm just forgetting about something in general but I don't know what. It seems to me that I need something like invariants on Repository-level.

Answer 1:

"We could conceive a Connections aggregate which gathers all active connections (as Connection entities), thereby enabling our locking algorithm, which would properly reject duplicate Connect requests. On the other hand, this approach is inefficient and does not scale; furthermore, it is counter-intuitive in terms of domain language."

On the contrary, I think you're on the right track with this approach. It seems convoluted because you're using an example that doesn't make any sense - there is no real-life system that checks if a lamp is connected to more than one socket or a socket to more than one lamp.

But applying that approach to the second example would lead you to ask yourself what the "connection" aggregate is in that case, i.e. inside which scope a user name is unique. In a Company? For a given Tenant or Customer? For the whole <whatever-subdomain-youre-in>System? Find the name of the scope and there you have it - an Aggregate to enforce the unique name invariant. Choose the name carefully and if it doesn't exist in the ubiquitous language yet, invent a new concept with the help of a domain expert. DDD is not only about respecting existing domain terms, you're also allowed to introduce new ones when Breakthroughs are achieved.

Sometimes though, you will find that concurrent access to this aggregate is too intensive and generates problematic contention. With domain expert assent, you can introduce eventual consistency with a compensating action in case of conflict - appending a suffix to the nickname and notifying the user, for instance. Or you can split the "hot" aggregate into smaller, smarter, more efficient ones.
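As a rough illustration of the compensating action mentioned above, the sketch below processes registrations after the fact and renames later duplicates by appending a suffix. The function name `resolve_duplicate_names`, the `-2`, `-3` suffix scheme, and the input shape are assumptions made for this example only.

```python
def resolve_duplicate_names(registrations):
    """Compensate for duplicate nicknames after registration.

    registrations: list of (profile_id, requested_name) in registration order.
    Returns (final_names, renamed) where final_names maps profile_id to the
    name it ends up with, and renamed lists the profiles to notify.
    """
    taken = set()
    final_names = {}
    renamed = []
    for profile_id, requested in registrations:
        candidate = requested
        suffix = 1
        # Hypothetical scheme: the first holder keeps the name, later
        # holders get "-2", "-3", ... until a free name is found.
        while candidate in taken:
            suffix += 1
            candidate = f"{requested}-{suffix}"
        taken.add(candidate)
        final_names[profile_id] = candidate
        if candidate != requested:
            renamed.append(profile_id)
    return final_names, renamed
```

Whether silently renaming is acceptable is a business decision, which is why the answer asks for the domain expert's assent first.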

Answer 2:

The problem you are describing is called set validation. Greg Young makes a very good point that a key question is whether or not the cost/benefit analysis justifies enforcing this constraint in code.

But let's suppose it does....

I find it's most useful to think about set validation from the perspective of an RDBMS. How would we handle this problem if we were doing things with tables? A likely candidate is that we would have some sort of connection table, with foreign keys for the Lamp and the Socket. Then we would define constraints that would say that each of those foreign keys must be unique in the table.

Those uniqueness constraints on the foreign keys span the entire table, which is the database's way of telling us that the entire table represents a single aggregate.
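To make the table-based view concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names are assumptions, and the foreign-key references are left out because the sketch has no lamp or socket tables; only the two uniqueness constraints matter here.

```python
import sqlite3


def open_connection_store() -> sqlite3.Connection:
    """Create an in-memory database with a connection table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """
        CREATE TABLE connection (
            lamp_id   TEXT NOT NULL UNIQUE,  -- a lamp appears at most once
            socket_id TEXT NOT NULL UNIQUE   -- a socket appears at most once
        )
        """
    )
    return db


def connect(db: sqlite3.Connection, lamp_id: str, socket_id: str) -> bool:
    """Try to record a connection; return False if the rule is violated."""
    try:
        with db:  # commits on success, rolls back on an exception
            db.execute(
                "INSERT INTO connection (lamp_id, socket_id) VALUES (?, ?)",
                (lamp_id, socket_id),
            )
        return True
    except sqlite3.IntegrityError:
        # One of the UNIQUE constraints rejected the row.
        return False
```

The database checks every insert against all existing rows, which is exactly the "whole table as one consistency boundary" behaviour described above.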

So if you were going to lift those constraints into your domain model, you would do so by making an aggregate of all connections, so that the domain model can immediately rule on whether or not a given Lamp-Socket connection should be allowed.
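A minimal sketch of such an aggregate might look like the following. The class and method names are hypothetical, and the `version` counter is an assumption added so that a repository could apply optimistic locking when saving; the original discussion does not prescribe either.

```python
class AlreadyConnected(Exception):
    """Raised when a lamp or socket is already part of a connection."""


class Connections:
    """Single aggregate holding every active lamp-socket connection."""

    def __init__(self):
        self.version = 0  # assumed: bumped on each change for optimistic locking
        self._socket_by_lamp = {}
        self._lamp_by_socket = {}

    def connect(self, lamp_id: str, socket_id: str) -> None:
        # Both checks run against the complete set, inside one aggregate.
        if lamp_id in self._socket_by_lamp:
            raise AlreadyConnected(f"lamp {lamp_id} is already connected")
        if socket_id in self._lamp_by_socket:
            raise AlreadyConnected(f"socket {socket_id} is already in use")
        self._socket_by_lamp[lamp_id] = socket_id
        self._lamp_by_socket[socket_id] = lamp_id
        self.version += 1

    def disconnect(self, lamp_id: str) -> None:
        socket_id = self._socket_by_lamp.pop(lamp_id, None)
        if socket_id is not None:
            del self._lamp_by_socket[socket_id]
            self.version += 1
```

Because every Connect request loads and saves the same aggregate, two concurrent requests for the same lamp cannot both succeed: the second save would see a stale version and be rejected.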

Now, there's an important caveat here -- we're assuming that the domain model is the authority for connections between lamps and sockets. If we are modeling lamps in the real world connected to sockets in the real world, then it's important to recognize that the real world is the authority, not the model.

Put another way, if the domain model gets conflicting information about the real world (two lamps are reportedly connected to the same socket), the model only knows that its information about the world is incorrect -- maybe the first lamp was plugged in, maybe the second, maybe there's a message missing about a lamp being unplugged. So in this sort of case, it's common that you'll want to allow the conflict, with an escalation to a human being for resolution.

"the username must be unique"

This is the single most commonly asked variation of the set validation problem.

The basic remedy is the same: you now have a User Profile aggregate, with an identifier, and a separate user name directory aggregate, which ensures that each name is uniquely associated with a profile.
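One way to picture the split is sketched below. `UserProfile`, `UserNameDirectory`, and their methods are illustrative names, not an established API; the point is only that the profile carries the identifier while the directory owns the uniqueness rule.

```python
class NameTaken(Exception):
    """Raised when a user name already belongs to another profile."""


class UserProfile:
    """Profile aggregate: identified by profile_id, knows nothing about uniqueness."""

    def __init__(self, profile_id: str):
        self.profile_id = profile_id


class UserNameDirectory:
    """Directory aggregate: maps each user name to exactly one profile."""

    def __init__(self):
        self._profile_by_name = {}

    def assign(self, name: str, profile_id: str) -> None:
        owner = self._profile_by_name.get(name)
        if owner is not None and owner != profile_id:
            raise NameTaken(name)
        # Re-assigning a name to its current owner is a harmless no-op.
        self._profile_by_name[name] = profile_id

    def owner_of(self, name: str):
        """Return the owning profile_id, or None if the name is free."""
        return self._profile_by_name.get(name)
```

Registration then becomes two steps: create the profile, then ask the directory to assign the name to it.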

If you aren't worried that a profile has at most one user name linked to it, then there is another approach you can take, which is to introduce an aggregate for each user name, which includes the profileId as a member. Thus, each aggregate can enforce the constraint that the name can only be assigned if the previous assignment was terminated.
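A sketch of the per-name variant, with hypothetical class and method names: the aggregate's identity is the user name itself, so two registrations for the same name contend on the same small aggregate, while registrations for different names never touch each other.

```python
class NameStillAssigned(Exception):
    """Raised when a name is claimed before its previous assignment ended."""


class UserNameAssignment:
    """One aggregate per user name; the name is the aggregate's identity."""

    def __init__(self, name: str):
        self.name = name
        self.profile_id = None  # None means the name is currently free

    def assign_to(self, profile_id: str) -> None:
        if self.profile_id is not None:
            raise NameStillAssigned(self.name)
        self.profile_id = profile_id

    def terminate(self) -> None:
        """End the current assignment so the name can be claimed again."""
        self.profile_id = None
```

Note what this does not enforce: nothing stops one profile from holding several names, which is the trade-off the paragraph above accepts.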

"I think I'm just forgetting about something in general but I don't know what."

Only that constraints don't come from nowhere -- there should be a business motivation for them; and somebody (the domain expert) should be able to document the cost to the business of failing to maintain the proposed set constraint.

For instance, if you are already collecting an email address, do you really need a unique username? How much additional value are you creating by including username in the model? How much more by making it unique...?

"If we plan an online game, for example, with millions of users which request games constantly, it's a real problem."

Yes, it is; but that may indicate that the game design is wrong. Review Udi Dahan's discussion of high contention domains, and his essay Race Conditions Don't Exist.

A thing to notice, however, is that if you really have an aggregate, you can scale it independently from the rest of your system. One monster box is dedicated to managing the set aggregate and nothing else (analog: an RDBMS dedicated to managing a single table).

A more likely choice is going to be sharding by realm/instance/whatzit; in which case you'd have a smaller set aggregate for each realm instance.
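A rough sketch of the sharded variant follows. It assumes the set being protected is a set of names and that a realm is identified by a string key; both are illustrative choices, as are the class names.

```python
class RealmNameSet:
    """Set aggregate for a single realm: names are unique within it."""

    def __init__(self, realm: str):
        self.realm = realm
        self._names = set()

    def claim(self, name: str) -> bool:
        """Return True if the name was free in this realm and is now taken."""
        if name in self._names:
            return False
        self._names.add(name)
        return True


class RealmNameSets:
    """Repository-like lookup: one independent set aggregate per realm."""

    def __init__(self):
        self._by_realm = {}

    def for_realm(self, realm: str) -> RealmNameSet:
        if realm not in self._by_realm:
            self._by_realm[realm] = RealmNameSet(realm)
        return self._by_realm[realm]
```

Contention is now limited to requests within the same realm, and each realm's aggregate can be stored, locked, and scaled separately.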

Answer 3:

In addition to the suggestions already made, consider that some of these problems are very similar to database concurrency problems. Say that you have a contact, and one user changes the name, and another user changes the phone number for this contact. If you write a command that updates the whole contact with the state as it was after modification, then one of the two will overwrite the change of the other with the old value, unless measures are taken.

If, however, you write a 'ChangeEmailForContact' command, then you only change that one field and do not conflict with the name change, which would similarly be its own command ('Name' or 'RenameContact').

Now what if two people change the email address shortly after one another? A really efficient way is to pass the original value (original email address) along with the new value in your command. When updating the email address, you can then check whether the original email address is the same as the current email address (so it is a valid starting point), or whether the new email address is the same as the current email address (no need to do anything). Only if neither holds are you in a conflict situation.
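The check described above could be sketched like this. The command name `ChangeEmailForContact` comes from the answer; the field names, the `Contact` class, and the `ConflictingChange` exception are assumptions for illustration.

```python
from dataclasses import dataclass


class ConflictingChange(Exception):
    """Raised when the command was based on a value that has since changed."""


@dataclass(frozen=True)
class ChangeEmailForContact:
    contact_id: str
    original_email: str  # the value the user saw when editing
    new_email: str


class Contact:
    def __init__(self, contact_id: str, email: str):
        self.contact_id = contact_id
        self.email = email

    def handle(self, cmd: ChangeEmailForContact) -> bool:
        """Apply the command; return True if the email actually changed."""
        if cmd.original_email == self.email:
            # Valid starting point: nobody changed the email in between.
            self.email = cmd.new_email
            return True
        if cmd.new_email == self.email:
            # Someone already made the same change: nothing to do.
            return False
        raise ConflictingChange(
            f"expected {cmd.original_email!r}, found {self.email!r}"
        )
```

The command carries its own precondition, so the conflict is detected from the data alone and needs no version number.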

Now, apply this to your 'set operation'. The first time a lightbulb is moved into a 'connection' (perhaps I would call it fixture), it is moving from unassigned to connection1. Then, when a lightbulb is moved, it must be moved from connection1 to connection2, say. Now you can validate if that lightbulb is already assigned, if it was assigned to connection1 or if something has changed in the meantime.
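Applied to assignments, the same idea might look like the sketch below, where `None` stands for "unassigned". The `Lightbulb` class, the `move` signature, and the `StaleAssignment` exception are hypothetical.

```python
class StaleAssignment(Exception):
    """Raised when the bulb is not where the caller believed it to be."""


class Lightbulb:
    def __init__(self, bulb_id: str):
        self.bulb_id = bulb_id
        self.connection_id = None  # None means unassigned

    def move(self, expected_from, to) -> bool:
        """Move the bulb from expected_from to `to`; True if it moved."""
        if self.connection_id == expected_from:
            self.connection_id = to
            return True
        if self.connection_id == to:
            # Already where the caller wants it: nothing to do.
            return False
        raise StaleAssignment(
            f"bulb {self.bulb_id} is in {self.connection_id!r}, "
            f"not {expected_from!r}"
        )
```

An initial assignment is simply `move(None, "connection1")`, so a second "initial" assignment to a different connection fails because the bulb is no longer unassigned.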

It doesn't solve everything, of course. For the tiny case that remains, that brief moment where two initial assignments happen close enough together, you either have to go for, say, a Redis cache of assigned usernames to validate against, or give an admin an easy tool to resolve this very rare situation. You could, for instance, build a projection that periodically reports on such situations, and make sure renaming isn't too painful.
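Such a reporting projection can be very small. The sketch below assumes the read side can list `(user_id, username)` pairs; the function name and input shape are made up for the example.

```python
from collections import defaultdict


def duplicate_username_report(users):
    """Return {username: [user_ids]} for every name held by more than one user.

    users: iterable of (user_id, username) pairs, e.g. read from a projection.
    """
    holders = defaultdict(list)
    for user_id, username in users:
        holders[username].append(user_id)
    return {name: ids for name, ids in holders.items() if len(ids) > 1}
```

An admin tool could run this on a schedule and offer a rename action for each reported group.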

Source: https://stackoverflow.com/questions/44283750/enforce-invariants-spanning-multiple-aggregates-set-validation-in-domain-drive

Tags

domain-driven-design

aggregate

bidirectional

boundary