Design Patterns for Event-Driven Systems
Useful patterns to consider when designing distributed systems to handle events
This is the second of three instalments of my series on event-driven architectures and reactive systems. Here are part one and part three.
Event-driven systems are experiencing a resurgence in interest with the rise of microservices. According to Wikipedia,
Event-driven architecture (EDA) is a software architecture paradigm promoting the production, detection, consumption of, and reaction to events.
In this article, I am going to share a set of useful design patterns that can be applied to building event-based systems. They can be used individually or mixed and matched depending on the requirements and constraints of the solution under construction.
Event Notification
A component/service sends event messages to notify others of a change in its state. A key element of event notification is that the producer does not really care much about the response. Often it does not expect any answer at all, or if there must be a response, it would be asynchronous and handled by a different logic flow from the one that sent the event in the first place.
An event message does not need to carry much data on it, usually just some core information and an address back to the producer to request for more information. The consumer gets the core information on the nature of the change and might then query the producer for more information, if need be.
Event-Carried State Transfer
This pattern is used when one wants to update consumers in such a way that they do not need to contact the producer back to request further information. For example, a Customer service might publish events whenever customers change their addresses with events containing the changed data. An Order service might then update its own copy of Customer data so that it can use the new address to process new orders.
Claim Check
A messaging-based architecture must be able to transmit large messages (for example image or sound files, large text documents, or any kind of binary data). Most of the message brokers are optimised to handle huge quantities of small messages rather than large ones. Also, most messaging platforms such as AWS SQS or Azure Event Hubs impose limits on message size.
A common workaround is to store the entire message payload into an external service, such as a blob store or a database. Get the reference to the stored payload, and send just that reference to the message broker. The reference acts like a claim check used to retrieve a piece of luggage. Consumers interested in processing that specific message can use the obtained reference to retrieve the payload.
Event Sourcing
Instead of storing just the current state of the domain data in the system, we use an append-only event store to record the full series of changes on that data. That is, whenever one makes a change to the domain state, that change is recorded as an event so that they can rebuild the system state by reprocessing the events at any time in the future. The event store becomes the system’s source of truth, and the system state is purely derived from it. The classic example of event sourcing is a bank ledger with deposits and withdraws. For programmers, another good example is a version-control system such as Git.
Despite being a bit unusual pattern, it has increasingly been adopted in microservice architectures for being one possible solution to a challenging problem: upon receiving a command, a service typically needs to atomically update an aggregate in its database and publish an event. The database update and sending of the message must be atomic in order to ensure data consistency. However, it is neither usually viable nor desirable to use distributed transactions spanning databases and message brokers when implementing microservices.
Event sourcing can be a solution to this problem as an event store is sort of a hybrid between a database of events and a message broker. It provides an API that enables services to store, retrieve and subscribe to events. When a service saves an event in the event store, it is delivered to all interested subscribers.
Choreography vs Orchestration
Event-based systems are nice because they promote low coupling, either in space (location transparency) or in time (temporal decoupling). However, it can become difficult to make sense of the overall context if there is a logical flow (business transaction) that spans various event notifications. Although each individual component is simpler, the complexity of the interactions is higher. This is made worse because these interactions are not written in the source code and hence the resulting flow is not explicit in any codebase. The only way to figure out this flow would be observing a live system. Debugging and maintenance get harder. It is very easy to make nicely decoupled systems with event-based choreography, without realising that we are losing sight of larger-scale flows of events.
Orchestration was the traditional way of coordinating interactions between different services in Service-Oriented Architecture (SOA). There was typically one “orchestrator” (aka Process Manager) that controls the overall service interactions
A hybrid approach is to use an orchestrator/coordinator to drive the flow between services. The coordinator produces commands to the event stream and the respective microservices consume them, perform some processing, and then produce more events to the event stream. In the example below, services A and C react at the same time. The coordinator consumes events from the event stream and may react driving further steps.
Here I would like to mention Spring Cloud Data Flow as an awesome implementation of the previous coordinator pattern.
Spring Cloud Data Flow provides tools to create complex topologies for streaming and batch data pipelines. The data pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks.
Saga
A Saga is used to implement business transactions that span multiple microservices. It is a sequence of local transactions, each within a single microservice, and publishes a message or event to trigger the next required local transaction. If a local transaction fails then the Saga needs to execute a series of compensating transactions that undo the local transactions already committed.
And why is all this needed? Well, as each microservice has its own database, a Saga is used to maintain overall data consistency (business rules) without resorting to distributed transactions. That is why some call Sagas as eventually consistent distributed transactions.
Leveraging our knowledge from the previous session, we conclude that there are two ways of coordinating sagas:
- Choreography — each local transaction publishes domain events that trigger local transactions in other services
- Orchestration — an orchestrator/coordinator tells the participants what local transactions to execute
To Be Continued
We have covered the major design patterns commonly used to build event-driven systems. Hopefully you found it useful. Next time, I am going to share a list of best practices when adopting event-driven architectures. Stay tuned.