Best Practices in Moving to Event-Driven Architectures
How to increase chances of success and avoid common pitfalls
This is the last of three instalments of my series on event-driven architectures and reactive systems. Here are part one and part two.
Event-driven architecture (EDA) is experiencing a resurgence in interest with the rise of microservices. According to Wikipedia,
Event-driven architecture (EDA) is a software architecture paradigm promoting the production, detection, consumption of, and reaction to events.
In this article, I am going to share a set of best practices that I have learnt over years of working in enterprise application development and integration. They have proved their value time and again.
Here they are in no particular order:
1. Events are immutable
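By definition, events represent facts that have already happened, so they must never be changed once published. That is also why they should be named with verbs in the past tense (for example, OrderPlaced rather than PlaceOrder).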
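As a minimal sketch of this idea, an event can be modelled as a frozen dataclass. The `OrderPlaced` name and all of its fields are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from uuid import UUID, uuid4


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event: a fact that already happened, hence the past tense."""
    order_id: str
    total_cents: int
    event_id: UUID
    occurred_at: datetime


event = OrderPlaced(
    order_id="order-1",
    total_cents=4200,
    event_id=uuid4(),
    occurred_at=datetime.now(timezone.utc),
)

try:
    event.total_cents = 0  # any attempt to rewrite history is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```

If a fact turns out to be wrong, the usual remedy is to publish a new, compensating event rather than to edit the old one.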
2. Communicate events asynchronously
Use some messaging infrastructure/fabric. A message broker with pub/sub capabilities is required, whether on-prem or managed in the cloud.
3. Producers must guarantee “at least once” delivery
Event producers and message brokers must work together to make sure that no message is accidentally lost.
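One common way to get there is to retry publishing until the broker acknowledges the message. The sketch below assumes a hypothetical in-memory broker (`LossyAckBroker`) whose acknowledgements are sometimes lost; it is not the API of any real broker.

```python
class LossyAckBroker:
    """Hypothetical broker: stores every message but loses the first few acks."""

    def __init__(self, lost_acks: int) -> None:
        self.lost_acks = lost_acks
        self.stored: list[str] = []

    def publish(self, message: str) -> bool:
        self.stored.append(message)  # the message itself arrives safely
        if self.lost_acks > 0:
            self.lost_acks -= 1
            return False  # the acknowledgement is lost on the way back
        return True


def publish_at_least_once(broker: LossyAckBroker, message: str, max_attempts: int = 5) -> int:
    """Retry until the broker acknowledges; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if broker.publish(message):
            return attempt
    raise RuntimeError(f"not acknowledged after {max_attempts} attempts")


broker = LossyAckBroker(lost_acks=2)
attempts = publish_at_least_once(broker, "OrderPlaced:order-1")
duplicates = len(broker.stored) - 1
```

Note the side effect: nothing is lost, but the broker now holds the same message more than once, so consumers have to cope with duplicates.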
4. Consumers must guarantee “at most once” delivery
Before blindly counting on your message broker to filter out duplicated messages, make sure it can actually do so for your particular use case (e.g. over a potentially long time window). Otherwise, event consumers must either identify and discard retransmitted events (deduplication) or process events in an idempotent fashion.
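Consumer-side deduplication can be sketched as follows, assuming every event carries a unique identifier. The `AccountProjection` class and the event ids are hypothetical.

```python
class AccountProjection:
    """Hypothetical consumer that applies each event id at most once."""

    def __init__(self) -> None:
        self.balance = 0
        self._seen: set[str] = set()

    def handle(self, event_id: str, amount: int) -> bool:
        """Apply the event unless it was seen before; return True if applied."""
        if event_id in self._seen:
            return False  # retransmission: discard
        self._seen.add(event_id)
        self.balance += amount
        return True


projection = AccountProjection()
deliveries = [("e1", 100), ("e2", 50), ("e1", 100)]  # "e1" is delivered twice
applied = [projection.handle(event_id, amount) for event_id, amount in deliveries]
```

In a real consumer, the set of seen ids and the state change would need to be persisted together (for instance in one database transaction); otherwise a crash between the two steps brings the duplicate problem back.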
5. Consumers must be able to handle out-of-order messages
Depending on your use case, ordered message delivery might be required. Message brokers can often ensure that messages are delivered in the order in which they are received. However, this can be expensive to support and at times gives a false sense of security. In the end, nondeterministic inputs will lead to nondeterministic outputs.
Event consumers must be designed so that message ordering can be relaxed and they still can eventually achieve a consistent view of the world. The overhead required to relax the ordering is nominal and in most cases is significantly less than enforcing ordering in the message broker.
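One possible way to relax ordering, sketched here under two assumptions that are not in the text above: each event carries a per-entity version number, and each event carries the full new value rather than a delta. The consumer then simply ignores anything stale. `CustomerView` is a hypothetical name.

```python
from itertools import permutations


class CustomerView:
    """Hypothetical read model that tolerates any delivery order."""

    def __init__(self) -> None:
        self.version = 0
        self.email = None

    def apply(self, version: int, email: str) -> None:
        if version <= self.version:
            return  # stale or duplicated event: ignore it
        self.version = version
        self.email = email


events = [(1, "a@example.com"), (2, "b@example.com"), (3, "c@example.com")]

# Replay the same events in every possible order and collect the end states.
final_states = set()
for ordering in permutations(events):
    view = CustomerView()
    for version, email in ordering:
        view.apply(version, email)
    final_states.add((view.version, view.email))
```

All six orderings converge on the same final state. Events that carry deltas instead of full values need a different strategy, such as buffering until the missing versions arrive.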
6. Commands are different from events
Commands are messages that express intent (usually modelled as imperative verbs). They can be transmitted asynchronously (fire and forget) or synchronously (request-response). The synchronous option is particularly important when a response is required immediately, e.g. because the command might be rejected.
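The difference can be sketched like this: a command may be rejected, while an event is only produced once the decision has been made. All names (`WithdrawMoney`, `MoneyWithdrawn`, `handle_withdraw`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WithdrawMoney:
    """Command: an imperative request that may still be rejected."""
    account_id: str
    amount: int


@dataclass(frozen=True)
class MoneyWithdrawn:
    """Event: a past-tense fact that can no longer be rejected."""
    account_id: str
    amount: int


class CommandRejected(Exception):
    pass


def handle_withdraw(balance: int, command: WithdrawMoney) -> MoneyWithdrawn:
    """Validate the command; emit an event only if it is accepted."""
    if command.amount <= 0 or command.amount > balance:
        raise CommandRejected("invalid amount or insufficient funds")
    return MoneyWithdrawn(command.account_id, command.amount)


accepted = handle_withdraw(100, WithdrawMoney("acc-1", 30))
```

A caller that needs to know about a rejection straight away would invoke such a handler synchronously; the resulting event can still be published asynchronously.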
7. Define schemas for all events
In other words, adopt a contract-first approach. Be mindful when designing your events: they are as important as any other data stored in your databases. For example, event granularity might affect the architecture’s overall performance.
8. Define a versioning policy from day 1
Your business requirements will inevitably change, and so will the events exchanged across the architecture. Some useful tips are:
- Do not remove fields; deprecate them
- Provide a default value when introducing a new field
- Introduce a new event should a breaking change be needed
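The first two tips can be sketched with a tolerant decoder, assuming JSON-encoded events. The `CustomerRegistered` event, its fields, and the `decode` helper are hypothetical.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CustomerRegistered:
    customer_id: str
    email: str
    locale: str = "en"  # field added later, so it ships with a default value


def decode(payload: str) -> CustomerRegistered:
    """Build the event from JSON, ignoring fields this reader does not know."""
    raw = json.loads(payload)
    known = {f.name for f in fields(CustomerRegistered)}
    return CustomerRegistered(**{k: v for k, v in raw.items() if k in known})


# An event written before "locale" existed still decodes.
old = decode('{"customer_id": "c-1", "email": "a@example.com"}')

# A payload that still carries a field this reader has dropped ("fax") decodes too.
new = decode('{"customer_id": "c-2", "email": "b@example.com", "locale": "de", "fax": "n/a"}')
```

Old and new producers can then coexist with old and new consumers, which is what makes a non-breaking change non-breaking.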
9. Avoid Upcasting at all costs
Upcasting an event means transforming it from its original structure into a newer one when it is read. In my opinion, upcasters add technical debt. Some cons:
- The in-memory view of the event stream does not match the persisted state
- More moving parts; serialization can break
- They need to be maintained indefinitely and the number of upcasters does not decrease in general
- Performance considerations (depending on the complexity of the upcasting)
- Increased testing complexity (there are multiple combinations)
- Increased system complexity (merging or splitting events, augmentation with data from other sources, etc.)
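To make the cons above concrete, here is a minimal sketch of what an upcaster chain tends to look like. The event versions, field names, and the `"unknown"` placeholder are all hypothetical.

```python
def upcast_v1_to_v2(event: dict) -> dict:
    # Hypothetical change: v2 split "name" into first and last name.
    first, _, last = event["name"].partition(" ")
    return {"version": 2, "first_name": first, "last_name": last}


def upcast_v2_to_v3(event: dict) -> dict:
    # Hypothetical change: v3 requires a country, which old events never
    # recorded, so the upcaster has to invent a value.
    return {**event, "version": 3, "country": "unknown"}


# One upcaster per historical version; entries can never be removed while
# events of that version remain in the store.
UPCASTERS = {1: upcast_v1_to_v2, 2: upcast_v2_to_v3}


def upcast(event: dict) -> dict:
    """Run the event through every upcaster until it reaches the latest version."""
    while event["version"] in UPCASTERS:
        event = UPCASTERS[event["version"]](event)
    return event


stored = {"version": 1, "name": "Ada Lovelace"}
in_memory = upcast(stored)
```

Even in this tiny example, what the application sees no longer matches what was persisted, and every new version adds another function that has to be tested against all older ones.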
10. Segregate event streams
Each service/aggregate owns its event stream.
11. Event Sourcing is not an architectural pattern, Event-Driven is
Event Sourcing is not even mandatory. If adopted, use it wisely as a design pattern where it makes sense rather than everywhere.
12. Event Sourcing does not replace auditing, if it is required
Audit trails should include activities generated by users, by applications, and by the runtime environment itself. Auditing is supposed to allow administrators to answer the following questions:
- what happened?
- when did it happen?
- who initiated it?
- on what did it happen?
- where was it observed?
- from where was it initiated?
- to where was it going?
13. Use a proper event store if implementing Event Sourcing
Event stores are not just message brokers. In particular, Kafka is not an event store, as it does not support:
- loading events for a specific entity
- optimistic concurrency control which avoids data races due to concurrent requests against the same entity
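The two capabilities listed above can be sketched with a toy in-memory store. `InMemoryEventStore` and its methods are hypothetical and do not mirror the API of any particular product.

```python
class ConcurrencyError(Exception):
    pass


class InMemoryEventStore:
    """Hypothetical event store keeping one stream per entity."""

    def __init__(self) -> None:
        self._streams: dict[str, list[str]] = {}

    def load(self, entity_id: str) -> list[str]:
        """Return only the events of the given entity, in append order."""
        return list(self._streams.get(entity_id, []))

    def append(self, entity_id: str, expected_version: int, new_events: list[str]) -> int:
        """Append only if nobody else has written since the caller loaded."""
        stream = self._streams.setdefault(entity_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(stream)}"
            )
        stream.extend(new_events)
        return len(stream)


store = InMemoryEventStore()

# Two writers both load the (empty) stream of order-1 at version 0.
version_after_first = store.append("order-1", 0, ["OrderPlaced"])

try:
    store.append("order-1", 0, ["OrderCancelled"])  # stale expected version
    second_writer_won = True
except ConcurrencyError:
    second_writer_won = False
```

The losing writer is expected to reload the stream, re-evaluate its decision against the new state, and try again.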
14. Use DDD Aggregates
Consider your domain aggregates when modelling your architectural components/services. They cluster entities and value objects and define boundaries that govern transactions and distribution. An aggregate must not outgrow its service.
15. There must not be transactions that span events
Events signal state transitions in a distributed system. That means events must leave a service/aggregate in a consistent state after being applied.
16. Embrace eventual consistency
Data integrity and consistency exist only within aggregates. Across boundaries, handle updates asynchronously.
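As a rough single-process sketch of this idea, assume an ordering aggregate and an inventory aggregate that communicate only through a queue of events. The queue stands in for a message broker, and all names and quantities are hypothetical.

```python
from collections import deque

# Stand-in for a message broker between the two aggregates.
pending_events = deque()

orders = {}             # state owned by the hypothetical ordering aggregate
stock = {"sku-1": 10}   # state owned by the hypothetical inventory aggregate


def place_order(order_id: str, sku: str, quantity: int) -> None:
    """Update the ordering aggregate only, then announce what happened."""
    orders[order_id] = {"sku": sku, "quantity": quantity}
    pending_events.append(("OrderPlaced", sku, quantity))


def process_inventory_events() -> None:
    """Runs later and independently: inventory catches up with the events."""
    while pending_events:
        _, sku, quantity = pending_events.popleft()
        stock[sku] -= quantity


place_order("order-1", "sku-1", 3)
stock_before = stock["sku-1"]  # inventory has not caught up yet
process_inventory_events()
stock_after = stock["sku-1"]   # now consistent with the order
```

Between the two calls the system as a whole is temporarily inconsistent, yet each aggregate is internally consistent at every point in time.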