I want to share my practical learning experience with Event Sourcing. There are numerous resources online talking about Event Sourcing, so you might wonder what is new in this post. I promise there are no major breakthroughs here; the main goal is to present ES in its simplest form. One of the issues with most online resources is that ES is described in the context of CQRS and DDD, which is daunting for newcomers. Don't get me wrong, CQRS and DDD are great patterns and design choices packed with merits, but ES is still feasible without them. So for now, please forget about these auxiliary patterns for a bit; I will keep my promise to simplify ES by excluding them.
Ensures that all changes to application state are stored as a sequence of events 
State transitions are an important part of our problem space and should be modeled within our domain 
In other words, the main difference compared to a traditional CRUD system is that instead of persisting only the most recent state, we persist all the actions that describe a change to the state. This gives us the ability to rebuild the state at any time by reading the persisted events from the single source of truth.
A mainstream system that utilises this pattern is version control. It only keeps track of committed changes, and all other information displayed to the user is derived dynamically. The image below depicts a system that keeps track of a user's balance without the event sourcing pattern:
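For contrast, here is a minimal sketch (with illustrative names) of the same balance tracked *with* event sourcing: rather than storing the current balance in a column, we store every change as an event and derive the balance by replaying the log.

```python
from dataclasses import dataclass

# Hypothetical event for a balance-tracking system: each deposit or
# withdrawal is recorded as an immutable event instead of overwriting state.
@dataclass(frozen=True)
class BalanceChanged:
    amount: int  # positive for deposits, negative for withdrawals

# The event log is the single source of truth.
events = [BalanceChanged(100), BalanceChanged(-30), BalanceChanged(50)]

# Current state is derived by replaying the log, not read from a stored value.
def current_balance(events):
    return sum(e.amount for e in events)

print(current_balance(events))  # 120
```

Note that deleting or correcting a balance is itself just another event appended to the log; the history is never rewritten.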
The next two sections, in my humble opinion, are very important to get a grasp of so you can start your development journey on an event-driven system.
The common components across different ES implementations are:
- Event
  - An action that took place in the past
  - Carries relevant information, e.g. timestamp, type, payload, or instance id
  - Its origin can be internal (e.g. UI interactions) or external (e.g. a data stream) to the system
- Event Store
  - All events are persisted in an event store, e.g. a relational or non-relational database
  - The event store is the single source of truth
  - Events are added by an append operation only and are kept in chronological order
  - Events are immutable, i.e. they should not change once persisted
  - The event store or another service can be responsible for publishing events
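The two components above can be sketched in a few lines. This is a minimal, in-memory illustration under assumed names (`Event`, `EventStore`), not a production design; here the store itself also plays the publisher role, which, as noted, could equally be a separate service.

```python
import time
import uuid
from dataclasses import dataclass, field

# An event: an immutable record of something that happened in the past.
@dataclass(frozen=True)
class Event:
    type: str       # e.g. "LapTriggered"
    payload: dict   # relevant information
    timestamp: float = field(default_factory=time.time)
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))

class EventStore:
    """Append-only, chronologically ordered store of immutable events."""
    def __init__(self):
        self._log = []          # could be a relational or NoSQL table
        self._subscribers = []

    def append(self, event):
        self._log.append(event)             # append is the only write operation
        for handler in self._subscribers:   # store doubles as publisher here
            handler(event)

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def read(self, since=0.0):
        # Events come back in the order they were appended.
        return [e for e in self._log if e.timestamp >= since]
```

Notice there is no `update` or `delete`: corrections are modelled as new events appended to the log.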
Of course, in order to implement a fully working system there are other parts involved, like message topology (queue vs pub/sub), which defines how events are published and consumed, database deployment and model choices, etc., but these are best defined based on the problem being solved.
Obviously we need to describe how and when the moving parts dance together. Below are common operations you will end up needing once you start development.
- Since the event store is the single source of truth, the business logic will constantly query it
- Queries usually become complex and expensive as the event store grows
- Snapshots capture the state of the system for reuse, instead of rebuilding state on every query
- They massively improve query performance because they reduce the number of events to replay when rebuilding state
- Snapshot criteria? How often should you create a snapshot? The answer is domain dependent: it can be every night, every 1 GB, or any other system threshold/condition
- Rebuilding the state of the application by replaying the events from any point in time
- Use projections to build the read model (this usually involves complex querying). The read model can act as a cache to optimise subsequent queries (be careful with cache invalidation)
- This leads to an eventually consistent design, due to the latency between the instant an event arrives at the system and when it is persisted, published, and processed by business rules to be ready for consumption
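Snapshotting and replay fit together as sketched below. This is an illustrative example under assumed names: state is rebuilt by folding events through an `apply` function, and a snapshot simply records the state plus the log position it corresponds to, so only later events need replaying.

```python
# Illustrative reducer: here events carry a numeric "delta" payload.
def apply(state, event):
    return state + event["delta"]

def rebuild(events, snapshot=None):
    """Rebuild state by replaying events, optionally starting from a snapshot."""
    state = snapshot["state"] if snapshot else 0
    start = snapshot["position"] if snapshot else 0
    for event in events[start:]:
        state = apply(state, event)
    return state

events = [{"delta": 5}, {"delta": -2}, {"delta": 7}]
snapshot = {"state": 3, "position": 2}   # state after the first two events

assert rebuild(events) == 10             # full replay
assert rebuild(events, snapshot) == 10   # same result, only one event replayed
```

The snapshot is purely an optimisation: deleting it loses nothing, since the event log can always rebuild the same state from scratch.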
The next figure depicts how the components and techniques are glued together in practice. It's a bit more complex than the previous figure, but at this stage you should be able to dive a bit deeper. Trust me, once you get into the mindset of ES and start thinking through your design, you will find yourself naturally heading in this direction.
To elaborate on the above workflow and crystallise what we have talked about so far, let's consider the following scenario, where the ES pattern is chosen for a Racing Car Telemetry system that visualises lap times. More details are provided in the next post.
- A car is driving around the track. Once it crosses the start line, a LapTriggered event is sent.
- Our system receives the LapTriggered event with a timestamp (this is an external event arriving from 3rd-party infrastructure)
- The event gets persisted to the EventStore in an append operation
- The event gets published to its subscribers via the chosen messaging topology
- Upon arrival, the subscriber runs a query against the event store database to retrieve all LapTriggered events since the last EngineStarted event
- From every two consecutive lap events we can deduce last lap duration by subtracting their timestamps
- Lap1 Duration = LapTriggered2 – LapTriggered1
- A lap model is materialised into the read database
- A UI consumes the materialised views to display a list of laps completed by a car
- A snapshot is created on every EngineStopped event marking the end of a run
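The lap-duration step above can be sketched as follows. This is an illustrative helper (the name `lap_durations` is my own), assuming timestamps are recorded in milliseconds; it pairs each LapTriggered timestamp with the next one and subtracts, exactly as in `Lap1 Duration = LapTriggered2 - LapTriggered1`.

```python
def lap_durations(lap_triggered_timestamps):
    """Duration of each lap, derived from consecutive LapTriggered timestamps."""
    ts = lap_triggered_timestamps
    # zip pairs each timestamp with its successor: (t1, t2), (t2, t3), ...
    return [later - earlier for earlier, later in zip(ts, ts[1:])]

# Timestamps in milliseconds of LapTriggered events since the last EngineStarted
timestamps = [0, 92400, 183100, 274900]
print(lap_durations(timestamps))  # [92400, 90700, 91800]
```

Note the durations are never stored as facts of their own: they are a read-side projection, derived on demand (or materialised) from the raw events.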
NOTE: By the way, what we described above is pretty much the CQRS & ES patterns. I told you, things will naturally come together.
Ok, I am getting all of the above, but why and when should I use Event Sourcing? What's wrong with more traditional approaches? Some of the key benefits, but not the only ones, are:
- Audit trailing capabilities required by the domain
- Provides excellent insights during troubleshooting
- Most of the time the event store (database) deals only with append operations because events are immutable. Hence there is low data-access contention, which leads to better performance and scalability.
- Utilise an in-memory architecture (The LMAX Architecture): being able to derive the latest state at any time allows us to design a system with minimal reliance on the persistence layer (an IO-bound bottleneck). High throughput is achieved because all the number-crunching logic takes place in memory after bootstrapping the application state from a snapshot database, and different consumers can consume data at different rates.
- Flexible decoupled architecture
- Once you start designing and developing an ES-based system, you will naturally follow the path of creating modularised components; in other words, it suits the microservices architecture, where each component is autonomous to a degree
I really hope you have found the information in this post useful so far. I say "so far" because there is still more to cover on the matter. Remember, I have tried to keep my promise from the outset by keeping things simple, but I have to be pragmatic and draw your attention to the following issues you're going to face:
- How to deal with invalid events and who should validate them?
- How to deal with out-of-order or missing events?
- How can event compensation help me here?
- Can the infrastructure guarantee message delivery?
- What about adding a sequence number to the event message?
- How to deal with duplicate events?
- How can idempotency help me here? Or a duplicate-message detection mechanism?
- How to deal with event version updates? E.g. a new requirement to change or add fields to the event payload, or the event becoming redundant
- Versioning in an Event Sourced System, Greg Young
- Should I add a version to the event object?
- Maybe add a stage to the publish pipeline that converts every message to the latest version?
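That last idea, often called "upcasting", can be sketched as below. This is a simplified illustration under assumed names and an assumed schema change (say v2 of LapTriggered added a `unit` field): each stored event is passed through a chain of converters until it reaches the latest version, so handlers only ever see the current schema while the immutable log keeps the original versions.

```python
# Hypothetical schema change: v2 added a "unit" field to the payload.
def upcast_v1_to_v2(event):
    event = {**event, "version": 2}
    event["payload"] = {**event["payload"], "unit": "ms"}  # default for old events
    return event

UPCASTERS = {1: upcast_v1_to_v2}  # map: from-version -> converter
LATEST = 2

def upcast(event):
    """Convert a stored event to the latest schema version, step by step."""
    while event.get("version", 1) < LATEST:
        event = UPCASTERS[event.get("version", 1)](event)
    return event

old = {"type": "LapTriggered", "version": 1, "payload": {"duration": 92400}}
new = upcast(old)
assert new["version"] == 2
assert new["payload"] == {"duration": 92400, "unit": "ms"}
```

Because events in the store are immutable, the conversion happens on the read/publish path; the persisted history is never rewritten.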
To better crystallise what we covered in this post, it's time to get our hands dirty and try to build a demo project using the ES pattern. The next post will show that.