Cross-cutting concerns in microservices
Microservices architecture (as described by Martin Fowler here) is very popular, so everyone will come across it sooner or later. Despite the size-focused name, this architecture is about the organization and functionality of the services, not about their size.
We will look at the cross-cutting concerns to keep in mind when working with microservices. These concerns are not strictly specific to the microservices architecture and can appear in other variations such as n-tier and service-oriented architecture, but given the popularity of microservices it makes sense to use them as an example.
When a request needs to go through multiple services to execute a business function or get the results of data processing, there are a few things to keep in mind:
- Importance of correlation IDs
- Correct identification of callers/end users
- Cross-services API dependencies
- Throttling limits, cache expiration times and latency can add up as a request passes through multiple services
Examples:
A trade processing workflow or a financial calculation is a good example of a business request executed by multiple services.
A sample financial calculation would go through a permissions check, metadata loading, resolution of calculation steps, some more metadata loading, the actual computation and delivery of the results.
Tracing and correlation ID
How do you trace and gather metrics at the level of a single business request in a microservices system? That is where correlation IDs and tracing become important.
They can be:
- Provided by the infrastructure. This works well if all components rely on the same infrastructure and protocol to send and receive messages, but this is not always the case.
- Specified by a protocol or convention and implemented in each service, e.g. an X-Correlation-Id HTTP header. In addition to requiring more development and maintenance effort, the ID can also get lost somewhere in the middle of the service call stack because of a developer error or a legacy implementation.
Advice: Correlation IDs are essential for full production use of the system, and it is best to lay out the convention and rules for correlating requests from the beginning.
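Below is a minimal sketch of the header-based convention, assuming plain HTTP and the Python requests library; the call_downstream helper is illustrative, not a standard API:

```python
import uuid

import requests  # assumed HTTP client; any client works the same way

CORRELATION_HEADER = "X-Correlation-Id"

def get_or_create_correlation_id(incoming_headers):
    # Reuse the caller's ID if present, otherwise start a new trace.
    return incoming_headers.get(CORRELATION_HEADER) or str(uuid.uuid4())

def call_downstream(url, incoming_headers, payload):
    correlation_id = get_or_create_correlation_id(incoming_headers)
    # Always forward the ID so that every hop logs the same value.
    response = requests.post(
        url,
        json=payload,
        headers={CORRELATION_HEADER: correlation_id},
    )
    return correlation_id, response
```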
Identification of callers
Correctly identifying the original caller is somewhat related to request correlation but is also a wider issue from a maintenance perspective.
This problem is best illustrated with a picture.
Without correct instrumentation in place, S5 only sees S4 as its caller and cannot quickly identify the source of issues, in this example S1.
In the picture above, correct tracing of callers would help with troubleshooting, but it is also important for overall system health monitoring and throttling.
Advice: Consider forcing callers to identify themselves from the beginning, for example by issuing tokens or through an application registration process/API.
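A minimal sketch of what caller identification could look like, assuming a simple application registry; the header names and registry are purely illustrative:

```python
# Hypothetical registry of applications and their tokens; in practice this would
# come from a registration API or an identity provider, not a hard-coded dict.
REGISTERED_APPS = {"s1-trading-ui": "token-1", "s2-batch-jobs": "token-2"}

ORIGINATOR_HEADER = "X-Originating-App"
TOKEN_HEADER = "X-App-Token"

def identify_caller(headers: dict) -> str:
    """Return the originating application ID or raise if it cannot be verified."""
    app_id = headers.get(ORIGINATOR_HEADER)
    token = headers.get(TOKEN_HEADER)
    if app_id is None or REGISTERED_APPS.get(app_id) != token:
        raise PermissionError("caller is not registered or the token is invalid")
    return app_id

def forward_identity(incoming_headers: dict) -> dict:
    # Intermediate services (S2..S4) must pass the originator through unchanged,
    # otherwise S5 only ever sees its direct caller.
    return {
        ORIGINATOR_HEADER: incoming_headers[ORIGINATOR_HEADER],
        TOKEN_HEADER: incoming_headers[TOKEN_HEADER],
    }
```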
Cross-services API dependencies
Adding new functionality to a set of services is a challenging task because service API designs tend toward one of two extremes:
- Service APIs are very strict/well defined, and each new piece of data to pass around requires a coordinated change to multiple APIs, potentially by multiple teams.
- APIs are very flexible and easy to extend, e.g. key-value pair style, but this leads to a different set of issues with testability.
In the example above, adding a new field would not be a problem if S4's API had been designed to be extensible from the beginning, but this is not always the case. Often adding new functionality requires a change in multiple services across the stack.
Advice: This comes down to API design practice: create a meaningful initial API with a well-defined extension model.
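The two extremes, and a possible middle ground, can be sketched with request types; the field names below are made up for illustration:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Strict extreme: every new field is a breaking change coordinated across teams.
@dataclass
class CalculationRequestStrict:
    portfolio_id: str
    as_of_date: str

# Flexible extreme: trivial to extend, but hard to validate and test.
@dataclass
class CalculationRequestFlexible:
    parameters: Dict[str, Any] = field(default_factory=dict)

# Middle ground: a stable core plus an explicit, named extension point that can
# be versioned and validated separately from the core fields.
@dataclass
class CalculationRequest:
    portfolio_id: str
    as_of_date: str
    extensions: Dict[str, Any] = field(default_factory=dict)
```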
Throttling and caching side-effects
Systems normally have one or more bottlenecks, and often these bottlenecks are databases (excluding computation-heavy systems).
Database sharding comes with its own challenges, and for some databases sharding is difficult because of the nature of the data.
So two other methods are often implemented for stability and performance:
- Throttling
- Caching
Both can cause unexpected aggregate side-effects when used independently in multiple services.
Throttling adds up
If multiple services in the system implement throttling independently, requests going through these services can be subjected to multiple levels of throttling and take longer than necessary.
With independent throttling, every service rate-limits based on its own metrics and knowledge of the request and has no real visibility into how busy the resources further down the stack are. This can lead to requests waiting unnecessarily at multiple levels.
That's why good throttling setups are centralized, e.g. a shared token bucket.
Advice: Be aware of the system's overall workflow and of throttling at different levels; consider using a single scheduler with centralized throttling.
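As a minimal sketch of a token bucket (kept in-process for brevity; in a real system the bucket state would live in shared storage such as Redis so that all services draw from the same limit):

```python
import time

class TokenBucket:
    """Minimal token bucket: `rate` tokens are added per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to the time elapsed since the last check.
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# One bucket sized for the real bottleneck (e.g. the database) and consulted by
# all services, instead of each service keeping its own independent limit.
db_bucket = TokenBucket(rate=100, capacity=200)
if db_bucket.allow():
    print("proceed with the database call")
else:
    print("throttled: reject or queue the request")
```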
Caching TTL adds up
Similar to throttling, if caching happens in multiple places throughout the stack, the original caller may get unexpectedly stale data because cache expiration times add up across the layers.
Advice: Be aware of where data is cached, especially if data freshness is important and a caller follows a "write then read" request pattern expecting to see its own changes.
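A quick back-of-the-envelope sketch of how TTLs compound; the layer names and numbers are made up:

```python
# Worst case: each layer caches a value just before it expires in the layer below,
# so the maximum staleness seen by the caller is roughly the sum of the TTLs.
ttls_seconds = {
    "S2 edge cache": 60,
    "S3 service cache": 120,
    "S4 data cache": 300,
}

worst_case_staleness = sum(ttls_seconds.values())
print(f"Worst-case staleness: {worst_case_staleness}s")  # 480s, not the 300s a single layer suggests
```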
Latency adds up
Each service added to the stack adds network latency, since it can potentially be running on a different machine or, in the worst case, in a different datacenter. So adding new services to a system may improve throughput but will make a single request slower. This is especially noticeable when a request's processing time is small compared to the network latency.
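To make the effect concrete, here is a tiny calculation with purely illustrative numbers:

```python
# Illustrative numbers only: 5 hops at ~2 ms network latency each add 10 ms,
# which dominates when the actual processing takes about 1 ms per service.
hops = 5
network_latency_ms = 2.0   # per hop, same-datacenter assumption
processing_ms = 1.0        # per service

total_ms = hops * (network_latency_ms + processing_ms)
network_share = hops * network_latency_ms / total_ms
print(f"End-to-end: {total_ms} ms, {network_share:.0%} of it spent on the network")
```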
Advice: Contrary to popular belief, a monolith is not always bad and microservices are not always good. It all depends on the system's use case.
Summary
When using a microservices architecture, a developer should know how throttling and caching are implemented at the system level and make sure that requests can be correlated and traced across services.