Thursday, September 18, 2014

Composing Multiple SubDomains

Over the past few weeks I have been bothered by the question: what is the best practice when a client consumes multiple services (DDD subdomains) into a single view?

Is it better to implement a front service that receives a single call from the client, sends requests to each relevant back-end service, and aggregates the responses back to the client?



Or would it be best to have the client send requests directly to these services?



Please note that for the sake of simplicity I abstracted the above diagrams and ignored logical (presentation, application, domain) and physical (firewalls, load balancers etc.) tiers irrelevant to this discussion.

IMHO, there isn't a straight answer, only pros and cons that should be considered per design.

Aggregated Front Service
I/O, Roundtrip and Resources
When it comes to an aggregated front service, one of the obvious wins is dramatically reducing client-server round-trips.
And why is that so great?
Remote calls by nature tend to fail from time to time for many reasons: service availability, connectivity issues etc.
When talking about a front API with clients out on the internet it gets even worse, as there are a lot of moving parts on the path from client to server - with each part increasing the probability of a request failing.
Let's say, for example, that a single call has a 99.99% chance of success. When a client needs 3 calls in order to build its final view, we drop to a 99.97% success probability (99.99% × 99.99% × 99.99%). Every remote call added to a client's dependencies increases the probability that the view will fail.
(I should mention that the same concept applies to putting components in series or making synchronous calls between services, as covered in the great Scalability Rules book.)
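The arithmetic above is easy to sketch; here is a minimal Python illustration of how call-success probabilities compound (the 99.99% figure is just the example from the text, not a measured value):

```python
# Success probability of a view composed of n independent remote calls,
# each succeeding with probability p.
def composed_success(p: float, n: int) -> float:
    return p ** n

p = 0.9999  # 99.99% success per call, as in the example above
print(f"{composed_success(p, 1):.4%}")  # 99.9900%
print(f"{composed_success(p, 3):.4%}")  # 99.9700%
```

Note how quickly this erodes: at 10 dependent calls per view you are already down to about 99.9%.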

Another reason to implement such a service is the limited resources of browsers and mobile devices.
In an ecosystem built from a large number of services, clients may have a hard time sending many parallel requests, as their resources are limited. Another case is when the aggregation requires heavy computational resources (data processing, manipulation etc.) that only the back-end servers can supply.
Furthermore, when it comes to I/O, buffering/aggregating I/O operations improves resource utilization (since every operation carries overhead), thus increasing server throughput - which is a very big win!
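To make the pattern concrete, here is a minimal Python (asyncio) sketch of an aggregated front service: one client request fans out to several back-end services in parallel and comes back as a single combined response. The service names and the fetch stub are hypothetical placeholders, not real services:

```python
import asyncio

async def fetch(service: str) -> dict:
    # Stands in for a real remote call to a back-end service.
    await asyncio.sleep(0)
    return {service: f"data from {service}"}

async def aggregate_view() -> dict:
    # Fan out to all back ends in parallel, then merge into one response.
    parts = await asyncio.gather(fetch("social"), fetch("trading"), fetch("stats"))
    view = {}
    for part in parts:
        view.update(part)
    return view

print(asyncio.run(aggregate_view()))
```

The client makes one round-trip; the front service absorbs the fan-out on the server side, where resources are cheaper.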

Design
In cases where some business logic or data manipulation is needed in order to answer a client's request, a front service might also be good from a design perspective. Business logic dedicated to composing multiple services might imply you are looking at a whole new subdomain/SOA service with its own requirements, resources and entities (as described in Vaughn Vernon's Implementing Domain-Driven Design).
A good example is eToro's stats service, which collects data from multiple services (social, trading etc.) and builds modules and a recommendation system on top of it. This service is a whole new world with its own ubiquitous language, entities, aggregates etc.

The common denominator
All the cases above share one assumption: that we always need to go to each and every back-end service in order to build our view. This is very important to note, as will be explained later.

Direct Client Calls
Caching
The above-mentioned Scalability Rules book emphasizes the importance of caching when it comes to scalability - or, as Abbott and Fisher call their set of caching rules: Use Caching Aggressively!

When it comes to front-service scalability, some of the most common and important tools you should leverage are reverse proxies and Content Delivery Networks (CDNs). Both of these mechanisms rely on the fact that some responses/resources can be cached - they are valid for some certain amount of time - meaning they are partially or fully static. These tools allow you to cache your responses on a front server that answers some of your traffic, reducing the actual hits and load on your back-end systems.
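The contract with a proxy or CDN is usually expressed through Cache-Control headers: a response declares how cacheable it is, and the edge honors that. A tiny Python sketch of the idea - the paths and TTL values here are illustrative assumptions, not a real routing table:

```python
def cache_headers(path: str) -> dict:
    # Static (or semi-static) resources may be served from a proxy/CDN edge;
    # dynamic, per-user responses must always reach the back end.
    static_ttls = {"/catalog": 3600, "/logo.png": 86400}  # seconds
    ttl = static_ttls.get(path)
    if ttl is None:
        return {"Cache-Control": "no-store"}
    return {"Cache-Control": f"public, max-age={ttl}"}

print(cache_headers("/catalog"))          # cacheable at the edge for an hour
print(cache_headers("/account/balance"))  # always hits the back end
```

A front service that blends both kinds of content into one response can only ever emit the second kind of header.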



When a service aggregates the responses of multiple services, ignoring whether each response is dynamic or static, you are throwing away the opportunity to leverage these important scalability tools!

Think of it this way: what if some of your heavy resources/calls could be geographically cached for your clients, so that when a client requests a resource he actually gets the response from a server a few miles away? You both significantly increase the probability of call success and gain scalability for your system!

Coupling and Dependencies
When the client side is responsible for the composition, you are basically creating a back end that doesn't have to be aware of client views - all it needs to do is properly reflect its own state and operations. Creating a middle tier that composes these services makes another part of your system aware of that composition, which creates another coupling.
Furthermore, as mentioned in Vernon's book, there are systems whose client view is basically a composition of "widgets" from different subdomains/services (portal-portlet style) - like Amazon's checkout page, where you see payment data side by side with shipping and book recommendations. In this post Udi Dahan explains how this kind of client composition clears up your service boundaries - which is an enormous design achievement!

Last, another level of dependency is maintenance. The extra tier is one more component that must be maintained, sometimes across teams (as usually happens when teams are divided by subdomain), which often implies painful cross-team synchronization.


Summary
As I said, it's really hard to say right or wrong, as every project has its constraints.
IMHO, generally speaking, I would go for client-side composition in most cases in order to increase scalability and simplicity and reduce coupling. Only when resources are lacking, or when there is substantial business logic that you wouldn't want the client side to implement, would I go with the front aggregation service.

Udi Dahan recorded a great video post on composing multiple services using IT/OPS component.


Wednesday, September 10, 2014

Serialization Benchmark

A few months ago, as part of eToro's architects forum, I took on the task of running a performance benchmark on serializers commonly used in the industry.

The reason I had to redo the benchmark rather than just look it up online is that I couldn't find a benchmark that includes all the serializers I wanted to test - which is exactly why I'm posting this online; maybe it will save someone else's time.

As for the trade-offs we considered when choosing the best serializer(s) for our development teams, we looked at the following:

  • (De)Serialization performance
  • Payload size
  • Cross-platform ability
  • Readability of the serialized result (eases debugging and reading logs).


So, to get a wide view of things, I tested both string-based and binary serializers.
In the string section:


In the binary section:

  • Protobuf-net
  • DataContractSerializer (set up to binary)
  • XMLSerializer (set up to binary) 
  • BinaryFormatter (just for reference, as it isn't cross-platform and is known for its poor performance due to its extensive use of reflection).


In order to choose the right metrics, I looked at Sasha Goldshtein's awesome book, which includes a very similar benchmark.

The Test
In order to run the benchmark I used the top 100K Sales.SalesOrderDetail rows from AdventureWorks.
I loaded these rows into this POCO:


The reason for the BaseMessageClass and the inheritance is our common use case, where one endpoint in the code deserializes the bytes into a base class and then dispatches it according to its actual type.
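That dispatch-by-actual-type pattern can be sketched in a few lines of Python (the class and field names below are hypothetical stand-ins; the actual POCO mirrored Sales.SalesOrderDetail):

```python
# One endpoint receives a payload as a base message, then dispatches
# according to the message's actual (concrete) type.
class BaseMessage:
    pass

class SalesOrderDetailMessage(BaseMessage):
    def __init__(self, order_id: int, qty: int):
        self.order_id, self.qty = order_id, qty

def dispatch(msg: BaseMessage) -> str:
    handlers = {
        SalesOrderDetailMessage: lambda m: f"handled order {m.order_id} x{m.qty}",
    }
    return handlers[type(msg)](msg)

print(dispatch(SalesOrderDetailMessage(42, 3)))  # handled order 42 x3
```

The point for the benchmark is that the serializer must round-trip the concrete type through a base-class-typed endpoint, which is exactly what the inheritance in the POCO exercises.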

For every serializer I simply measured, using Stopwatch, the time to serially serialize 100K objects into a memory stream, and then measured the time to serially deserialize the output of the previous stage back into the base message.

Between each test I ran GC.Collect() in order to avoid GC during the measurements.

I also tried to measure every single serialization and deserialization operation, but since the fast serializers mostly finished in under a millisecond, I got no meaningful per-operation results.
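The original harness was C# (Stopwatch, GC.Collect()). An analogous sketch of the same measurement loop in Python, using json and pickle as stand-in serializers, would look roughly like this:

```python
import gc
import json
import pickle
import time

def bench(dumps, loads, objects):
    # Collect garbage before each phase so GC pauses don't skew the timing.
    gc.collect()
    start = time.perf_counter()
    payloads = [dumps(o) for o in objects]      # serially serialize
    serialize_secs = time.perf_counter() - start

    gc.collect()
    start = time.perf_counter()
    for p in payloads:
        loads(p)                                # serially deserialize
    deserialize_secs = time.perf_counter() - start
    return serialize_secs, deserialize_secs, sum(len(p) for p in payloads)

rows = [{"order_id": i, "qty": i % 10} for i in range(100_000)]
for name, dumps, loads in [("json", lambda o: json.dumps(o).encode(), json.loads),
                           ("pickle", pickle.dumps, pickle.loads)]:
    ser, deser, size = bench(dumps, loads, rows)
    print(f"{name}: serialize {ser:.3f}s, deserialize {deser:.3f}s, {size} bytes")
```

Timing the whole batch rather than each operation is what makes the fast serializers measurable at all, as noted above.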

The Results





And the Winners
When it comes to performance and payload size, Protobuf-net is without a doubt the best serializer, by an order of magnitude.

If you insist on having a readable payload (or insist on JSON, as it is pretty much the standard these days for web APIs) and wish to compromise as little as you can on performance, I guess ServiceStack's JsonSerializer is a good choice, with decent performance and a reasonable payload size.

Sunday, September 7, 2014

Static Ctor


Hi there!

This is my first blog post... (woohoo!!)

My name is Moti. I'm 28 years old, a father of a lovely three-year-old girl, married to a beautiful wife, and pretty much feeling blessed - so first of all, thank you god!

A friend of mine, David Virtser, suggested I start my own blog, in which I would post some of the things, thoughts, bits and bytes I deal with in my software engineering practice that I personally think are worth sharing...

I've been practicing software engineering for the past 9 years and enjoying every second of it. I see myself as a very passionate developer - I spend a lot of my free time reading software books and articles, watching lectures etc. This is pretty much my biggest hobby - I can't really explain how it turned out this way, but I guess I got addicted to it.

I'm mostly interested in software architecture (SOA and distributed systems), design (DDD, patterns and best practices) and performance (especially, but not only, in a Microsoft-oriented environment) - all from the enterprise applications perspective.

I'm not really sure how this blog will eventually turn out, but as I see it now I will mainly post, for myself, the stuff I will want to look up later.

Most of the things I post will rely, one way or another, on people smarter and more experienced than me, such as the great Sasha Goldshtein, Jeffrey Richter (author of the best book a .NET developer can get), DDD guru Vaughn Vernon, SOA and distributed systems mentor Udi Dahan, and many others I try my best to learn as much as I can from. (This is a good opportunity to thank them for all the times they took the time to answer my annoying emails and questions.)

I hope this blog will at least help me to clear my thoughts a little...

Good luck to us all...