In my day-to-day job, I cover almost all facets of the software development lifecycle, from architecting systems right through to delivering and supporting them. The company I work for is small (there are two of us), so we both fulfill a variety of roles.

I’m currently in the very early stage of the design and development of a new project: a cloud-based solution for receiving emails, sending them to a recognition service and then exporting the results to a 3rd party API.

The solution will run in Microsoft Azure, and be built using microservices.

Partly to sense-check myself (readers, you are my honorary rubber duck for the next few posts) and partly to give people a view into the world of designing a system, I’ve decided to chart the whole lifecycle of this project through blog posts.

A couple of things before I get into the nuts and bolts:

  • I’m still learning the best practices and design principles for building microservices. If you’re reading this and see something I could do better, please comment and let me know.
  • I’ve obscured some of the details of the project as this is a system that will be used in real life. Where I can, I’ll keep things as true to real life as possible.

The Case Study

So what are we building? My company implements software built around Optical Character Recognition (OCR).

In a nutshell, we read data from documents and do useful shit with it.

The client we have engaged with receives hundreds of emails per day containing booking details from various suppliers. The booking details could be in attachments or in the body of the email itself.

To quickly summarise the list of our requirements:

  • Receive email contents sent via Cloud Mail In and parse both the data and attachments into storage
  • Send the physical files to the OCR engine for recognition
  • Receive a response from the OCR software containing the booking data
  • Pass the booking data on to the client’s API

For ease, I’m going to leave the specifics of the OCR engine vague. However, if you are interested, we will be using ABBYY Flexicapture Cloud (https://www.abbyy.com/en-us/flexicapture-cloud/).

Initially, we will be dealing with a relatively low volume of emails. But this will ramp up quite quickly once in production. Because of this, the components of the system need to be easily scalable to cope with both an increase and decrease in volumes.

Why Microservices?

Aside from the fact that microservices are the ‘hip’ way of building software at the minute, this kind of cloud-based system fits perfectly with microservice architecture.

Scalability

One of my favorite things about microservices, and the way in which they communicate, is scalability. As the volume of inbound emails increases, I can simply add more instances of a ‘mailhandler’ service to cope with the volume.

As the mailhandler service will be completely stateless, running one or one hundred instances really doesn’t matter.
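To make the stateless point concrete, here is a minimal sketch rather than the real mailhandler. The `handle_mail` function and the message shape are hypothetical, and a thread pool stands in for separate service instances. Because the handler only looks at the message it is given, the results are the same however many workers share the load.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_mail(message: dict) -> dict:
    """Hypothetical stateless handler: the result depends only on the input."""
    return {
        "id": message["id"],
        "attachment_count": len(message.get("attachments", [])),
    }


def process(messages: list, instance_count: int) -> list:
    """Share the messages between `instance_count` workers.

    The thread pool is only a stand-in for running several service instances.
    """
    with ThreadPoolExecutor(max_workers=instance_count) as pool:
        # map() returns results in input order, whichever worker ran them.
        return list(pool.map(handle_mail, messages))


# Twenty made-up messages with zero, one or two attachments each.
messages = [{"id": i, "attachments": ["a.pdf"] * (i % 3)} for i in range(20)]

one_instance = process(messages, 1)
hundred_instances = process(messages, 100)
```

If the handler kept anything in memory between messages (a counter, a cache of earlier emails), the two runs could diverge, which is exactly what statelessness avoids.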

If this system were a traditional monolith, with all the functionality in one application, scaling would be difficult. Running multiple instances doesn’t work as well. The only real option is more power, which can quickly become expensive and will still hit limits eventually.

Disparate Components

Reading the list of functional requirements, there are actually quite clear lines between the different parts of the system.

  • Receive emails
  • Send for OCR
  • Receive from OCR
  • Send to a 3rd party API

The code receiving emails has no reason at all to care about index information returned from the OCR system, or how that data is passed on to a 3rd party API.

All it cares about is how to parse an inbound email, and then to tell the world about it.
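As a rough sketch of that boundary, the snippet below parses an email and raises a single event. It assumes the email arrives as raw RFC 5322 text, which may not match the payload Cloud Mail In really sends. The `handle_inbound` function and the `publish` callback are hypothetical names, and only Python’s standard `email` package is used.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def parse_inbound(raw: str) -> dict:
    """Pull the subject, plain-text body and attachments out of a raw email."""
    msg = message_from_string(raw, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    return {
        "subject": str(msg["Subject"]),
        "body": body_part.get_content().strip() if body_part else "",
        "attachments": [
            (part.get_filename(), part.get_content())
            for part in msg.iter_attachments()
        ],
    }


def handle_inbound(raw: str, publish) -> dict:
    """Parse the email, then tell the world about it. Nothing else."""
    record = parse_inbound(raw)
    # `publish` is a stand-in for whatever message transport is chosen.
    publish("info:newdocument", {
        "subject": record["subject"],
        "attachment_count": len(record["attachments"]),
    })
    return record


def build_sample_email() -> str:
    """Build a made-up booking email with one attachment for the demo."""
    msg = EmailMessage()
    msg["Subject"] = "Booking confirmation"
    msg.set_content("Booking reference: ABC123")
    msg.add_attachment(b"%PDF-1.4 sample", maintype="application",
                       subtype="pdf", filename="booking.pdf")
    return msg.as_string()


events = []
record = handle_inbound(build_sample_email(),
                        lambda name, payload: events.append((name, payload)))
```

Notice that nothing in here knows about OCR or the export API. Whoever cares about `info:newdocument` can pick it up from there.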

So we’ve looked at the why. Let’s get into the how.

Designing with Microservices

I always follow the same process when designing microservice-based architectures. It is built on a principle taken from the fantastic book The Tao of Microservices by Richard Rodger.

Messages

Microservices should be designed with messages as first-class citizens. Understanding the messages your system needs to process gives an incredible starting point for understanding the components required.

| Name | Description | Synchronous (S) or Asynchronous (A) |
|---|---|---|
| inboundemail:receive | Receive a new request for processing from CloudMailIn | S |
| datastore:save | Save the email index data to a datastore | S |
| datastore:list | List data in the data store | S |
| datastore:addindexdata | Adds OCR indexed data to the datastore | A |
| documentstore:save | Save inbound email attachments | S |
| documentstore:retrieve | Retrieve inbound email attachments | S |
| ocr:create | Create a new OCR task | S |
| ocr:receive | Receive a completed response from the OCR engine | S |
| export:send | Send OCR indexed data to be exported | S |
| info:newdocument | Raise an event notifying that a new document has been saved | A |
| info:documentsentforocr | Raise an event notifying that a document has been sent for OCR | A |
| info:documentfailedforocr | Raise an event notifying that a document has failed to be sent for OCR | A |
| info:indexdatareceived | Raise an event notifying that new index data has been received | A |

I complete a table like this for every new system I design. It covers the different messages I think will need to be passed around, and the synchronicity required.
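The same table can also live as plain data, which makes it easy to query later on. This is only a sketch under that assumption. The `Mode` enum, the `Message` dataclass and the `names` helper are illustrative names I made up, not part of any framework.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SYNC = "S"
    ASYNC = "A"


@dataclass(frozen=True)
class Message:
    name: str
    description: str
    mode: Mode


# One entry per row of the message table.
MESSAGES = [
    Message("inboundemail:receive", "Receive a new request for processing", Mode.SYNC),
    Message("datastore:save", "Save the email index data", Mode.SYNC),
    Message("datastore:list", "List data in the datastore", Mode.SYNC),
    Message("datastore:addindexdata", "Add OCR indexed data", Mode.ASYNC),
    Message("documentstore:save", "Save inbound email attachments", Mode.SYNC),
    Message("documentstore:retrieve", "Retrieve inbound email attachments", Mode.SYNC),
    Message("ocr:create", "Create a new OCR task", Mode.SYNC),
    Message("ocr:receive", "Receive a completed OCR response", Mode.SYNC),
    Message("export:send", "Send OCR indexed data to be exported", Mode.SYNC),
    Message("info:newdocument", "A new document has been saved", Mode.ASYNC),
    Message("info:documentsentforocr", "A document has been sent for OCR", Mode.ASYNC),
    Message("info:documentfailedforocr", "A document failed to be sent for OCR", Mode.ASYNC),
    Message("info:indexdatareceived", "New index data has been received", Mode.ASYNC),
]


def names(mode: Mode) -> list:
    """Return the message names with the given synchronicity, in table order."""
    return [message.name for message in MESSAGES if message.mode is mode]


fire_and_forget = names(Mode.ASYNC)
```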

Synchronicity is a tricky one. The general rule of thumb I use is this: if the sender cares about the response, it should be a synchronous call. If not, just fire and forget.

For example, when a new request is received from Cloud Mail In, the API needs to notify Cloud Mail In that the request has been successful. That needs to be synchronous.

However, when a document is sent to the OCR engine, there are no services that directly care about the outcome. So a message can be raised stating the fact that a new document has been sent. Any services that care can listen for the message and act if required.
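The difference between the two styles can be sketched with a tiny in-memory bus. This is an illustration only, not the transport the project will use. The `Bus` class and its `respond`, `listen`, `act` and `emit` methods are hypothetical names chosen for the example.

```python
class Bus:
    """Toy in-memory bus showing the two calling styles."""

    def __init__(self) -> None:
        self._responders = {}  # message name -> the single handler that replies
        self._listeners = {}   # message name -> list of interested handlers

    def respond(self, name, handler) -> None:
        self._responders[name] = handler

    def listen(self, name, handler) -> None:
        self._listeners.setdefault(name, []).append(handler)

    def act(self, name, payload):
        """Synchronous: the sender waits for, and receives, a reply."""
        return self._responders[name](payload)

    def emit(self, name, payload) -> None:
        """Fire and forget: the sender gets nothing back.

        Listeners run inline here for simplicity. A real system would hand
        the message to a queue or topic instead.
        """
        for listener in self._listeners.get(name, []):
            listener(payload)


bus = Bus()
seen = []

# The sender needs to know the request succeeded, so this one is synchronous.
bus.respond("inboundemail:receive", lambda payload: {"ok": True, "id": payload["id"]})
# Nobody waits on this one. Anyone interested simply listens.
bus.listen("info:documentsentforocr", lambda payload: seen.append(payload["id"]))

reply = bus.act("inboundemail:receive", {"id": 1})
nothing = bus.emit("info:documentsentforocr", {"id": 1})
```

Emitting a message that has no listeners is not an error in this sketch, which matches the fire-and-forget idea: the sender does not know or care who is listening.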

After messages come activities.

Activities

Messages on their own don’t give enough detail about the actual functionality and interactions required. For that, we need to look at how different sets of messages link together to create a flow through the system.

| Name | Message Flow |
|---|---|
| New Inbound Email | 1. inboundemail:receive; 2. datastore:save; 3. documentstore:save; 4. info:newdocument |
| Send for OCR | 1. info:newdocument; 2. documentstore:retrieve; 3. ocr:create; 4. info:documentsentforocr OR info:documentfailedforocr |
| Receive data from OCR | 1. ocr:receive; 2. datastore:addindexdata; 3. info:indexdatareceived |
| Export document | 1. info:indexdatareceived; 2. export:send |

The above table covers the processes through the system as a whole. It should relate very closely to the list of functional requirements.
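One cheap sanity check at this stage is to confirm that every step in an activity is a message that has actually been defined, and to spot messages that no activity uses yet. The sketch below assumes both tables are kept as plain data. The structure and the helper names are illustrative, not project code.

```python
# Every message name from the message table.
KNOWN_MESSAGES = {
    "inboundemail:receive", "datastore:save", "datastore:list",
    "datastore:addindexdata", "documentstore:save", "documentstore:retrieve",
    "ocr:create", "ocr:receive", "export:send", "info:newdocument",
    "info:documentsentforocr", "info:documentfailedforocr",
    "info:indexdatareceived",
}

# Each step is a tuple of alternatives. Most steps have exactly one,
# but the last step of "Send for OCR" is an either/or.
ACTIVITIES = {
    "New Inbound Email": [
        ("inboundemail:receive",), ("datastore:save",),
        ("documentstore:save",), ("info:newdocument",),
    ],
    "Send for OCR": [
        ("info:newdocument",), ("documentstore:retrieve",), ("ocr:create",),
        ("info:documentsentforocr", "info:documentfailedforocr"),
    ],
    "Receive data from OCR": [
        ("ocr:receive",), ("datastore:addindexdata",),
        ("info:indexdatareceived",),
    ],
    "Export document": [("info:indexdatareceived",), ("export:send",)],
}


def used_messages(activities: dict) -> set:
    """Collect every message name mentioned in any activity step."""
    return {name
            for steps in activities.values()
            for step in steps
            for name in step}


def undefined_messages(activities: dict, known: set) -> set:
    """Messages an activity relies on that the message table never defined."""
    return used_messages(activities) - known


def unused_messages(activities: dict, known: set) -> set:
    """Messages that were defined but that no activity uses yet."""
    return known - used_messages(activities)


undefined = undefined_messages(ACTIVITIES, KNOWN_MESSAGES)
unused = unused_messages(ACTIVITIES, KNOWN_MESSAGES)
```

Run against the two tables in this post, nothing is undefined, and `datastore:list` is the only message that no activity uses yet.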

Activities give a good indication of who needs to listen to who and where the interactions lie. They also give a reasonably logical grouping from which to design the services themselves.

Services

Now we are onto the fun part: the services themselves.

When I first started designing microservices, I would always jump straight in at the services. It isn’t the worst place to start. But I found there would be a lot of chopping, changing and refactoring until the correct bounded contexts were found.

Flipping the design on its head, and starting with the messages as we have here, makes the services a lot clearer.

| Name | Description | Sends | Receives |
|---|---|---|---|
| front | Handles external HTTP requests and sits behind a load balancer | inboundemail:receive, ocr:receive, datastore:list | |
| inbound | Translates the inbound request into formatted data and saves any physical file attachments | datastore:save, documentstore:save, info:newinbounddocument | inboundrequest.post |
| request-store | Handles persistence of request and index data | | datastore:save, datastore:list, datastore:addindexdata |
| document-store | Handles persistence and retrieval of physical files | | documentstore:receive, documentstore:retrieve |
| ocr | Handles moving of documents from storage to the OCR engine | datastore:list, documentstore:retrieve, ocr:create, info:documentsentforocr, info:documentfailedforocr | info:newinbounddocument |
| ocr-receipt | Receives index data from the OCR engine and translates to formatted data | datastore:addindexdata, info:indexdatareceived | ocr:receive |
| export | Handles exporting the indexed data to 3rd party systems | export:send | info:indexdatareceived |

As you can see, starting with messages gives very clear lines between the different services required.

Having the last two columns also helps confirm that every message and activity will be handled by someone.
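That check can be automated once the table is data. Below is a minimal sketch under some loud assumptions. `SERVICES` copies only three rows, and the split between sent and received messages is my own reading of them. `EXTERNAL` is a hypothetical allow-list for messages that leave the system (here, the call out to the client’s API), and `unhandled` is an illustrative helper name.

```python
# Three rows from the services table, kept as plain data.
SERVICES = {
    "request-store": {
        "sends": set(),
        "receives": {"datastore:save", "datastore:list", "datastore:addindexdata"},
    },
    "ocr-receipt": {
        "sends": {"datastore:addindexdata", "info:indexdatareceived"},
        "receives": {"ocr:receive"},
    },
    "export": {
        "sends": {"export:send"},
        "receives": {"info:indexdatareceived"},
    },
}

# Assumption: messages that are handled outside the system.
EXTERNAL = {"export:send"}


def unhandled(services: dict, external=frozenset()) -> set:
    """Return every message that some service sends but nobody receives."""
    sent = set().union(*(s["sends"] for s in services.values()))
    received = set().union(*(s["receives"] for s in services.values()))
    return sent - received - set(external)


gaps = unhandled(SERVICES, EXTERNAL)
gaps_without_allow_list = unhandled(SERVICES)
```

An empty result means every message sent by these services has a home. Anything left over is either a missing service or a naming mismatch between two rows, and both are worth catching before any code is written.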

In Summary

It’s the early stages of the development and deployment of this system.

I say that, knowing that the first build needs to be live by the 1st of October. Oh man, I do love a tight deadline!

As the system develops, and as I get into the nitty-gritty of actually writing some code, I’ll start to flesh out this post.

But until then, I hope this has given you a nice introduction to three fantastic steps for designing any microservice-based system.

Remember:

  • Messages
  • Activities
  • Services

This post is also available on DEV.