Extreme Scaling of Your Applications and Micro-Services

 

http://beyondj.com

 

When we started the BeyondJ project, it was out of frustration: we could not find a suitable JVM platform on which we could easily deploy, manage and monitor our micro-services and our monolithic applications. We wanted to scale to thousands of instances of our applications easily and maintain visibility of all of them in a cost-effective way.

At first we tried to deploy to various Enterprise Service Bus products, but we had serious problems with the rigidity and complexity the ESB approach required. We could not easily scale ESBs to handle tens of thousands of our application instances without some serious work and cost.

We needed to deploy existing traditional war files and micro-service jar files with as little change of code as possible. We needed to scale instances of these applications in the order of tens of thousands. We needed to collect all kinds of statistics and metrics from each and every one of those application instances. We needed a centralized, panoramic dashboard system on which to inspect, deploy, start and stop any or all of our application instances.

At first we thought: why not just launch each micro-service as a standalone process using java -jar xyz.jar? That approach works if you are writing a prototype. Once you need to deploy multiple instances to production, you immediately encounter serious pain in terms of scaling, metrics collection, availability, reliability and location transparency. So we looked at using Docker, but the issues did not go away. We discovered we could indeed meet our non-functional requirements (scaling, metrics collection, availability, reliability and location transparency) if we combined Docker, Mesos, Consul, Zookeeper, Marathon, NGINX and the JVM as below:

 

Suddenly the complexity just went right through the roof.

We were now looking at learning and mastering various technologies. We would need to hire engineers knowledgeable enough in most of these technologies, thus introducing a great deal of risk should any of them choose to leave for the next job. Plus, we could not find a way to increase the application density within each Docker instance.

So we turned to an OSGi container. We discovered that we would need to rewrite our applications as OSGi bundles and features, along with other OSGi-specific artefacts. We also discovered that OSGi platforms are notorious for class path idiosyncrasies, which distracted our developers from delivering business value and lowered their productivity, as they regularly had to contend with platform issues. In other words, the OSGi route required a steep learning curve and would also lock us into a particular technology.

So we looked at using Kubernetes and Docker, as follows:

 

A bit better, but still too complicated. This stack required learning and mastering various technologies. We would need to hire engineers knowledgeable enough in most of these technologies, thus introducing a great deal of risk should any of them choose to leave for the next job. We still could not find a way to increase the application density within each Docker instance even when using Kubernetes.

So we thought: how hard can it be to craft our own micro-service container? And so BeyondJ was born.

And here is what we came up with:

 

 

With this architecture, we could launch our application using any mechanism we prefer, including a Unix-style service launch, the command line java -jar beyondj.jar, and bin/beyondj-service.sh.

We could also install BeyondJ on multiple machines and cluster the container by simply configuring Hazelcast discovery http://docs.hazelcast.org/docs/2.4/manual/html/ch12s02.html.

We built clustering right into the BeyondJ platform. We opted to have no master selection, as each instance of BeyondJ is created equal to any other. No split brain problems. No centralized management. No bottlenecks. Each instance publishes each change to the Hazelcast cluster, and all other instances download and update their configuration and cache of services. That way each instance of BeyondJ on each machine has the same configuration as any other, as well as being able to launch any new applications discovered from the cluster configuration.

So when we need to launch a new application into the cluster, we simply upload a jar or zip artefact to a single instance of BeyondJ. That instance publishes the artefact, including its configuration, to the cluster. All other cluster members download the artefact and its configuration and launch instances of the application in their own JVMs. Once each instance finishes the launch, it publishes the URL of the local application to the cluster. In turn, all other instances update their cache of services to include the newly launched application instance (across the cluster).

The end result is that each instance has the same number and version of known applications, and a cache of addresses via which those applications can be reached. Each instance of BeyondJ then load balances across the cluster for each application.
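The publish-and-replicate flow described above can be sketched in plain Java. This is only an illustration under stated assumptions, not BeyondJ's actual code: ClusterBus is a hypothetical in-memory stand-in for the Hazelcast cluster, PeerNode stands in for one BeyondJ instance, and the host names and URL format are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory stand-in for the shared cluster (Hazelcast in BeyondJ).
final class ClusterBus {
    private final List<PeerNode> members = new CopyOnWriteArrayList<>();

    void join(PeerNode node) { members.add(node); }

    // Every member, including the uploader, receives the artefact.
    void publishArtefact(String appName, String version) {
        for (PeerNode m : members) m.onArtefact(appName, version);
    }

    // Every member records the address of a newly launched instance.
    void publishAddress(String appName, String url) {
        for (PeerNode m : members) m.onAddress(appName, url);
    }
}

// Hypothetical model of one peer; there is no master, all peers behave the same.
final class PeerNode {
    private final String host;
    private final ClusterBus bus;
    private final Map<String, String> knownVersions = new ConcurrentHashMap<>();
    private final Map<String, List<String>> serviceCache = new ConcurrentHashMap<>();

    PeerNode(String host, ClusterBus bus) {
        this.host = host;
        this.bus = bus;
        bus.join(this);
    }

    // Uploading to any single peer is enough; the bus fans the artefact out.
    void upload(String appName, String version) {
        bus.publishArtefact(appName, version);
    }

    void onArtefact(String appName, String version) {
        knownVersions.put(appName, version);
        // A real peer would launch the application here; we only announce its URL.
        bus.publishAddress(appName, "http://" + host + "/" + appName);
    }

    void onAddress(String appName, String url) {
        serviceCache.computeIfAbsent(appName, k -> new CopyOnWriteArrayList<>()).add(url);
    }

    List<String> addressesOf(String appName) {
        return new ArrayList<>(serviceCache.getOrDefault(appName, List.of()));
    }

    String versionOf(String appName) { return knownVersions.get(appName); }
}
```

With three PeerNode objects joined to one ClusterBus, a single upload on any of them leaves all three with the same version and the same three addresses in their service caches.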

Because every instance holds the same cache of services, a request can arrive at any instance of BeyondJ in the cluster and be served from any instance within the cluster, according to the load balancing strategy specified.
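To make "load balancing strategy" concrete, here is a minimal round-robin sketch over the cached addresses of one application. The article does not say which strategies BeyondJ ships with, so round-robin is an assumption, and the LoadBalancingStrategy interface and its method name are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical strategy interface: pick one address from the service cache.
interface LoadBalancingStrategy {
    String choose(List<String> addresses);
}

// Round-robin: hand out addresses in turn, wrapping around at the end.
final class RoundRobinStrategy implements LoadBalancingStrategy {
    private final AtomicInteger next = new AtomicInteger();

    @Override
    public String choose(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalStateException("no known instances for this application");
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }
}
```

Given three addresses, three consecutive calls to choose return each address once, and the fourth call starts again from the first.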

So what is the end result? A single jar, which can be installed on multiple machines, providing near unlimited scaling, service discovery, load balancing, clustering, high availability, location transparency, liveness, self-healing, metrics and dashboards. Phew!

Suppose we have written our application and now we want to deploy it to three machines to be accessed by 30 000 clients. To achieve high availability and liveness we will deploy 10 000 instances per machine. We are going to launch a single JVM on each machine and deploy BeyondJ to each. Next, we will simply launch our browser and point it to an instance of BeyondJ installed on one of our three servers. We will then upload a jar or zip file of our application via the browser interface. We also have the option to upload an XML configuration file which specifies how many instances we need launched on each machine within our cluster (in this case 10 000 on each machine). Alternatively, we can specify the number of instances, among other options, within the browser interface. We will click deploy and voilà, that's it. We will wait a few moments, and BeyondJ will launch for us 10 000 instances of our shiny micro-service on each of our servers. Immediately we will start getting reports and feedback on all 30 000 instances.

The number of our application instances is only limited by the amount of resources (memory, threads) available on each machine.
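As a rough back-of-the-envelope aid, the resource limit can be written down as a small calculation. The per-instance memory and thread figures are assumptions that you would have to measure for your own application; CapacityEstimate and its method names are hypothetical and not part of BeyondJ.

```java
// Hypothetical helper for estimating how many instances one machine can host.
final class CapacityEstimate {

    // The upper bound is set by whichever resource (memory or threads) runs out first.
    static long maxInstances(long memoryBytes, long bytesPerInstance,
                             long maxThreads, long threadsPerInstance) {
        if (bytesPerInstance <= 0 || threadsPerInstance <= 0) {
            throw new IllegalArgumentException("per-instance costs must be positive");
        }
        long byMemory = memoryBytes / bytesPerInstance;
        long byThreads = maxThreads / threadsPerInstance;
        return Math.min(byMemory, byThreads);
    }

    // Total across the cluster, e.g. 3 machines with 10 000 instances each.
    static long totalInstances(int machines, long instancesPerMachine) {
        return Math.multiplyExact((long) machines, instancesPerMachine);
    }
}
```

For the scenario above, totalInstances(3, 10_000) gives the 30 000 instances mentioned, and the plan is only feasible if maxInstances for each machine is at least 10 000.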

We now have crafted a simple way to deploy and manage any number of application instances. We have dashboards feeding us metrics and reports and alerts. We have increased the application density within each JVM quite significantly.

In the BeyondJ Self Healing article I talk about how we use Akka for self-healing. What this means is that in BeyondJ, an application instance is always being watched by a supervisor to ensure that it is live and available. When it crashes, the supervisor immediately terminates it and launches a replacement instance, using the last known configuration.
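The supervision idea can be illustrated without Akka. The sketch below is plain Java and deliberately does not use the Akka API, so it shows the pattern rather than BeyondJ's implementation; AppInstance, Supervisor and FlakyInstance are hypothetical names, and the configuration is reduced to a single value.

```java
import java.util.function.Function;

// Hypothetical handle to a running application instance.
interface AppInstance {
    boolean isAlive();
    void terminate();
}

// Watches one instance and replaces it from the last known configuration.
final class Supervisor<C> {
    private final Function<C, AppInstance> launcher;
    private final C lastKnownConfig;
    private AppInstance current;
    private int restarts;

    Supervisor(C config, Function<C, AppInstance> launcher) {
        this.lastKnownConfig = config;
        this.launcher = launcher;
        this.current = launcher.apply(config);
    }

    // One health check; a real supervisor would run this periodically or react to events.
    synchronized void check() {
        if (!current.isAlive()) {
            current.terminate();                        // clean up the crashed instance
            current = launcher.apply(lastKnownConfig);  // launch a replacement
            restarts++;
        }
    }

    synchronized AppInstance current() { return current; }
    synchronized int restarts() { return restarts; }
}

// Test double standing in for a real application; flip 'alive' to simulate a crash.
final class FlakyInstance implements AppInstance {
    final String config;
    volatile boolean alive = true;
    volatile boolean terminated = false;

    FlakyInstance(String config) { this.config = config; }

    @Override public boolean isAlive() { return alive; }
    @Override public void terminate() { terminated = true; alive = false; }
}
```

A supervisor created with new Supervisor<>("port=8080", cfg -> new FlakyInstance(cfg)) leaves a healthy instance alone on check(), and after the instance's alive flag is set to false the next check() terminates it and starts a fresh one with the same configuration.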

 


 

Notes:

There are other interesting use cases we would like to explore in the future for BeyondJ. These include performance analysis of deployed applications and security vulnerability analysis, such as SQL injection detection.