BeyondJ Self Healing with Akka
Akka (http://akka.io) is a library from Lightbend that makes
it much easier to write concurrent Java applications. Akka uses a construct known
as the Actor Model to pass messages within an application. An actor is a
lightweight entity with a mailbox and its own life cycle: it can be started and stopped.
An actor can send messages to other actors and can receive messages from other
actors. BeyondJ uses the Akka library to manage the life cycle of applications
deployed within it.
Each application bundle deployed in BeyondJ is immediately assigned an actor supervisor by the core launcher. On launch, the supervisor is told the location of the bundle artefact and its configuration. The supervisor inspects the configuration to determine what type of container to launch for the bundle and how many instances to launch. It then copies the bundle artefact to a known deployment location. Next, the supervisor creates one child actor per application instance and instructs each child actor to launch a single application instance. The child actor (henceforth called the bundle actor) is responsible for managing that application instance, and the supervisor is responsible for managing all the bundle actors that it launches.
Periodically, the bundle actor queries the application instance to determine its health. If the bundle actor determines that the application instance has crashed, it cleans up any known bundle resources and immediately starts another instance of the application. This behaviour is configurable and can be enabled or disabled.
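The check-and-restart cycle described above can be sketched in plain Java, without Akka. This is only an illustration under stated assumptions, not BeyondJ's implementation: the class name `SelfHealingMonitor`, its callbacks and the `enabled` flag are hypothetical stand-ins for the bundle actor, its clean-up and launch logic, and the configuration switch.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a bundle actor's health check; not BeyondJ code. */
final class SelfHealingMonitor {
    private final BooleanSupplier isAlive;  // probes the managed application instance
    private final Runnable cleanUp;         // releases resources left by a crashed instance
    private final Runnable startInstance;   // launches a replacement instance
    private final boolean enabled;          // self healing can be switched off
    private int restarts = 0;

    SelfHealingMonitor(BooleanSupplier isAlive, Runnable cleanUp,
                       Runnable startInstance, boolean enabled) {
        this.isAlive = isAlive;
        this.cleanUp = cleanUp;
        this.startInstance = startInstance;
        this.enabled = enabled;
    }

    /** Runs one health check; returns true if a replacement instance was started. */
    synchronized boolean tick() {
        if (!enabled || isAlive.getAsBoolean()) {
            return false;
        }
        cleanUp.run();
        startInstance.run();
        restarts++;
        return true;
    }

    /** Number of replacement instances started so far. */
    synchronized int restarts() {
        return restarts;
    }

    /** Repeats the health check every periodSeconds on the given scheduler. */
    ScheduledFuture<?> scheduleOn(ScheduledExecutorService scheduler, long periodSeconds) {
        return scheduler.scheduleWithFixedDelay(() -> {
            try {
                tick();
            } catch (RuntimeException e) {
                // Swallow the failure so that later checks still run;
                // a real bundle actor would log or escalate it.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }
}
```

In an Akka-based system the periodic trigger would more likely be a scheduled message delivered to the bundle actor; the executor here only keeps the sketch free of dependencies.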
In the case of the Java Main Process launcher and Java Script Process Launcher, the bundle actor starts a child Java process for the application instance. In the case of the Tomcat and Jetty container types, the bundle actor starts a new thread and launches the web app container in that thread. The application instance is then deployed to the web app container. Note also that each container is loaded by its own class loader, which provides a degree of separation between containers. So two constructs isolate an application instance: each instance is loaded into its own container, and each container is loaded by a separate class loader. Jars and file system resources are, however, shared, whatever the application type: each application artefact is copied to the deploy location, and its resources are then shared by all application instances of that type and name.
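Starting a child Java process for an application instance could look roughly like the following sketch, which uses the standard `ProcessBuilder` API. The class name `ChildJavaProcess`, the method names and the example arguments are assumptions made for illustration; BeyondJ's launchers may assemble the command differently.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for launching an application instance as a child JVM. */
final class ChildJavaProcess {

    /** Builds the command line: <java.home>/bin/java -cp <classpath> <mainClass> <args...>. */
    static List<String> command(String classpath, String mainClass, List<String> args) {
        // Reuse the JVM that runs the parent so the child does not depend on PATH.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(mainClass);
        cmd.addAll(args);
        return cmd;
    }

    /** Starts the child process in workDir, sharing the parent's stdout and stderr. */
    static Process launch(List<String> cmd, File workDir) throws IOException {
        return new ProcessBuilder(cmd)
                .directory(workDir)
                .inheritIO()
                .start();
    }
}
```

For example, `ChildJavaProcess.command("app.jar", "com.example.Main", List.of("--port", "8080"))` yields a six-element command whose first element is the path to the `java` binary. On Java 9 and later, the returned `Process` exposes `pid()`, which the caller could record for later liveness checks.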
Should an instance of the application crash, the supervisor will simply start another one to replace it. BeyondJ's goals are a high degree of scalability, availability and resilience.
In the case of web applications, the bundle actor maintains a health check URL, which it pings periodically to determine whether the managed instance is still alive. In the case of Java processes, the bundle actor keeps track of the process ID of the managed instance. The bundle actor periodically queries the operating system using that process ID and determines whether the process is alive.
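A health check ping for a web application can be sketched with the JDK's `HttpURLConnection`. The class name `HealthCheckPing`, the timeout parameter and the rule that any 2xx status means "alive" are assumptions for illustration; the source does not say how BeyondJ interprets the response.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical health check helper; the success criterion (HTTP 2xx) is an assumption. */
final class HealthCheckPing {

    /** Returns true if a GET on healthUrl answers with a 2xx status within the timeout. */
    static boolean ping(String healthUrl, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(healthUrl).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            int status = conn.getResponseCode();
            return status >= 200 && status < 300;
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            // Unreachable host, timeout, malformed URL or non-HTTP URL:
            // treat all of them as "not alive".
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

A caller might treat a single failed ping as a crash, or require several consecutive failures before restarting the instance; which policy fits depends on how quickly the managed application normally responds.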
One thing that proved particularly tricky was finding a cross-platform way to detect when a Java process had crashed. The Java API is sorely lacking in that regard, and we ended up using the ineffable Sigar library.
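For comparison, Java 9 and later ship a standard `ProcessHandle` API that can answer the same question without a native library. The sketch below is an alternative shown under that assumption, not the Sigar-based approach BeyondJ uses, and the class name `ProcessLiveness` is hypothetical.

```java
/** Hypothetical helper; requires Java 9 or later for ProcessHandle. */
final class ProcessLiveness {

    /** Returns true if the operating system reports a live process with this ID. */
    static boolean isAlive(long pid) {
        // ProcessHandle.of returns an empty Optional when no such process exists
        // or when it is not visible to the current JVM.
        return ProcessHandle.of(pid)
                .map(ProcessHandle::isAlive)
                .orElse(false);
    }
}
```

Operating systems may reuse process IDs, so a check like this can in principle report an unrelated process as alive if the original one exited long ago.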