Microservices are widely regarded as one of the most important ways for companies to stay competitive, often touted as a panacea for modernizing software architectures. In fact, a recent study we at LeanIX conducted revealed that more than 70 percent of companies want to adopt microservices by spring 2018.
But as with any other major transformation, the decision to modernize software is not always an easy one, and the shift from monolithic architectures comes with a number of challenges.
As a SaaS platform provider for Enterprise Architecture Management (EAM), LeanIX regularly navigates the challenges of application lifecycle management with our customers. We come at it with close personal experience: after only four years in business, we carried out our own modernization from a monolithic architecture to microservices.
The limitations of monolithic architectures
Since our founding in 2012, our platform has gained considerable traction and our customer base has expanded exponentially. Our growth demanded numerous new requirements, such as single-sign-on authentication, the introduction of new workflows, and a more flexible data model. The technical foundation had to be able to cope with constant development and quick growth.
We found ourselves facing the same challenges we regularly hear about from our global customers. The original platform was built on a conventional LAMP (Linux, Apache, MySQL and PHP) stack with a single shared code base, which became a problem as our number of developers grew. Testing and deployment became more complex, slowing our ability to release new features and innovations, a core benefit to our customers. This was compounded by technical risk, as support and security updates were no longer available for all components of the software versions we ran.
It became clear that our business could no longer be supported with a monolith – it was time to make a move. We evaluated a few different architecture options and ultimately we decided microservices was the right choice for us.
Requirements for success
Whether for a startup or a large enterprise, the reasons for adopting microservices are largely consistent. An IT application frequently starts with a small range of functions and grows as requirements increase, quickly evolving into a monolith that hampers market penetration and inhibits scale. Luckily, there are preventative measures at the organizational, development and process levels that can effectively combat this outcome.
- Agile development. We initially trained the entire team in Scrum and built up more than 12 months’ experience working with the existing code. It was not enough to learn the agile processes alone; we also had to gain specialist knowledge of the existing application. We now work according to the GitFlow principle, developing new features in dedicated branches. Changes are then merged into the main development branch by means of pull requests under the dual-control principle, and new releases of the software are cut from that branch. The approach promotes pair programming and improves knowledge sharing across the development team.
- API-First. Every new service we develop has a REST API, documented by means of a Swagger definition, a description language for APIs from which both documentation and an SDK can be generated automatically. The advantage for developers is that the generated documentation is interactive: the API can be called directly in the browser against suitable data. This gives us more flexibility, saves development time and facilitates rapid prototyping. Thanks to Swagger, SDKs can be generated in many programming languages, making it easier for other services, and for our customers, to consume the APIs.
- Containerization. By executing services as containers in a standardized Docker runtime environment, we could roll out new software releases without downtime, in what is known as blue-green deployment. During a release, containers running the new version are started and run in parallel with the old version. As soon as the new version is up, users are directed to it, enabling a smooth (often even unnoticed) transition.
- Automation. We have set up a heavily automated tool chain that starts with development, proceeds through tests and finishes with deployment. Central to it are continuous integration (CI) and continuous deployment (CD) servers, and Ansible, whose declarative playbooks automate both infrastructure tasks and the provisioning of developer machines and servers. As soon as changes land in the central source code repository (GitHub), the CI/CD servers build new versions of the software, which in turn triggers end-to-end testing. During these tests, use cases are executed in the browser just as a user would perform them, allowing us to test the application from the user’s perspective.
- Central event hub. Establishing a publish-and-subscribe architecture can help companies avoid numerous point-to-point connections between services. Services produce events, which are distributed through a central event hub. Not only can internal services subscribe to the event hub; customers can also consume these events externally by means of webhooks.
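To make the API-First point concrete, a Swagger 2.0 definition of the kind each service ships might look like the following. The service name, path and `FactSheet` schema are purely illustrative, not LeanIX's actual API:

```yaml
swagger: "2.0"
info:
  title: Inventory Service        # hypothetical service name
  version: "1.0"
basePath: /services/inventory/v1
paths:
  /factsheets:
    get:
      summary: List fact sheets
      produces:
        - application/json
      responses:
        "200":
          description: A page of fact sheets
          schema:
            type: array
            items:
              $ref: "#/definitions/FactSheet"
definitions:
  FactSheet:
    type: object
    properties:
      id:
        type: string
      name:
        type: string
```

From a definition like this, Swagger tooling generates both the interactive documentation and the client SDKs mentioned above.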
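The blue-green switch described under containerization happens at the routing layer. The article does not name that component, so the sketch below assumes an NGINX reverse proxy in front of the Docker containers; the ports and upstream names are invented:

```nginx
# Both container groups run in parallel during a release.
upstream blue  { server 127.0.0.1:8080; }   # current version
upstream green { server 127.0.0.1:8081; }   # new version, starting up

server {
    listen 80;
    location / {
        # Once the new containers report healthy, this is switched
        # to http://green and reloaded; users move over without downtime.
        proxy_pass http://blue;
    }
}
```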
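As a sketch of the Ansible side of the tool chain, a playbook along these lines could provision an application server. The host group, container name and registry are assumptions for the example, not our actual configuration:

```yaml
- hosts: appservers
  become: true
  tasks:
    - name: Install Docker
      apt:
        name: docker.io
        state: present

    - name: Run the service container
      docker_container:
        name: inventory-service                    # hypothetical service
        image: registry.example.com/inventory:1.4  # hypothetical registry
        state: started
        restart_policy: always
```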
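At its core, the central event hub is publish-and-subscribe. The minimal in-memory Python sketch below shows only the mechanics; in production the hub would be a message broker, and delivery to customers would happen through an HTTP webhook subscriber. Event names and payloads are illustrative:

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventHub:
    """Minimal in-memory publish/subscribe hub (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        # Internal services and outbound webhook dispatchers register here.
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Producers emit events without knowing who consumes them,
        # avoiding point-to-point connections between services.
        for handler in self._subscribers[event_type]:
            handler(payload)


hub = EventHub()
received = []
hub.subscribe("factsheet.updated", received.append)
hub.publish("factsheet.updated", {"id": "fs-1", "field": "name"})
```

Because producers only know the hub, adding a new consumer, whether an internal service or a customer webhook, is a subscription rather than a new point-to-point connection.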
Navigating the process
With the prerequisites in place, we began the transformation, a process that would take 36 months. During the first 18 months, we implemented new functions, such as survey workflows, as new services and operated them alongside the existing core. More and more functions, such as user management, graphics export and image provisioning, were carved out of the monolithic core application, allowing us to gradually reduce the core to its main functions. Over the next twelve months, the central functions were re-implemented in the new microservice architecture. In addition to one-to-one migration of the existing functions, central requirements such as a flexible data model were also implemented. It then took six months to migrate 80 existing customers, while new customers were concurrently added to the new “Pathfinder” platform. The process required navigating several key challenges.
Performing while transforming. Throughout the migration, our customer base continued to grow, requiring us to balance migrating existing customers with onboarding new ones. The increased requirements expanded the originally planned scope of functions, and we had to ensure that aspects of the old version remained available to individual customers.
The great migration. For an application that produces data and had already been in use by customers for more than five years, the many customer-specific configurations made replicating it with test data complicated and time-intensive. We had committed to automated data migration without recognizing this risk until the process was well underway.
The paradigm shift. A defining feature of the original architecture was that every browser request was handled in a separate, short-lived process, typical of many web applications based on scripting languages such as PHP. With the change to Java, the execution model is fundamentally different: requests now hit a single long-running process, which can create memory bottlenecks and the risk that individual requests paralyze the entire application.
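One common safeguard against this risk, shown here as an illustration in Python rather than our actual Java code, is to bound the work the long-running process will accept, rejecting excess requests rather than letting them pile up and exhaust memory:

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class BoundedExecutor:
    """Reject new work when full instead of queuing without limit."""

    def __init__(self, max_workers: int, max_queued: int) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        # One slot per running or queued task; put_nowait fails when full.
        self._slots = queue.Queue(maxsize=max_workers + max_queued)

    def submit(self, fn, *args):
        try:
            self._slots.put_nowait(None)       # claim a slot or fail fast
        except queue.Full:
            # The caller gets an error, but the process stays responsive.
            raise RuntimeError("server busy")
        future = self._pool.submit(fn, *args)
        future.add_done_callback(lambda _: self._slots.get_nowait())
        return future


ex = BoundedExecutor(max_workers=2, max_queued=2)
result = ex.submit(lambda: 21 * 2).result()
```

The same idea appears in Java as bounded thread pools with rejection policies; the point is that overload surfaces as an explicit error instead of taking down the whole process.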
We chose to implement services gradually. Over a period of 30 months, containerization allowed us to operate old and new services in parallel with Docker. Throughout the development of the new main application, migration of the data was always automated in parallel. As a result, it was possible to test the customers’ entire working area in the old version in parallel with the new version, without interfering with live operation of the existing solution for the users.
To ensure the smoothest transition possible, we began with new, non-critical customer requirements. This gave the team additional experience when it came time to port high-priority requirements and ensured a seamless implementation.
Close coordination between our engineering and customer success teams was critical to the migration. Working together enabled our team to push development of the application forward while maintaining streamlined customer communication and planning, ensuring that customers were migrated only once all the features they needed were available on the new application. Switching from Scrum sprints to Kanban during the six-month migration allowed us to identify and react to unforeseen challenges more easily. Although we switched to a modern single-page app, we kept much of the user experience consistent with the previous platform, delivering faster load times without requiring users to be completely re-trained.
The path forward
Today, over 100 customers worldwide, including adidas, Zalando and Merck, use the updated LeanIX platform; migration was completed in late 2017. Thanks to our now-flexible architecture, further features have already been delivered to customers and overall performance has improved. Optimizing the release and deployment process was a major organizational challenge, but we learned that, approached the right way, an architecture transformation can set a company up for short-term growth and long-term success.
Author: André Christ, Co-CEO of LeanIX