Previously, Waking Up managed Kubernetes clusters using Rancher on AWS. Although Rancher facilitated multi-cluster Kubernetes management, the Waking Up team lacked the access and control it needed over the underlying infrastructure. Scalability also suffered: the nodes provisioned by Rancher were inadequate and failed quickly, leaving the infrastructure unreliable.
To address these problems, a migration to a platform that allowed greater infrastructure control was imperative. This migration needed to be executed swiftly to eliminate dependency on Rancher and modernize the architecture.
For the development team, adopting a DevOps culture for their workflow was important. This would let the team release small, incremental code changes while maintaining quality for users. Transforming the CI/CD pipeline and simplifying DevOps practices would facilitate a faster deployment cycle.
Waking Up’s primary objective was to ensure users could access daily lessons or meditations. To achieve this, a highly efficient, scalable, and secure solution was required.
In collaboration with the Waking Up team, we migrated microservices to Amazon Elastic Kubernetes Service (Amazon EKS) using infrastructure as code with Terraform. The strategy involved segregating the infrastructure into different accounts within the organization to isolate the development environment from the production environment. For provisioning Amazon Elastic Compute Cloud (Amazon EC2) nodes, we chose reserved instances based on AWS’s Graviton (ARM64) processors, significantly reducing instance costs while optimizing performance. The Waking Up team now has the control it needs to deploy and manage applications on Kubernetes. Scalability has also improved since the migration to Amazon EKS.
Waking Up enhanced reliability by implementing the Amazon CloudFront CDN for low-latency delivery of its website. The CloudFront distribution is configured with an origin access identity so that the website can be reached only through CloudFront, not directly from Amazon S3.
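As an illustration of how an origin access identity (OAI) restricts access, the S3 bucket behind the distribution typically carries a bucket policy that grants read access to the OAI principal only. The following is a minimal sketch in Python that builds such a policy document. The bucket name and OAI ID are hypothetical placeholders, not values from the Waking Up deployment, and the actual policy used in the project is not described in this case study.

```python
import json


def oai_bucket_policy(bucket_name: str, oai_id: str) -> dict:
    """Build an S3 bucket policy that lets one CloudFront OAI read objects.

    Both arguments are illustrative placeholders supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIReadOnly",
                "Effect": "Allow",
                # Principal format AWS uses for CloudFront origin access identities.
                "Principal": {
                    "AWS": (
                        "arn:aws:iam::cloudfront:user/"
                        f"CloudFront Origin Access Identity {oai_id}"
                    )
                },
                # Read-only access to objects; no list or write permissions.
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name and OAI ID, for demonstration only.
    policy = oai_bucket_policy("example-website-bucket", "E2EXAMPLEOAI")
    print(json.dumps(policy, indent=2))
```

A policy like this only has the intended effect when the bucket is not otherwise publicly readable, so public access to the bucket would also need to be blocked.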
Security controls were implemented using AWS WAF (Web Application Firewall) to protect against potentially disruptive or resource-consuming web requests, and AWS CloudTrail to track user access and detect unusual behavior. Datadog Cloud SIEM was implemented to provide visibility and generate alerts for specific CloudTrail events associated with AWS Identity and Access Management (IAM) entities.
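To make the idea of alerting on specific CloudTrail events concrete, the sketch below filters CloudTrail-style records for a small set of IAM actions. It is a simplified stand-in for what a SIEM detection rule does, not Datadog's rule syntax or the rules used in this project. The watched event names and the `flag_iam_events` helper are assumptions chosen for illustration; the record fields (`eventSource`, `eventName`, `userIdentity`, `sourceIPAddress`, `eventTime`) follow the CloudTrail record format.

```python
from typing import Iterable

# Hypothetical watch list; a real deployment would choose its own events.
WATCHED_IAM_EVENTS = {"CreateUser", "CreateAccessKey", "AttachUserPolicy", "DeleteUser"}


def flag_iam_events(records: Iterable[dict]) -> list[dict]:
    """Return a compact summary of each record that matches a watched IAM event."""
    flagged = []
    for record in records:
        is_iam = record.get("eventSource") == "iam.amazonaws.com"
        if is_iam and record.get("eventName") in WATCHED_IAM_EVENTS:
            flagged.append(
                {
                    "eventName": record["eventName"],
                    # Missing fields fall back to "unknown" rather than raising.
                    "actor": record.get("userIdentity", {}).get("arn", "unknown"),
                    "sourceIPAddress": record.get("sourceIPAddress", "unknown"),
                    "eventTime": record.get("eventTime", "unknown"),
                }
            )
    return flagged


if __name__ == "__main__":
    # Made-up sample records for demonstration only.
    sample = [
        {
            "eventSource": "iam.amazonaws.com",
            "eventName": "CreateAccessKey",
            "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example"},
            "sourceIPAddress": "203.0.113.10",
            "eventTime": "2024-01-01T00:00:00Z",
        },
        {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    ]
    for alert in flag_iam_events(sample):
        print(alert)
```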
Furthermore, a comprehensive infrastructure monitoring, logging, and alerting stack was implemented using Datadog. This empowered the development team to measure application performance and make informed decisions based on metrics.
- Enhanced infrastructure scalability: Successfully handled high-demand events with automatic infrastructure responses.
- Fully automated CI/CD process using GitHub Actions.
- Automated database migrations using Helm on each deployment.
- Business logic instrumentation.
- Early alerting for events and application performance metrics.
- Secure VPN access to private resources.