Overview
If you’re looking to modernize your applications, containers are a powerful tool that you can’t afford to ignore. Containers are portable and lightweight, and they allow for easy deployment, scaling, and management of applications. They also provide greater operational robustness and engineering agility. However, as you adopt containerized architectures, you may face challenges in managing the many distinct services that come with them. That’s where container orchestration comes in.
Container orchestration is the process of automating the deployment, scaling, and management of containerized workloads and services. This technology streamlines the operational effort required to run these workloads, making it easier for developers and operations teams to focus on their core work.
Prescriptive Guidance
To address the challenges of implementing container orchestration, this solution provides a starting point closer to the finish line. We start by leveraging the power of Amazon Web Services (AWS) and its managed Kubernetes service, Amazon Elastic Kubernetes Service (EKS). By using EKS, organizations are relieved of the responsibility of managing the Kubernetes control plane itself. We then focus on the additional layers of tools and configuration required to deliver enterprise-grade applications.
Before diving into the container orchestration solution, it is important to acknowledge the ever-growing importance of containerization in modern IT operations. With the increasing demand for speed, agility, and efficiency in software delivery, containers have become an essential tool for organizations of all sizes. However, managing containers at scale can be a complex and time-consuming process. In this solution, we will:
- Deploy an EKS Cluster into a VPC, an RDS Database, ECR Repositories, and CodePipeline pipelines to build the container images
- Enable Kubernetes cluster add-ons
- Define cluster access to Kubernetes based on IAM roles
- Deploy service accounts to give containers IAM access (see the sketch after this list)
- Deploy applications with ArgoCD, demonstrating the GitOps approach
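As an example of the service-account item above, the sketch below grants pods IAM permissions through IAM Roles for Service Accounts (IRSA) using the AWS CDK; the namespace, service account name, and S3 read-only policy are illustrative placeholders, not this solution’s actual configuration.

```typescript
import * as eks from 'aws-cdk-lib/aws-eks';
import * as iam from 'aws-cdk-lib/aws-iam';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

// Create a Kubernetes service account backed by an IAM role (IRSA).
const serviceAccount = cluster.addServiceAccount('ApiServiceAccount', {
  name: 'api-sa',   // hypothetical service account name
  namespace: 'api', // hypothetical namespace
});

// Pods running under 'api-sa' inherit these IAM permissions directly,
// with no long-lived credentials stored on the worker nodes.
serviceAccount.role.addManagedPolicy(
  iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonS3ReadOnlyAccess'),
);
```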
Kubernetes is a complicated and customizable container orchestration toolset, and each organization should develop a bespoke strategy that works well within their environment and process. The first step in this process is to determine the use cases that will be deployed on this cluster, as this will determine many of the other requirements for the cluster architecture and design. Some questions that should be asked include:
- What types of services will run? (E.g., HTTP services, TCP services, background or batch jobs?)
- Do these services have special security requirements?
- Do these workloads require persistent data stored in the cluster? (E.g., databases, file storage)
Tools and Components
Once you have this understanding of your use cases, we use the following tools and components to implement our containerization approach:
Karpenter
Automatically launches just the right compute resources to handle your cluster’s applications. It is designed to let you take full advantage of the cloud with fast and simple compute provisioning for Kubernetes clusters. Karpenter responds quickly and automatically to changes in application load, scheduling, and resource requirements, placing new workloads onto a variety of available compute resource capacity. Karpenter lowers cluster compute costs by looking for opportunities to remove under-utilized nodes, replace expensive nodes with cheaper alternatives, and consolidate workloads onto more efficient compute resources.
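To make this concrete, here is a minimal sketch of a Karpenter NodePool registered via the CDK’s addManifest (shown using the v1beta1 API; field names vary between Karpenter versions); the pool name, capacity types, and CPU limit are placeholder choices.

```typescript
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

cluster.addManifest('KarpenterNodePool', {
  apiVersion: 'karpenter.sh/v1beta1',
  kind: 'NodePool',
  metadata: { name: 'default' }, // placeholder pool name
  spec: {
    template: {
      spec: {
        nodeClassRef: { name: 'default' }, // references a separately defined EC2NodeClass
        requirements: [
          // Let Karpenter choose between Spot and On-Demand capacity.
          { key: 'karpenter.sh/capacity-type', operator: 'In', values: ['spot', 'on-demand'] },
          { key: 'kubernetes.io/arch', operator: 'In', values: ['amd64'] },
        ],
      },
    },
    limits: { cpu: '100' }, // cap the total CPU this pool may provision
    disruption: {
      // Allow Karpenter to consolidate workloads off under-utilized nodes.
      consolidationPolicy: 'WhenUnderutilized',
    },
  },
});
```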
Horizontal Pod Autoscaling
Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload. If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or other similar resource) to scale back down. Horizontal pod autoscaling does not apply to objects that can’t be scaled (for example: a DaemonSet).

The HorizontalPodAutoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The horizontal pod autoscaling controller, running within the Kubernetes control plane, periodically adjusts the desired scale of its target (for example, a Deployment) to match observed metrics such as average CPU utilization, average memory utilization, or any other custom metric you specify.
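For illustration, the sketch below defines a CPU-targeted HorizontalPodAutoscaler as a CDK manifest; the target Deployment name, replica bounds, and 70% utilization target are hypothetical values to tune per workload.

```typescript
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

cluster.addManifest('ApiHpa', {
  apiVersion: 'autoscaling/v2',
  kind: 'HorizontalPodAutoscaler',
  metadata: { name: 'api', namespace: 'api' }, // placeholder names
  spec: {
    // The workload resource whose replica count the controller adjusts.
    scaleTargetRef: { apiVersion: 'apps/v1', kind: 'Deployment', name: 'api' },
    minReplicas: 2,
    maxReplicas: 10,
    metrics: [{
      type: 'Resource',
      resource: {
        name: 'cpu',
        // Scale out when average CPU utilization across pods exceeds 70%.
        target: { type: 'Utilization', averageUtilization: 70 },
      },
    }],
  },
});
```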
GitOps
An operational framework that takes DevOps best practices used for application development such as version control, collaboration, compliance, and CI/CD tooling, and applies them to infrastructure automation.
ArgoCD
Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes. Application definitions, configurations, and environments are declarative and version controlled. Application deployment and lifecycle management are automated, auditable, and easy to understand.
Cluster
Cluster is a broad term in Kubernetes that describes a set of worker nodes and the associated control plane resources. Large organizations will typically use multiple Kubernetes clusters, with multiple node groups attached to each cluster in different capacities, as part of a larger Kubernetes platform.
We recommend deploying a cluster to match existing environmental and regional logical boundaries where possible, as these boundaries usually already include a well-established blast radius. When this is not possible, we recommend a minimum of two clusters for production and non-production workloads and a minimum of two node groups per cluster for system and general-purpose workloads; both recommendations help ensure overall cluster health and scalability long term.
Control Plane
In a Kubernetes environment, the control plane is the set of services in a cluster that expose APIs allowing creation and management of cluster resources. As this is a critical set of services, Vertical Relevance recommends the use of a vendor-supported Kubernetes distribution such as Amazon Elastic Kubernetes Service (EKS), as this drastically lowers the operational and cost overhead required to run a Kubernetes cluster and ensures an SLA for these critical services.
Worker Nodes
Worker nodes are machines that are part of the Kubernetes cluster and whose resources are available for scheduling pods, deployments, daemon sets, etc. We recommend that worker nodes be managed through Auto Scaling groups and other automated tooling provided by Amazon EKS managed node groups, as sketched below.
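As a sketch of that recommendation, the CDK snippet below adds a managed node group with explicit scaling bounds; the sizes and instance types are placeholders to adapt to your workloads.

```typescript
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

cluster.addNodegroupCapacity('general-purpose', {
  minSize: 2, // placeholder scaling bounds
  desiredSize: 3,
  maxSize: 10,
  instanceTypes: [new ec2.InstanceType('m5.large'), new ec2.InstanceType('m5a.large')],
  capacityType: eks.CapacityType.ON_DEMAND,
});
```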
Networking
Networking in Kubernetes can be a complicated subject due to the number of networking options available. AWS EKS supports a limited set of CNI drivers, but we recommend using the AWS VPC CNI if it meets all requirements, due to its simplicity and ease of use in an existing AWS-based infrastructure workflow.
Scalability and Resiliency
AWS EKS manages scalability of the cluster control plane as part of its managed service. Worker nodes should be deployed in an Auto Scaling group with auto-scaling strategies in place: minimum and maximum instance counts, the Availability Zones in which instances can be deployed, and applications that are tolerant of regular, automated instance replacement. Where technically possible, application or user state should not be maintained on any auto-scaled instances. We also recommend the use of Karpenter to automatically manage the capacity of all node groups.
Managed and auto-scaling node groups also allow for much greater resiliency as defunct instances can automatically be replaced when necessary. This both eliminates time spent troubleshooting a broken node and allows the cluster to take actions to heal itself. Ephemeral nodes are also a great aid to patching and other operating system maintenance as existing instances can be replaced with new instances that include the latest software and OS versions.
Orchestrator
The Core Orchestration Solution is built from the foundational set of capabilities we recommend for running your containerized solution in AWS EKS. These capabilities manage the compute for your cluster and the network in which your cluster and applications operate, and enable your client applications and services to access your solution.
The orchestrator is a platform designed to automate and manage the deployment, scaling, and operation of containerized applications. Containers are lightweight, isolated environments that package an application and its dependencies, allowing for consistent and efficient deployment across different computing environments. Orchestrator simplifies the complex task of managing large-scale container deployments by providing features such as:
- Automated Deployment: Orchestrator automates the process of deploying containers, ensuring that applications are launched consistently and reliably across various servers or nodes. Orchestrator facilitates the process of updating applications by gradually replacing old containers with new ones, minimizing downtime and ensuring a smooth transition.
- Scaling: It enables dynamic scaling of applications based on factors like traffic load. Containers can be quickly replicated or removed as needed to handle changes in demand.
- Load Balancing: Orchestrator distributes incoming network traffic across multiple instances of a containerized application, optimizing resource utilization and ensuring high availability.
- Resource Management: These tools allocate computing resources such as CPU, memory, and storage to containers as required, preventing resource contention and performance issues.
- Self-Healing: Orchestrator monitors the health of containers and their underlying infrastructure. If a container or node fails, the orchestrator automatically replaces or relocates the affected components to maintain application availability.
- Service Discovery: It provides mechanisms to dynamically discover and communicate with other services, making it easier for containers to interact with one another in a distributed environment.
- Configuration Management: Orchestrator manages application configuration and environment variables, ensuring consistency and ease of updates across different container instances.
- Security and Isolation: Orchestrator enforces isolation between containers and nodes, limiting the impact of security breaches or failures.
- Compatibility and Portability: Orchestrator enables developers to build applications in a standardized manner using containers, making it easier to run these applications on various cloud platforms or on-premises infrastructure.
Orchestrator plays a crucial role in modern software development and DevOps practices, as it enables organizations to efficiently manage and scale containerized applications in dynamic and complex computing environments.
These core capabilities implement the best approaches to ensure you orchestrate and operate an efficient, secure and scalable containerized solution.
Main Platform Components
- EKS Blueprints – A baseline solution from AWS to compose a complete EKS cluster that is fully bootstrapped with the operational software that is needed to deploy and operate workloads. With EKS Blueprints, we describe the configuration for the desired state of our EKS environment, such as the control plane, worker nodes, and Kubernetes AddOns, as Infrastructure as Code (IaC) and deploy it into an operating environment.
- AWS VPC – The VPC supports the network infrastructure underlying the solution.
- AWS EKS Cluster – Supports the managed control plane of the Kubernetes cluster, enabling deployment of workloads onto nodes managed by Kubernetes
- Default Blueprint NodeGroup – Establishes initial compute resources for operation of management workloads such as the Metrics Server and load balancer controllers
- AWS EKS AddOns – AWS EKS Cluster AddOns are services that run inside of Kubernetes to extend its functionality and its control over infrastructure described in Kubernetes manifests. On top of the EKS Blueprints foundation, we have curated a select list of AWS EKS AddOns to support the needs of the sample application hosted by this solution. Additional AddOns should be considered based on the needs of the cluster you intend to run. The following are the minimum AddOns we recommend for the core of container orchestration (composed in the sketch after this list):
- AwsLoadBalancerControllerAddOn – helps manage Elastic Load Balancers for a Kubernetes cluster
- Deploys an Application Load Balancer when you specify a Kubernetes Ingress resource
- Deploys a Network Load Balancer when you specify a Kubernetes service type of LoadBalancer
- ExternalDnsAddOn – Allows integration of exposed Kubernetes Services and Ingresses with DNS providers, in particular Amazon Route 53. The AddOn provides functionality to configure IAM policies and Kubernetes service accounts for Route 53 integration, based on the AWS tutorial for ExternalDNS
- VpcCniAddOn – supports native VPC networking
- KubeProxyAddOn – maintains network rules on each Amazon EC2 node. It enables network communication to your pods
- CoreDnsAddOn – a flexible, extensible DNS server that serves as the Kubernetes cluster DNS
- KarpenterAddOn – provides a more efficient and cost-effective way to manage workloads by launching just the right compute resources to handle a cluster’s application
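A condensed sketch of composing these AddOns with the EKS Blueprints builder follows; the account/region lookup, hosted zone name, and stack id are illustrative, and the AddOn constructor options depend on your environment.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as blueprints from '@aws-quickstart/eks-blueprints';

const app = new cdk.App();

blueprints.EksBlueprint.builder()
  .account(process.env.CDK_DEFAULT_ACCOUNT)
  .region(process.env.CDK_DEFAULT_REGION)
  // Register a hosted zone lookup for ExternalDNS (placeholder domain).
  .resourceProvider('MyHostedZone', new blueprints.LookupHostedZoneProvider('example.com'))
  .addOns(
    new blueprints.addons.VpcCniAddOn(),
    new blueprints.addons.KubeProxyAddOn(),
    new blueprints.addons.CoreDnsAddOn(),
    new blueprints.addons.AwsLoadBalancerControllerAddOn(),
    new blueprints.addons.ExternalDnsAddOn({ hostedZoneResources: ['MyHostedZone'] }),
    new blueprints.addons.KarpenterAddOn(),
  )
  .build(app, 'container-orchestration-blueprint'); // placeholder stack id
```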
How it works
A solution that provides a solid baseline of capabilities on which to build our recommended container orchestration solution is EKS Blueprints. We don’t believe in re-inventing the wheel, so we use this as the baseline to build upon. We then configure supported add-ons on top of the baseline EKS Blueprints configuration to provide the capabilities that we believe are critical to operate and orchestrate an efficient, secure, and scalable containerized solution.
When the blueprint is deployed, it creates a fully functional EKS cluster that uses Karpenter for efficient provisioning of worker nodes to meet the requirements of the application(s), with support for native VPC networking, integration of Route 53 with Kubernetes Ingresses, Services, and AWS load balancers, and metrics and log forwarding for observability.
Figure-01
Blueprint
This section of the blueprint uses makefile commands to build a fully functioning EKS cluster with all necessary container orchestration capabilities.
- CDK deployment – The EKS cluster with all add-ons is created and deployed into the target AWS account using CDK. This creates the VPC, provisions the EKS cluster, and creates the RDS database and the Secrets Manager secret for the database. In addition, the CDK deployment sets up and configures CodePipeline, CodeBuild, and Elastic Container Registry (ECR).
- Building the container images – The makefile includes commands to sync the code to CodeCommit, triggering CodePipeline and CodeBuild to build and push the container image into ECR.
Observability
Robust insight into the cluster’s live activity, performance trends, and operational history is vital to ensuring a reliable experience and effectively managing resources. Applications and infrastructure should provide data on usage, performance, and log events to standardized analytics tools where possible.
Where appropriate, applications will provide HTTP endpoints for collecting standardized metrics (such as Prometheus formatted metrics) and supporting standard Kubernetes health checking. Application and process logs should use standardized formats (such as JSON) and include trace information to facilitate searching and trace correlation.
Great care should be taken to integrate any new monitoring tools into existing monitoring infrastructure and processes. Kubernetes has a different operating model overall and requires different troubleshooting skills compared to many infrastructure strategies, and integration into existing monitoring tools can help ease this transition.
Components
- AWS EKS Addons:
- MetricsServerAddOn – Deploys the Kubernetes Metrics Server, a scalable, efficient source of container resource metrics for Kubernetes’ built-in autoscaling pipelines
- AwsForFluentBitAddOn – A log processor and forwarder that allows you to collect data like metrics and logs from different sources, enrich them with filters, and send them to multiple destinations
- ContainerInsightsAddOn – Container Insights uses the AWS Distro for OpenTelemetry (ADOT) collector to aggregate and summarize performance metrics from containerized applications and microservices. The aggregated metrics are viewable in CloudWatch dashboards and the CloudWatch console.
How it works
Deploying this blueprint through the CDK automatically configures the Metrics Server, Fluent Bit, and Container Insights add-ons. The EKS Blueprints framework provides a modular approach to selecting add-ons and offers the ability to automatically configure them with best practices applied. The configuration for the Metrics Server is specified in the main CDK application’s blueprint.ts file on line 76, the configuration for Fluent Bit on lines 94-112, and the Container Insights add-on at line 113 of the same file. When the CDK is deployed, the cluster is created and the add-ons are automatically installed. These add-ons are essentially pre-configured Helm charts that install Kubernetes resources onto the cluster. In the case of Fluent Bit, a daemonset runs on each node, collecting logs and forwarding them to CloudWatch.
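A trimmed sketch of how these add-ons are composed follows; the CloudWatch region and log group in the Fluent Bit values are placeholders, and the authoritative configuration lives in this solution’s blueprint.ts.

```typescript
import * as blueprints from '@aws-quickstart/eks-blueprints';

const observabilityAddOns: blueprints.ClusterAddOn[] = [
  // Cluster metrics for the built-in autoscaling pipelines (HPA).
  new blueprints.addons.MetricsServerAddOn(),
  // Fluent Bit daemonset that forwards container logs to CloudWatch.
  new blueprints.addons.AwsForFluentBitAddOn({
    values: {
      cloudWatch: {
        enabled: true,
        region: 'us-east-1',                   // placeholder region
        logGroupName: '/aws/eks/demo-cluster', // placeholder log group
      },
    },
  }),
  // ADOT-based performance metrics surfaced in CloudWatch dashboards.
  new blueprints.addons.ContainerInsightsAddOn(),
];
```

These add-ons are then passed to the blueprint builder’s addOns(...) call alongside the core orchestration add-ons.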
The overall solution is shown in the diagram below:
Figure-02
Blueprint
This section of the blueprint uses makefile commands to deploy the CDK code, setting up the cluster and addons for observability.
- The cdk deploy command is run as part of the default make command to deploy the cluster and addons.
Deployment
Once you have the core containerization solution in place, you deliver the planned business value by deploying containerized applications onto it.
For this we need an appropriate deployment strategy. GitOps is a strategy that originally came out of Weaveworks in about 2017 and has spread rapidly throughout the cloud native ecosystem. GitOps describes a method of managing infrastructure through declarative IaC configuration stored in a Git repository; it does not require a prescribed set of tools and can be adapted to many different environments.
Vertical Relevance recommends using a GitOps based strategy for deploying all cluster resources where technically feasible. At a minimum all resources should be defined using IaC to ensure repeatable and resilient lifecycle management of cluster resources.
A GitOps-based strategy is extremely flexible, and it is important to tailor a strategy that meets organizational requirements and works with existing workflows where it can. We recommend using tools such as ArgoCD or FluxCD to continually manage cluster workloads. Using an automated tool such as these ensures that configuration is applied on a continual basis, keeping the cluster in conformance with the desired state of the infrastructure.
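As a sketch of what this looks like with the EKS Blueprints ArgoCD add-on, the snippet below points ArgoCD at a Git repository to reconcile continuously; the repository URL, path, and revision are placeholders.

```typescript
import * as blueprints from '@aws-quickstart/eks-blueprints';

// Bootstrapping ArgoCD against a workload repo: anything committed under
// the given path is continually synced to the cluster.
const argoCd = new blueprints.addons.ArgoCDAddOn({
  bootstrapRepo: {
    repoUrl: 'https://github.com/example-org/workloads.git', // placeholder repo
    path: 'envs/dev',        // placeholder path within the repo
    targetRevision: 'main',
  },
});
```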
Clients request the app by the defined hostname stored in Route 53 records that point to Application Load Balancers (ALBs). The ALBs hold rules that route incoming requests by hostname to the Kubernetes services, which are wrapped into AWS target groups. Target group health checks, along with Kubernetes readiness and liveness probes, ensure that requests are served by pods operating within the optimal metrics defined by the horizontal pod autoscaler. As the pods are stressed or relaxed outside of the defined bands, Kubernetes provisions or removes pods, safely draining them or waiting for them to be ready for traffic after the initial startup procedure completes.
Components
- ArgoCD – compares the desired configuration in the Git repo with the actual state in the Kubernetes cluster and syncs what is defined in the Git repo to the cluster, overriding any manual updates and guaranteeing that the Git repo remains the single source of truth
- Karpenter – simplifies Kubernetes infrastructure with the right nodes at the right time. It launches just the right compute resources to handle your cluster’s applications. It is designed to let you take full advantage of the cloud with fast and simple compute provisioning for Kubernetes clusters
- Helm – a package manager for Kubernetes that makes it easy to take applications and services that are either highly repeatable or used in multiple scenarios and deploy them to a typical Kubernetes cluster. Application charts are divided by the kind of object they manage. There are four main objects in our charts (a trimmed sketch of the Ingress follows this list):
- Ingress – Defines the ALB and rules to pass traffic to the service
- Service – An abstract way to expose an application running on a set of Kubernetes Pods as a network service. With Kubernetes you don’t need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes Pods are given their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them
- Deployment – Defines Kubernetes Pod specs such as container ports and resource requirements, along with node selectors and health and readiness probe paths. These configurations will require fine-tuning on a per-app basis; however, our apps start with a basic example
- Horizontal Pod Autoscaler – replaces the hard-coded replica count in the Deployment definition, allowing Kubernetes to manage the number of pods so they stay within the optimal operating metrics defined. This is an area that can require some fine-tuning based on your application and the demand clients place on it
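For illustration, here is a trimmed sketch of the Ingress object as a CDK manifest (in a Helm chart it would be a templated YAML equivalent); the hostname, service name, and ports are placeholders.

```typescript
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

cluster.addManifest('PublicIngress', {
  apiVersion: 'networking.k8s.io/v1',
  kind: 'Ingress',
  metadata: {
    name: 'public',
    namespace: 'public',
    annotations: {
      // Instruct the AWS Load Balancer Controller to create a public ALB
      // and register pod IPs directly in the target group.
      'alb.ingress.kubernetes.io/scheme': 'internet-facing',
      'alb.ingress.kubernetes.io/target-type': 'ip',
    },
  },
  spec: {
    ingressClassName: 'alb',
    rules: [{
      host: 'app.example.com', // placeholder hostname, published via ExternalDNS
      http: {
        paths: [{
          path: '/',
          pathType: 'Prefix',
          backend: { service: { name: 'frontend', port: { number: 80 } } },
        }],
      },
    }],
  },
});
```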
How it works
Deploying this blueprint implements the ArgoCD deployment of the containerized solution: a Spring application for API services and a React front end, contained within the overall polling application. Argo CD orchestrates the deployment of the React front end and the API services using two separate Helm charts, and uses a third Helm chart to package and deploy the application charts as an application of applications. This pattern allows us to declaratively specify one Argo CD application that contains the other loosely coupled applications (a sketch of such a parent application follows).
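Here is a sketch of the application-of-applications pattern: one ArgoCD Application whose source path contains the child Application manifests. The repository URL, path, and names are placeholders.

```typescript
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

cluster.addManifest('PollingAppOfApps', {
  apiVersion: 'argoproj.io/v1alpha1',
  kind: 'Application',
  metadata: { name: 'polling-app', namespace: 'argocd' },
  spec: {
    project: 'default',
    source: {
      repoURL: 'https://github.com/example-org/workloads.git', // placeholder repo
      path: 'charts/polling-app', // chart that templates the child Applications
      targetRevision: 'main',
    },
    destination: { server: 'https://kubernetes.default.svc', namespace: 'argocd' },
    // Automatically prune removed resources and revert manual drift.
    syncPolicy: { automated: { prune: true, selfHeal: true } },
  },
});
```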
When executing the ArgoCD deployment, the following steps are performed in the blueprint:
- Verify login to ArgoCD on the cluster
- Create the public and api namespaces on the cluster
- The public namespace will be used for the React front end
- The api namespace will be used for the Spring API services
- Configure a deny-all network policy for the two namespaces (a sketch follows this list)
- Configure the git repository to deploy the applications from
- Create and deploy the overall polling application
- This coordinates and deploys the React front end application and the Spring API services
- Synchronize the application configurations and ensure they are operating successfully and are healthy
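The deny-all policy from the steps above can be as small as the sketch below (shown for the api namespace); an empty podSelector matches every pod, and listing both policy types blocks all ingress and egress until more specific policies allow traffic.

```typescript
import * as eks from 'aws-cdk-lib/aws-eks';

declare const cluster: eks.Cluster; // the blueprint-provisioned cluster

cluster.addManifest('ApiDenyAll', {
  apiVersion: 'networking.k8s.io/v1',
  kind: 'NetworkPolicy',
  metadata: { name: 'deny-all', namespace: 'api' },
  spec: {
    podSelector: {},                    // match every pod in the namespace
    policyTypes: ['Ingress', 'Egress'], // deny both directions by default
  },
});
```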
The overall solution is shown in the diagram below:
Figure-03
Blueprint
This section of the blueprint uses makefile commands to deploy the React frontend and Spring backend applications onto the EKS cluster using Helm charts and ArgoCD.
- Update-values – Run commands to update the values files in-place (utilizing CDK outputs). Once values have been updated, commit and push them to your private repo
- ArgoCD proxy – Set up to allow the deployment of the application front and back ends using ArgoCD and the deployment of the Kubernetes dashboard
- K8s dashboard – Configure the Kubernetes dashboard to monitor the configuration and use of the cluster
- Spring apps – Deploys the front-end and back-end applications onto the cluster
Benefits
- Scalability: With container orchestration, you can easily scale your applications up or down, depending on demand. This ensures that you’re able to provide a consistent user experience, regardless of how many users are accessing your applications at any given time.
- Automation: Container orchestration automates many of the tasks and best practices involved in managing containerized workloads, reducing the need for manual configuration and intervention. This frees up your teams to focus on more strategic initiatives.
- Resilience: Container orchestration provides resilience by automatically managing the availability of your applications. If a container fails, it can automatically spin up a new one, ensuring that your application remains available to users.
- Lower Total Cost of Ownership: EKS relieves organizations of the responsibility of managing the Kubernetes control plane, reducing the complexity of building out a foundation for container orchestration.
End Result
In conclusion, the use of container orchestration with EKS and additional tools provides a comprehensive solution for managing containerized workloads and services. Helm simplifies the deployment of repeatable or multi-use applications to a Kubernetes cluster. ArgoCD ensures that the desired configuration in the Git repo is always synced with the actual state in the Kubernetes cluster, making the Git repo the single source of truth. Karpenter simplifies Kubernetes infrastructure by launching the right compute resources at the right time, allowing for fast and simple compute provisioning for Kubernetes clusters. Other AddOns enable manifest files to manage other required configuration such as DNS records and load balancing. These tools work together to provide a streamlined and efficient way to manage containers at scale, enabling organizations to improve their engineering agility and operational efficiency.