Scaling Amazon EC2


What is Scalability?

Scalability means starting with only the resources you need and designing your architecture to scale out or in automatically as demand changes. As a result, you pay only for the resources you actually use, and you don't have to worry about running short of computing capacity to meet your needs.

AWS provides the Amazon EC2 Auto Scaling service to handle this scaling automatically.

Amazon EC2 Auto Scaling:

Amazon EC2 Auto Scaling is a service that automatically adjusts the number of Amazon Elastic Compute Cloud (EC2) instances you run based on application demand. It helps you maintain application availability, optimize resource utilization, and reduce costs.

With EC2 Auto Scaling, you can define scaling policies that automatically add or remove EC2 instances from your application's fleet when predefined conditions are met. These conditions can be based on metrics such as CPU utilization, network traffic, or custom metrics. A scaling policy can increase or decrease the number of instances, so your application can handle variable loads.
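The idea of a metric-driven scaling policy can be illustrated with a small, self-contained sketch. It is not the EC2 Auto Scaling API: the function name `decide_instance_count` and both CPU thresholds are assumptions chosen for illustration, and a real policy would be configured in AWS rather than written by hand.

```python
from statistics import mean

# Hypothetical thresholds for this sketch; real policies are tuned per application.
SCALE_OUT_CPU = 70.0  # add an instance above this average CPU percentage
SCALE_IN_CPU = 30.0   # remove an instance below this average CPU percentage


def decide_instance_count(current_count, cpu_samples):
    """Return the new instance count given recent CPU readings (percent)."""
    average_cpu = mean(cpu_samples)
    if average_cpu > SCALE_OUT_CPU:
        return current_count + 1  # scale out: the fleet is running hot
    if average_cpu < SCALE_IN_CPU and current_count > 1:
        return current_count - 1  # scale in, but never below one instance
    return current_count  # load is within the comfortable band


print(decide_instance_count(2, [82.0, 91.5, 77.0]))  # busy fleet: prints 3
print(decide_instance_count(3, [12.0, 18.5, 9.0]))   # idle fleet: prints 2
```

The gap between the two thresholds is deliberate: it keeps the fleet from repeatedly adding and removing an instance when the load hovers around a single cutoff.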

Within Amazon EC2 Auto Scaling, you can use two approaches:

  1. Dynamic Scaling - Dynamic scaling in EC2 Auto Scaling is the ability to automatically adjust instance capacity in response to current workload demand. It ensures that your application has the appropriate capacity to handle varying levels of traffic or load without requiring manual intervention.

  2. Predictive Scaling - Predictive scaling in EC2 Auto Scaling is the use of machine learning algorithms to forecast demand and proactively adjust instance capacity. It analyzes past usage patterns, seasonal trends, and other factors to predict the expected demand for your application. By leveraging these predictions, predictive scaling automatically adds or removes instances ahead of time, ensuring that your application can handle the anticipated workload without any manual intervention.
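To make the difference from dynamic scaling concrete, here is a toy forecast-then-provision sketch. The real service relies on machine learning; this example swaps in a naive same-hour average as a stand-in, and the request counts, the `REQUESTS_PER_INSTANCE` figure, and both function names are made up for illustration.

```python
import math

# Hypothetical figure: requests per hour that one instance can serve.
REQUESTS_PER_INSTANCE = 1000


def forecast_requests(history, hour):
    """Average the load seen at the same hour on previous days."""
    samples = [day[hour] for day in history]
    return sum(samples) / len(samples)


def instances_needed(expected_requests):
    """Round up so the forecast load fits, keeping at least one instance."""
    return max(1, math.ceil(expected_requests / REQUESTS_PER_INSTANCE))


# Three made-up days of hourly request counts (only hours 0-3 shown).
history = [
    [200, 900, 2600, 3100],
    [250, 1100, 2400, 2900],
    [150, 1000, 2500, 3000],
]

for hour in range(4):
    expected = forecast_requests(history, hour)
    print(hour, expected, instances_needed(expected))
```

Because the capacity for each hour is computed from history before that hour arrives, instances can be launched ahead of the expected load instead of after a metric has already crossed a threshold.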

Note: To scale faster, you can use dynamic scaling and predictive scaling together.

For example:

In cloud computing, computing power is a programmatic resource, so you can take a more flexible approach to scaling.

By adding Amazon EC2 Auto Scaling to an application, you can add new instances when necessary and terminate them when they are no longer needed. Suppose that, when configuring the size of your Auto Scaling group, you set the minimum number of Amazon EC2 instances to one. This means that at all times there must be at least one Amazon EC2 instance running.

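Minimum Capacity - The smallest number of Amazon EC2 instances the Auto Scaling group keeps running. If you do not set a desired capacity, this is also the number of instances that launch immediately after you create the group.

Desired Capacity - The number of instances the group tries to maintain. For example, you can set it to two Amazon EC2 instances even though your application needs a minimum of a single Amazon EC2 instance to run.

Maximum Capacity - The upper limit on the size of the group. For example, you might configure the Auto Scaling group to scale out in response to increased demand, but only to a maximum of four Amazon EC2 instances.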
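The relationship between the three capacity settings can be sketched as a simple bounds check. The `GroupSize` class and its `request` method are hypothetical names for this illustration, not part of any AWS SDK; the values 1, 2, and 4 come from the example above.

```python
from dataclasses import dataclass


@dataclass
class GroupSize:
    minimum: int
    desired: int
    maximum: int

    def __post_init__(self):
        # The three settings must be ordered for the group to make sense.
        if not 0 <= self.minimum <= self.desired <= self.maximum:
            raise ValueError("expected 0 <= minimum <= desired <= maximum")

    def request(self, wanted):
        """Set desired capacity to `wanted`, clamped to the configured bounds."""
        self.desired = max(self.minimum, min(wanted, self.maximum))
        return self.desired


group = GroupSize(minimum=1, desired=2, maximum=4)
print(group.request(7))  # demand spike asks for 7, capped at the maximum: 4
print(group.request(0))  # quiet period asks for 0, held at the minimum: 1
```

However aggressive a scaling policy is, the group in this sketch never leaves the range between its minimum and maximum, which is what keeps both availability and cost predictable.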

Note: You pay only for the instances you use, when you use them, which makes this approach cost-effective.

Directing traffic with Elastic Load Balancer:

AWS Elastic Load Balancer (ELB)

AWS Elastic Load Balancer (ELB) is a service that automatically distributes incoming application traffic across multiple targets, such as EC2 instances, containers, or IP addresses. It improves your applications' availability, fault tolerance, and scalability.

ELB acts as a single point of contact for your application and spreads incoming requests across the backend instances. This helps ensure that no single instance is overloaded and helps prevent performance bottlenecks.

For example:

Suppose you host a highly popular e-commerce website on multiple EC2 instances. AWS Elastic Load Balancer distributes incoming traffic across those instances to keep the site available. As traffic spikes during peak hours, Amazon EC2 Auto Scaling adds instances to the Auto Scaling group, and the load balancer starts routing requests to the new instances. The load balancer performs health checks, stops sending traffic to unhealthy instances, and directs requests to healthy ones. Features such as SSL termination and content-based routing help optimize performance. Together, Elastic Load Balancer and Auto Scaling let your website handle fluctuating demand reliably.
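The traffic distribution and health check behavior described here can be sketched with a toy balancer. This is an illustration only: a plain round-robin rotation stands in for whatever routing algorithm the real service applies, and the class name `RoundRobinBalancer` and the instance names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: rotates through targets and skips unhealthy ones."""

    def __init__(self, targets):
        self.targets = list(targets)
        self.healthy = set(self.targets)
        self._rotation = cycle(self.targets)

    def mark_unhealthy(self, target):
        self.healthy.discard(target)

    def mark_healthy(self, target):
        if target in self.targets:
            self.healthy.add(target)

    def route(self):
        """Return the next healthy target, or raise if none is available."""
        for _ in range(len(self.targets)):
            candidate = next(self._rotation)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy targets available")


balancer = RoundRobinBalancer(["instance-a", "instance-b", "instance-c"])
balancer.mark_unhealthy("instance-b")  # simulate a failed health check
print([balancer.route() for _ in range(4)])
# prints ['instance-a', 'instance-c', 'instance-a', 'instance-c']
```

Requests keep flowing to the two healthy instances while `instance-b` is out of rotation, and it would rejoin as soon as `mark_healthy` is called for it.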

Lower-Demand Period:

A low-demand period occurs when the incoming traffic to your application is much lower than usual. During these times, Amazon EC2 Auto Scaling can reduce the number of instances in the Auto Scaling group to optimize resource utilization and reduce expenses, while the ELB routes traffic only to the instances that remain.

For example:

In a coffee shop scenario, a low-demand period would be the off-peak hours when customer traffic is minimal. For instance, on weekdays the coffee shop may see a surge in customers during morning and lunch hours but fewer visitors in the late afternoon. During these quieter periods, the shop can close some of its registers or order-taking terminals, just as Auto Scaling removes instances, while the ELB plays the role of a host who directs each customer to a register that is still open. This optimizes resource utilization, reduces costs, and ensures that only the terminals needed for the lower customer volume are operational.

High-Demand Period:

In the coffee shop scenario, a high-demand period is a time when many more customers arrive than usual, such as the morning and lunchtime rushes.

For example:

On weekdays, the coffee shop experiences high customer traffic in the mornings and at lunchtime. During these high-demand periods, more registers or order-taking terminals are opened, just as Auto Scaling adds instances, and the ELB spreads incoming customers across all open registers so that no single one is overwhelmed. When the rush ends in the late afternoon or evening, the extra registers close again, keeping costs in line with demand.
