Auto Scalability = PEACE!!!

Technology & Life
5 min read · Aug 3, 2022

Scalability is a well-known concept with several benefits. Designing a scalable solution, however, brings architectural complexities of its own, and a few more get added when you want scaling to happen automatically.

Yet, if configured properly, auto scalability can give you a peaceful life, knowing that everything (cost and performance) is being taken care of.

Sleep Peacefully: Auto Scalability is Addressing the Concerns.

Let’s discuss all these in this article.

Scalability?

Scalability means adding (or removing) resources as per the load on the system. It can be categorised as horizontal scaling or vertical scaling.

Horizontal scaling deals with adding (or removing) nodes; vertical scaling deals with adding (or removing) resources on an existing node. For example, adding more web servers behind a load balancer is horizontal scaling, while upgrading a server's RAM or CPU is vertical scaling.

Fig: Different Scalability Options

In this article we will talk about Horizontal Scalability only.

Traditional Approach of Achieving Scalability

Traditionally, scaling up/down required constant monitoring and alerting.

Scaling up would require system administrators to manually procure more nodes, configure them, deploy applications onto them, and add them to the cluster.

Depending on the underlying architecture, this can take anywhere from hours to days.

Scaling down would also require human effort, though it is much faster than scaling up.

Disadvantages of Traditional Approach

  1. The overall process is extremely slow because there are too many human touch points.
  2. The approach may lead to subpar system performance (when the system is overloaded) or wasted resources (when traffic is low) until the system administrator(s) get involved.
  3. The approach may hurt the financial bottom line (human effort and resource wastage) and may hurt the top line (sub-optimal customer experience).

Auto Scalability

There are several tools/systems that can monitor the current or upcoming load on a system and automatically add or remove nodes from the cluster; Kubernetes' Horizontal Pod Autoscaler and AWS Auto Scaling groups are well-known examples.

These systems may require an initial investment (capital, human resources) but lead to greater returns if configured and used properly.

Fig: Auto Scaling System

Prerequisite

While not a hard requirement, auto scaling works best for distributed systems.

This is because not all modules of a system have the same scalability requirements.

Some high-performing modules may rarely need to scale up, while slower modules may need to scale up more often.

You may want to refer to my previous article on Architecting Distributed Systems.

Criteria for Auto-Scaling Up/Down

The system can be configured to react to both current load (internal to the nodes) and upcoming load (external to the nodes).

Decision based on Current Processing Stats

The system can be configured to auto-scale (up/down) based on memory, CPU, etc. For example, we can configure it to add more nodes if memory usage touches X% and to remove nodes if it falls to Y%. The same can be done for other resources such as CPU.
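
As a minimal sketch of this idea, the snippet below makes the add/remove decision from current CPU and memory percentages. The function name and the threshold values are illustrative assumptions, not tied to any particular auto-scaling tool.

```python
# Sketch: threshold-based scaling decision driven by current resource stats.
# Thresholds and metric sources are illustrative assumptions.

def decide_scaling(cpu_pct: float, mem_pct: float,
                   scale_up_at: float = 75.0,    # the "X%" threshold
                   scale_down_at: float = 30.0   # the "Y%" threshold
                   ) -> int:
    """Return +1 to add a node, -1 to remove one, 0 to do nothing."""
    # Scale up if either resource crosses the upper threshold.
    if cpu_pct >= scale_up_at or mem_pct >= scale_up_at:
        return +1
    # Scale down only when both resources are below the lower threshold.
    if cpu_pct <= scale_down_at and mem_pct <= scale_down_at:
        return -1
    return 0


print(decide_scaling(cpu_pct=82.0, mem_pct=60.0))  # +1 -> add a node
print(decide_scaling(cpu_pct=20.0, mem_pct=25.0))  # -1 -> remove a node
```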

Decision based on Upcoming Load

Systems can also be configured to scale up/down based on upcoming events, such as the number of pending tasks in an event bus.

If the number of events in the bus is more than X, add more nodes; if it is less than Y, remove nodes.
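
A similar sketch for backlog-driven decisions is shown below. The X/Y thresholds are assumptions for illustration, and the queue depth is treated as a plain number; in practice it would come from your event bus (queue length, consumer lag, etc.).

```python
# Sketch: scaling decision driven by upcoming load (pending events in a queue).
# Thresholds are illustrative assumptions.

def decide_scaling_by_backlog(queue_depth: int,
                              scale_up_above: int = 1000,   # "X"
                              scale_down_below: int = 100   # "Y"
                              ) -> int:
    """Return +1 to add a node, -1 to remove one, 0 to hold steady."""
    if queue_depth > scale_up_above:
        return +1
    if queue_depth < scale_down_below:
        return -1
    return 0


print(decide_scaling_by_backlog(5000))  # +1 -> backlog is building up
print(decide_scaling_by_backlog(40))    # -1 -> queue is nearly drained
```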

Which Criteria (Current Load or Upcoming Load) to Use?

This is where the architect has to be extremely careful. There are pros and cons to both approaches (as I always say, every coin has two faces).

Decision based on Upcoming Load

Being aware of upcoming events gives the system time to prepare itself before processing begins. This ensures the system keeps performing well when heavier load is on its way.

At the same time, when the load or the number of events drops, this approach can misfire: if not configured properly, the system may remove nodes that are still processing a previous event. This can lead to serious issues.

(+) Making the decision based on upcoming load ensures that system performance does not degrade under heavy load.

(-) Making the decision based on upcoming load can lead to killing nodes that are in the middle of processing a previous event.
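
One common safeguard against this is to drain a node before terminating it: stop routing new events to it, wait for its in-flight work to finish, and only then remove it. The sketch below assumes a hypothetical Node object with simple flags; real platforms expose comparable hooks (connection draining, termination lifecycle hooks).

```python
import time

# Sketch: never terminate a node that is still processing an event.
# The Node class and its attributes are hypothetical placeholders.

class Node:
    def __init__(self, name: str):
        self.name = name
        self.accepting_work = True
        self.in_flight_tasks = 0


def drain_and_terminate(node: Node, poll_seconds: float = 5.0,
                        timeout_seconds: float = 300.0) -> bool:
    """Stop new work, wait for in-flight tasks, then terminate."""
    node.accepting_work = False          # stop routing new events to this node
    waited = 0.0
    while node.in_flight_tasks > 0 and waited < timeout_seconds:
        time.sleep(poll_seconds)         # wait for the current event to finish
        waited += poll_seconds
    if node.in_flight_tasks == 0:
        print(f"terminating idle node {node.name}")
        return True
    print(f"{node.name} still busy after {timeout_seconds}s; keeping it")
    return False


node = Node("node-7")
drain_and_terminate(node)  # terminates immediately since nothing is in flight
```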

Decision based on Current Load

Making the scalability decision based on current load may respond slowly to heavy load. The system waits for resource usage to reach the threshold before spinning up a new node, and by the time the new node is ready it may not be able to contribute to the processing already in progress, so it may sit idle as well.

At the same time, if configured properly, this approach ensures the system does not kill a node that is still processing when the load on the system goes down.

(-) Making the decision based on current load can lead to degraded performance and wasted resources under heavy load.

(+) Making the decision based on current load ensures that no active node is killed when the load on the system drops.

Recommendation

Auto-scalability is THE business requirement; as mentioned above, it can affect both the top line and the bottom line of a company.

While architects can choose any tool/system to achieve auto-scalability, they have to be extremely careful in choosing how they want the system to auto-scale.

If not configured properly, it can lead to serious issues (active nodes getting killed) or fail to deliver the expected advantage (a heavily loaded node running alongside an idle one).

My recommendation is to configure the system to look at upcoming events for scaling up and at the current load on the nodes for scaling down.

Choosing the correct configs for auto-scaling requires serious analysis in itself; architects should look at multiple factors (average/max/min load on the system, average/max/min traffic volume, etc.).

This approach of mixing both kinds of decision making (where possible) can give the solution the best of both worlds.
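
To make the recommendation concrete, here is a minimal sketch of such a hybrid policy: scale up when the backlog grows, scale down only when a node is both under-utilised and free of in-flight work. Every name and threshold in it is an illustrative assumption.

```python
from dataclasses import dataclass

# Sketch of the recommended hybrid policy:
#   - scale UP based on upcoming load (queue depth)
#   - scale DOWN based on current load, and only by removing nodes
#     that are not processing anything.

@dataclass
class NodeStats:
    name: str
    cpu_pct: float
    in_flight_tasks: int


def plan_scaling(queue_depth: int, nodes: list[NodeStats],
                 backlog_up_threshold: int = 1000,
                 idle_cpu_threshold: float = 20.0) -> dict:
    # Upcoming load says we need more capacity: add a node proactively.
    if queue_depth > backlog_up_threshold:
        return {"action": "add", "count": 1}

    # Current load says a node is idle: remove it, but only if it has
    # no in-flight work, so an actively processing node is never killed.
    idle = [n for n in nodes
            if n.cpu_pct < idle_cpu_threshold and n.in_flight_tasks == 0]
    if queue_depth == 0 and idle:
        return {"action": "remove", "nodes": [n.name for n in idle[:1]]}

    return {"action": "none"}


nodes = [NodeStats("node-a", 85.0, 3), NodeStats("node-b", 5.0, 0)]
print(plan_scaling(queue_depth=0, nodes=nodes))     # removes idle node-b
print(plan_scaling(queue_depth=5000, nodes=nodes))  # adds a node
```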

All the best in setting up your next Auto-Scalable System.
