Azure Container Instances (ACI) is a great example of something organizations will never be able to do on-premises: scale high and pay peanuts. Let me tell you a real-world story where ACI came to the rescue.
Microsoft describes ACI as "fast and easy containers", and that is really what they are. I was recently involved in analyzing an application suffering from scalability issues, mostly around memory contention. Scaling the on-premises infrastructure up and out helped, but it only bought time. It is also worth noting that in traditional IT we usually scale up and out but rarely down, meaning that even if you only need extra power at peak times, you usually pay for the whole infrastructure whether it is sleeping or not. Containerization softens that somewhat, but you still need to buy hardware and computing power is not unlimited, and that is where public cloud providers make the difference.

To come back to the application: refactoring the code to make better use of memory was not possible because of functional constraints, so under load (multiple concurrent users, etc.) memory invariably became an issue, leading to swapping and to unexpected application crashes. In a nutshell, the application analyzes large files, performs many different checks to validate their contents, and is written in .NET.
My first thought was Azure Databricks, since it is designed to handle large volumes very efficiently. The problem is that it implies a complete technology shift (Python, R, etc.), meaning rewriting almost everything. Then I figured ACI could be handy, and that assumption proved right: from hours of execution on unstable on-premises systems, we dropped to minutes or even seconds in ACI. Along the way we also moved from .NET to .NET Core, which is better suited to containerization, and porting .NET code to .NET Core is a far easier move than a full rewrite. Enough talk; let's see what it looks like in practice through a very simplified, high-level architecture diagram:
In short, the large files are uploaded to blob storage, which triggers the orchestrator client (Durable Functions); from there, activities spin up ACI instances on the fly and clean them up after use. The beauty of ACI is that you can create instances dynamically, pass them parameters through environment variables, and let them report back to the orchestrator via queues or other mechanisms. ACI is particularly suited to short-lived, resource-hungry workloads. It has built-in restart-policy support that kicks in on failure, and it requires no knowledge of orchestrators such as Kubernetes: a mere Dockerfile is enough to package a workload. But what I really like is that a single container instance can get a lot of computing power, and that's exactly what we need for job-like operations.
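To make this concrete, here is a sketch of what an activity could run to create and tear down a container group with the Azure CLI. All names (resource group `my-rg`, registry `myregistry`, image `file-checker`, the environment variables) are hypothetical placeholders, not the actual project's values:

```shell
# Hypothetical names throughout; adjust to your own resources.
# Spin up a container group on the fly, passing the blob to process
# through an environment variable. --restart-policy OnFailure gives
# the built-in retry behavior mentioned above.
az container create \
  --resource-group my-rg \
  --name file-checker-001 \
  --image myregistry.azurecr.io/file-checker:latest \
  --registry-login-server myregistry.azurecr.io \
  --registry-username "$ACR_USER" \
  --registry-password "$ACR_PASS" \
  --cpu 2 \
  --memory 16 \
  --restart-policy OnFailure \
  --environment-variables BLOB_URL="$BLOB_URL" RESULT_QUEUE=results

# ...and clean it up once the orchestrator sees the job complete:
az container delete --resource-group my-rg --name file-checker-001 --yes
```

In this setup the container reads `BLOB_URL` at startup, does its checks, posts the outcome to a queue, and exits; the orchestrator then deletes the group so billing stops.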
The service is billed per second of execution, which is why you pay peanuts provided your executions are short-lived. If we allocate 16 GB and 2 vCPUs per container group and envision 10 executions of 5 minutes per day, the total bill comes to around 4 euros a month, on top of which you have to add the cost of a container registry, which varies roughly between 5 and 50 euros a month. To conclude, ACI is an example of true elasticity, bringing on-demand power in a fast and reliable way for very little money.
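The back-of-the-envelope math behind that figure looks like this. The per-second rates below are illustrative assumptions close to ACI's published Linux prices at the time of writing; check the Azure pricing page for current figures:

```python
# Back-of-the-envelope ACI cost estimate.
# Rates are ASSUMPTIONS for illustration, not official prices.
VCPU_RATE_PER_SEC = 0.0000113    # EUR per vCPU-second (assumed)
MEM_RATE_PER_GB_SEC = 0.0000012  # EUR per GB-second (assumed)

vcpus = 2
memory_gb = 16
executions_per_day = 10
minutes_per_execution = 5
days_per_month = 30

seconds_per_month = executions_per_day * minutes_per_execution * 60 * days_per_month
vcpu_cost = vcpus * seconds_per_month * VCPU_RATE_PER_SEC
memory_cost = memory_gb * seconds_per_month * MEM_RATE_PER_GB_SEC
total = vcpu_cost + memory_cost
print(f"{seconds_per_month} billable seconds -> ~EUR {total:.2f}/month")
# -> 90000 billable seconds -> ~EUR 3.76/month
```

Only 90,000 billable seconds in a whole month is what keeps the bill so low: you pay for 25 hours of compute, not 720.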