Azure Container Instances Magic

Azure Container Instances (ACI) are a great example of what organizations will never be able to do on-premises: scale high and pay peanuts. Let me tell you a real-world story where ACI came to the rescue.

Microsoft describes ACI as "fast and easy containers," and that is really what they are. I was recently involved in analyzing an application suffering from scalability issues, mostly around memory contention. Scaling the on-premises infrastructure up and out helped, but it only bought time. It is also worth noting that, in traditional IT, we usually scale up and out but rarely down, meaning that even if you only need extra power at peak times, you usually pay for the whole infrastructure whether it is sleeping or not. That statement is less true with containerization, although you still need to buy hardware and computing power is not unlimited; that is where public cloud providers make the difference.

Back to the application: refactoring the code to make better use of memory was not possible because of functional constraints, so at some point (multiple concurrent users, etc.) memory always became an issue, leading to swapping and to unexpected application crashes. In a nutshell, the application, written in .NET, analyzes large files and performs many different checks to validate their contents.
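The appeal of ACI for this kind of workload is the fan-out pattern: each large file gets its own short-lived container, sized for that file rather than for the worst case, and everything scales back to zero afterwards. Here is a toy Python sketch of that sizing logic; the memory factor and function names are my own illustration, not from the actual solution:

```python
import math

def container_spec(file_size_gb: float, mem_per_gb: float = 1.5) -> dict:
    """Size a short-lived container for one file (hypothetical sizing factor)."""
    return {
        "cpu": 1,
        "memory_gb": max(1, math.ceil(file_size_gb * mem_per_gb)),
    }

def fan_out(file_sizes_gb: list) -> list:
    """One container per file: scale out at peak, back down to zero afterwards."""
    return [container_spec(size) for size in file_sizes_gb]

# Three incoming files of very different sizes, each gets its own instance.
specs = fan_out([0.5, 2.0, 8.0])
print(specs)
```

With a pay-per-second model, the 8 GB file briefly costs more than the small ones, but nothing is billed once the containers are gone, which is exactly what on-premises scaling cannot offer.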

Continue reading

Posted in Azure, Containers

The Azure Architect Map

Hi,

Azure is so broad that it is sometimes difficult to find your way. Although a list of services already exists, I tried to include extra decision factors to help you choose one solution over another. For instance, if you were to design a microservices architecture over containers and hesitated between AKS and Service Fabric Mesh, one differentiating factor is how these platforms handle service state. These factors are by no means comprehensive and are even subject to discussion, but since this is a mind map, I have to remain high-level and there is no room for arguing. I have articulated the map around 8 typical concerns when deploying workloads to the cloud:

  • Monitoring
  • Security
  • Workload types (SoE, SoI, SoX, SoR)
  • Connectivity
  • Identity
  • CI/CD
  • Governance and Compliance
  • Containerization.

I haven’t segregated FaaS, PaaS, and IaaS, and some services belong to multiple high-level nodes, but this should help you identify some key areas and key services.

Here is a screenshot of the map:

[map screenshot]

and you can access it in the map format through the following URL:

https://app.mindmapmaker.org/#m:mm7fa610765ea749049a3e3f2845880e25

Feel free to collapse/expand nodes, as grasping everything from the image alone is a bit challenging.

Posted in Azure

From network-in-depth to defense-in-depth in the era of serverless architectures

A traditional way of implementing defense in depth is to rely heavily on the network. Traditional security architects are somewhat obsessed with the network and consider it the primary protection layer for whatever asset they want to protect, in whatever kind of architecture, to the extent that they have turned the defense-in-depth principle into a network-in-depth one.

The second pillar is usually authentication, the third is encryption, and finally monitoring comes to supplement these inevitable layers.

As many of you may have noticed, this kind of approach is perfectly suitable for traditional IT, where we have full control over the datacenter and where scale is usually limited. But the network is no longer the primary layer for massively distributed systems and event-driven architectures, which are the realm of serverless. Think about IoT or geo-distributed systems: how could you restrict those to a given network perimeter? The principle of defense in depth is to rely on multiple protection mechanisms; the network should only be one of them, when available.

I recently analyzed the Azure service catalog (network and management services excluded) and gathered some statistics: 18% of the services can be created inside a VNET, 19% can integrate with a VNET (meaning they can interact with resources inside a VNET while sitting outside of it), and 65% can be protected by their own firewall. I should add that whenever it is possible to lock things down at the network level, doing so will usually raise your costs and you will lose true elasticity.

To give a concrete example, consider Azure Data Factory with the Microsoft-hosted runtime: you can simply consume the service as is, but you cannot control the runtime's IP, nor do you even know its IP range. If that runtime needs to access a data store (and that is the whole purpose of an ETL), you cannot use the data store's firewall to restrict access to that ETL only, since it does not come with a static IP or IP range. To mitigate that, you can host the Data Factory runtime yourself, either on-premises or on virtual machine(s). Doing so will not only increase your costs but also kill your elasticity (or seriously reduce it, since VM scale sets are not as elastic as a native serverless offering), while the Microsoft-hosted runtime adapts 100% to the demand, cost-wise too.

Another nice example of pure serverless awesomeness: CDN. A CDN service cannot be locked down network-wise, and that is not the point anyway, since it is supposed to serve any user worldwide. Some may argue that you usually do not serve sensitive content through a CDN, and I agree, but you can put a CDN in front of a restricted storage account, so you never know what users could put inside. Moreover, in the CDN world, there is the famous hotlinking issue, where unexpected consumers start consuming your CDN-exposed resources, impacting both the bandwidth and the costs (because you will pay the bill), so sometimes the CDN itself can become a problem. How does Verizon deal with that? By letting you define an encrypted token containing rules the consumer must comply with, not by locking down access to a specific network. This example shows that we have to think outside the box.
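The general idea behind such tokens can be sketched without the network: the edge validates a signed, time-limited token rather than a source IP. This is a minimal HMAC-based illustration in Python, not Verizon's actual token format (their scheme and key handling differ):

```python
import hashlib
import hmac

SECRET = b"cdn-shared-secret"  # hypothetical key shared with the CDN edge

def make_token(client_ip: str, expires_at: int) -> str:
    """Sign the rules (allowed IP + expiry) the consumer must comply with."""
    payload = f"{client_ip}|{expires_at}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def check_token(token: str, client_ip: str, now: int) -> bool:
    """Edge-side check: signature intact, IP matches, token not expired."""
    ip, expires_at, sig = token.rsplit("|", 2)
    payload = f"{ip}|{expires_at}"
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and ip == client_ip and now < int(expires_at)

token = make_token("203.0.113.7", expires_at=1_900_000_000)
print(check_token(token, "203.0.113.7", now=1_800_000_000))   # True: legitimate consumer
print(check_token(token, "198.51.100.9", now=1_800_000_000))  # False: hotlinker
```

The protection travels with the request itself, which is exactly the mindset shift: the rule set (who, until when) replaces the network perimeter.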

So, network-driven security certainly leads to increased costs and defeats the promise of PaaS and FaaS, where the typical benefits (time-to-market and costs) are based on economies of scale and multi-tenancy. Azure is not the only platform where network-based security is hardly doable; even AWS, which is renowned for its network capabilities, does not behave differently with services such as CloudSearch (Azure Search), SQS (Service Bus), SWF (Logic Apps), or Kinesis Firehose and Kinesis Streams (Event Hubs), just to name a few. Even AWS Lambda (Azure Functions) was originally designed to run in a non-predefined network perimeter, and although it is possible to run it inside a VPC (like hosting the Azure Functions runtime on an ILB ASE), that is not recommended from a performance and scalability perspective.

On top of this, the newborn Azure Sentinel (preview) is itself part of the serverless offering. Isn't it ironic for a SIEM to sit outside of a controlled network perimeter? Oh, and I was almost forgetting the panacea: conditional access. Nice try! While Azure Active Directory conditional access is indeed a very good way to control network boundaries, it is far from covering all scenarios. For instance, any Azure resource that is not subject to Azure Active Directory authentication will not benefit from conditional access. More importantly, at the time of writing, any client ID/client secret pair having access through RBAC, or any AAD app leveraging the client credentials flow, is not subject to conditional access. So, as a malicious insider, I could just grab such a pair of credentials and play from whatever network perimeter I want. Should we stop using the public cloud because of that? Hell no! The truth is that security people need to reinvent themselves and start considering the network as only one of the elements.
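To make the point concrete, here is a sketch of the request body a client credentials grant sends to the AAD token endpoint (tenant ID, app ID, and secret are placeholders; the request is built but deliberately not sent). Nothing in it identifies a user, a device, or a network, which is why conditional access has nothing to evaluate:

```python
from urllib.parse import urlencode

tenant_id = "00000000-0000-0000-0000-000000000000"  # placeholder tenant
token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"

# The whole grant: just an app identity and a secret. Anyone holding the
# pair can POST this body to the token endpoint from anywhere on the internet.
body = urlencode({
    "grant_type": "client_credentials",
    "client_id": "11111111-1111-1111-1111-111111111111",  # placeholder app
    "client_secret": "<leaked-secret>",
    "scope": "https://management.azure.com/.default",
})
print(body)
```

This is why rotating secrets, or removing them entirely with managed identities, matters far more here than any firewall rule.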

What other protection layers do we have? Identity (whether AAD or keys), encryption at rest and in transit, and even better, client-side encryption. Rotate access keys very frequently and use managed identities as much as possible so that passwords/secrets cannot even be leaked, unless Azure itself is cracked!

What else? Well, what about the application itself? This is particularly relevant in serverless, since the whole point is that serverless services (Azure Functions, Databricks, etc.) are by design restricted in what they can do towards the underlying hosts (because of course there is always a server behind, even in serverless architectures). What about true DevSecOps, where you enforce security controls and application code robustness in an automated way through your CI/CD pipelines? By the way, pentests (although not fully suited to agile methodologies) are still possible too. There are numerous ways to compensate for the "loss" of network control.
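As an illustration of such automated pipeline controls, here is a minimal check of my own (not a real tool) that a CI step could run against an ARM template, flagging storage accounts that do not enforce HTTPS-only traffic:

```python
def insecure_storage_accounts(template: dict) -> list:
    """Return names of storage accounts in an ARM template that allow plain HTTP."""
    bad = []
    for resource in template.get("resources", []):
        if resource.get("type") == "Microsoft.Storage/storageAccounts":
            props = resource.get("properties", {})
            if not props.get("supportsHttpsTrafficOnly", False):
                bad.append(resource.get("name", "<unnamed>"))
    return bad

# Sample template: one compliant account, one that forgot the setting.
template = {
    "resources": [
        {"type": "Microsoft.Storage/storageAccounts", "name": "goodsa",
         "properties": {"supportsHttpsTrafficOnly": True}},
        {"type": "Microsoft.Storage/storageAccounts", "name": "badsa",
         "properties": {}},
    ]
}

violations = insecure_storage_accounts(template)
print(violations)  # a CI step would fail the build when this list is non-empty
```

This is the same philosophy AzSK's ARM template checker (discussed below) applies at scale: the control fires before anything is deployed, regardless of any network perimeter.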

But don’t get me wrong: I am not telling you to discard the network entirely. I am even in favor of using this protection layer whenever possible, but it should not drive a PaaS and FaaS security strategy, since that would simply defeat the whole purpose. In a Digital Transformation program (where the cloud usually plays an important role), there is the word Transformation, which implies changing habits, reinventing oneself, and finding new methods of achieving similar results. Thinking about alternatives is absolutely necessary!

So, to anyone having access to C-level people: conveying such a message is important to avoid a waste of time, money, and energy. Culture (especially in security) is by far the most difficult aspect to handle. Last but not least: we all work for a business; IT for IT's sake or security for security's sake does not make sense. Our job is to highlight the risks, define the residual risk, and let the business take informed decisions about whether they want to accept it or not.

Posted in Azure

Enforcing security controls right from CI/CD pipeline with AzSK – Deep Dive

Azure Security Kit, aka AzSK, is a framework used internally by Microsoft to control and govern their Azure subscriptions. While some features overlap with Azure Security Center, I find a lot of value in the kit, mostly in the following areas:

  • The attestation module, allowing for full traceability of security control deviations and of the justification for why a given control was not respected, which may be very useful in case of an internal/external audit
  • The CI/CD extensions available on the marketplace. These make it possible to enforce security controls as of CI builds, so very early in the application lifecycle. On top of the Azure DevOps extensions, the kit also ships with a Visual Studio extension that provides security guidance as the developer types code.
  • An ARM template checker module, available from CI/CD as well as from command line
  • A lot of room for customizing control settings, creating organization specific controls, etc.
  • It is free; the only costs are incurred by the few resources (a storage account and an Automation account) required to use the kit, and overall they are very low.

Continue reading

Posted in Azure

My top 10 guiding principles for a successful Cloud journey

Hi,

Today, most companies have at least some workloads in the cloud, but sometimes at the cost of a long and tortuous journey. Here are some guiding principles that I consider important for a successful, or at least less painful, journey. The sequence below is more or less logical, but activities relating to different principles can be executed in parallel.

1. Understand well your business drivers.

Continue reading

Posted in Azure

Enhancing the security of Azure Automation Webhooks in an Azure DevSecOps context

Hi,

Webhooks are a very convenient way to integrate APIs in general and to call Azure Automation runbooks in particular, but while they are very useful and easy to work with, they raise some security concerns. To give a concrete example, if you create a webhook against a runbook that leverages Azure Automation Hybrid Workers, causing that runbook to execute against on-premises machines and/or within your network boundaries, you might want to make sure that the webhook consumer is actually entitled to do so. Continue reading
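One common mitigation, sketched here in Python for brevity (real runbooks are often PowerShell, and the header name is my own choice), is to have the consumer sign the request body with a pre-shared key and to validate that signature inside the runbook before doing any work:

```python
import hashlib
import hmac

SHARED_KEY = b"rotate-me-often"  # pre-shared with legitimate consumers only

def sign(body: bytes) -> str:
    """Consumer side: compute the signature sent in a custom request header."""
    return hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()

def is_legitimate(body: bytes, headers: dict) -> bool:
    """Runbook side: reject the call if the signature is absent or wrong."""
    received = headers.get("x-runbook-signature", "")
    return hmac.compare_digest(received, sign(body))

body = b'{"machine": "onprem-srv-01", "action": "restart"}'
print(is_legitimate(body, {"x-runbook-signature": sign(body)}))  # True
print(is_legitimate(body, {}))                                   # False
```

Since the webhook URL itself is a bearer secret that anyone could replay, this extra check ensures that merely knowing the URL is not enough to reach your on-premises machines.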

Posted in Azure

Understand the impact of websockets on the Azure Application Gateway

Hi,

WebSockets are admittedly not the most commonly used technology, although they are very useful in every near real-time scenario. The thing is, they may have a dramatic impact on the behavior of the Azure Application Gateway, mostly regarding the monitoring aspects.

While the gateway works perfectly with WebSockets, the associated diagnostics may seem wrong at first, especially when sharing a single gateway across multiple backends that do not all use WebSockets. You might indeed end up with charts looking like this:

[latency chart]

where you see your latency increasing a lot, with frequent peaks. So, if you set up an alert on this latency, you might end up with false positives. When digging further, you realize that this abnormal latency is in fact due to WebSockets. Continue reading
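The skew is easy to reproduce: a long-lived WebSocket connection is reported with a "latency" spanning its whole lifetime, which drags the average far above what regular HTTP requests actually experience. A toy illustration (all numbers invented):

```python
# Per-request latencies in ms: fast HTTP calls plus one long-lived
# WebSocket connection reported as a single huge "request".
http_latencies = [12, 15, 11, 14, 13]
websocket_lifetime_ms = 600_000  # a 10-minute open connection

all_samples = http_latencies + [websocket_lifetime_ms]

avg_with_ws = sum(all_samples) / len(all_samples)
avg_http_only = sum(http_latencies) / len(http_latencies)

print(avg_with_ws)    # skewed far above the real backend latency
print(avg_http_only)  # the figure you actually care about for alerting
```

This is why latency alerts on a gateway shared by WebSocket and plain HTTP backends should exclude (or separately track) the WebSocket listeners rather than alert on the aggregate.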

Posted in Azure