Why machine identity management is key to secure service mesh deployment
Wed, 15th Feb 2023

Despite gathering macroeconomic headwinds, the promise of digital transformation remains undimmed. In fact, during challenging business conditions, cloud projects offer much-needed opportunities for efficient growth, productivity enhancements, and greater agility. Kubernetes has emerged as a key component of these initiatives as Platform teams deploy multiple clusters across multi-cloud environments. But the resulting complexity has driven the need for improved governance.

This is where service meshes like Istio come into their own. But delivering on their promise isn’t easy. It will require careful planning, expert guidance, and the use of cloud-agnostic machine identity management tooling.

Introducing service mesh

Kubernetes use has only accelerated in 2022. Data from this year reveals that a record 96% of organizations are either using or evaluating the technology, up from 83% in 2020 and 78% in 2019. Not only has headline usage increased—the way organizations are deploying the tech in projects has also matured. Yet as they build out clusters across clouds, the need for oversight becomes more acute.

Service meshes offer an increasingly popular option. Acting as a separate infrastructure layer that sits on top of Kubernetes clusters, they provide a range of network connectivity and security features for those clusters. These include mutual TLS (mTLS) for transparent service-to-service encryption using TLS certificates; out of the box, this means all communication between workloads is authenticated and encrypted. And because all traffic flows through the service mesh, it allows for deeper observability, including traceability of pod-to-pod requests and performance insight. Users also benefit from more deployment options, traffic customization, and circuit breaking, for example in a situation where pods can’t communicate.
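
As a rough illustration, here is a minimal sketch of how two of those features, strict mTLS and circuit breaking, might be switched on in Istio. The resource kinds are standard Istio APIs, but the namespace and service names are hypothetical placeholders.

```yaml
# Enforce mutual TLS for every workload in a hypothetical "payments" namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: payments
spec:
  mtls:
    mode: STRICT
---
# Basic circuit breaking for a hypothetical "orders" service: pods that keep
# returning 5xx errors are temporarily ejected so callers stop sending them traffic.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: orders-circuit-breaker
  namespace: payments
spec:
  host: orders.payments.svc.cluster.local
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
```

The exact fields and defaults vary between mesh implementations and versions, but the pattern is the same: declare policy once and let the mesh enforce it everywhere.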

While there are multiple service mesh vendors out there, Istio and Linkerd are perhaps the most widely recognized today. Acting as transparent, language-agnostic frameworks, they deliver all those service mesh benefits of uniform observability, traffic management, and policy-driven security, doing a job that might otherwise require multiple point solutions.

The roadblocks to Istio deployment

While these benefits are undeniable, so are the challenges that go along with implementing Istio, or indeed any service mesh architecture. This can put off organizations that lack the time, money, and in-house skills to support such projects. One challenge relates to sidecars. Every pod in the mesh has a main application container and a sidecar container holding the service mesh proxy. The sidecar is the secret sauce that enables organizations to harness all those service mesh benefits, because network traffic to and from the pod’s application container is routed through it.
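
In Istio, for example, the sidecar is usually added by automatic injection: label a namespace and the proxy container is inserted into every pod scheduled there. A minimal sketch, using a hypothetical "payments" namespace:

```yaml
# Enable automatic sidecar injection for all pods in this (hypothetical) namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: payments
  labels:
    istio-injection: enabled
```

Once the label is in place, a freshly deployed pod shows two containers: the application itself and the injected istio-proxy sidecar.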

However, sidecars add latency and take up CPU and memory resources, which can be a major headache at scale. Even if each sidecar only takes 5MB of memory and 0.1 CPU, multiplied by 100,000 pods it would represent an enormous resource drain. Istio, with its use of the open-source Envoy proxy as the sidecar, is particularly susceptible to this. Linkerd opted to develop its own lighter-weight, Rust-based proxy sidecar, and provides compelling research suggesting its latency impact and resource usage are much lower. In a tacit acknowledgement of the limitations of the sidecar proxy approach, Istio has announced a preview of ‘Ambient Mesh’, which will provide a node-based proxy that takes on some of the functionality previously provided by Envoy.
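
Teams that do run Envoy sidecars at scale often tune the proxy’s footprint per workload. The sketch below uses Istio’s per-pod resource annotations on a hypothetical Deployment; the figures are illustrative rather than recommendations.

```yaml
# Override the injected sidecar's CPU/memory requests and limits for one workload.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
  namespace: payments
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
      annotations:
        sidecar.istio.io/proxyCPU: "100m"
        sidecar.istio.io/proxyMemory: "128Mi"
        sidecar.istio.io/proxyCPULimit: "200m"
        sidecar.istio.io/proxyMemoryLimit: "256Mi"
    spec:
      containers:
        - name: app
          image: registry.example.com/orders:1.0   # hypothetical image
          ports:
            - containerPort: 8080
```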

Great strides have been made to make it easier to deploy Istio and service meshes in general, but it’s still incredibly complex to troubleshoot connectivity issues and configure the architecture, especially in large environments. There are also unintentional operational headaches introduced by mTLS. Although it works well with HTTP traffic, mTLS becomes more problematic with TCP traffic – for example, if a pod needs to talk to a database.
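
A common workaround, sketched below for a hypothetical in-cluster Postgres workload, is to keep strict mTLS across the mesh but relax it on the specific TCP port that clients without a sidecar need to reach.

```yaml
# Keep STRICT mTLS overall, but allow plaintext on the database port (hypothetical
# "postgres" workload on port 5432) so clients without a sidecar can still connect.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: postgres-port-exception
  namespace: payments
spec:
  selector:
    matchLabels:
      app: postgres
  mtls:
    mode: STRICT
  portLevelMtls:
    5432:
      mode: PERMISSIVE
```

Exceptions like this should be documented and revisited, since every port left in PERMISSIVE mode can still accept unencrypted connections.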

Getting started

The truth is that Kubernetes comes with a tremendously steep learning curve, and adding a service mesh on top only makes it steeper. The first requirement is to create a prioritized list of goals the organization hopes to achieve from using a service mesh. Decide what it’s going to be used for: primarily mTLS, north-south ingress gateways, or east-west gateways between clusters? Consider how it should be used when a pod needs to communicate with an external service, like a database not located in the same cluster. How will traffic be encrypted, if desired? What should the traffic routing policies be?
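
For the external-service question in particular, a mesh like Istio can register the external endpoint so that traffic to it can be routed, secured, and observed rather than passing through as unknown traffic. A minimal sketch with a hypothetical database hostname:

```yaml
# Make an external database visible to the mesh so its traffic can be routed
# and observed like any other destination (hostname and port are placeholders).
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-postgres
  namespace: payments
spec:
  hosts:
    - db.example.internal
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
    - number: 5432
      name: tcp-postgres
      protocol: TCP
```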

Machine identity management should also be addressed. Each service mesh control plane has a component that deals with certificate management. When the control plane is installed, it creates a root certificate authority (CA) for each cluster and uses it to sign the certificates it issues. But having a self-signed root CA certainly isn’t best practice, especially in highly regulated sectors like financial services. For organizations running multiple service meshes, and therefore multiple self-signed root CAs, the issue multiplies. Organizations must also remember that pods have a relatively short shelf-life, and each one will need its own certificate.
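
In Istio, one common mitigation is to replace that generated root with an intermediate CA issued by the organization’s existing PKI. istiod looks for it in a secret named cacerts in the istio-system namespace; the sketch below shows the expected shape of that secret, with the PEM contents left as placeholders to be populated from the enterprise CA.

```yaml
# Plug an enterprise-issued intermediate CA into Istio via the "cacerts" secret
# that istiod looks for (PEM bodies below are placeholders).
apiVersion: v1
kind: Secret
metadata:
  name: cacerts
  namespace: istio-system
type: Opaque
stringData:
  ca-cert.pem: |        # the intermediate signing certificate
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  ca-key.pem: |         # the intermediate's private key
    -----BEGIN PRIVATE KEY-----
    ...
    -----END PRIVATE KEY-----
  root-cert.pem: |      # the enterprise root certificate
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  cert-chain.pem: |     # chain from the intermediate up to the root
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
```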

Self-signed certificates don’t offer the visibility and control that security teams demand. They aren’t signed by a publicly trusted CA, cannot be revoked, and never expire. These are all security red flags. Teams that are serious about security best practices therefore need to invest in a cloud-agnostic, automated way to manage the full process of identity issuance and lifecycle management, using a control plane for cross-cluster visibility and configuration. Tools like cert-manager and Jetstack Secure help to deliver the security and visibility organizations need in their service meshes while mitigating compliance risk and freeing up developer time.
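
As a rough sketch of what automated issuance looks like with cert-manager (resource names and the CA secret are hypothetical; wiring issuance into Istio itself is typically handled by cert-manager’s istio-csr integration):

```yaml
# A cluster-wide issuer backed by an intermediate CA held in a secret in the
# cert-manager namespace; in practice this could instead point at an external
# PKI service. All names here are hypothetical.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: mesh-ca
spec:
  ca:
    secretName: mesh-intermediate-ca
---
# A short-lived certificate that cert-manager issues and renews automatically,
# illustrating the lifecycle management described above.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: orders-tls
  namespace: payments
spec:
  secretName: orders-tls
  duration: 24h
  renewBefore: 8h
  dnsNames:
    - orders.payments.svc.cluster.local
  issuerRef:
    name: mesh-ca
    kind: ClusterIssuer
```

The benefit is that issuance, renewal, and policy live in one declarative place rather than in each cluster’s self-signed CA.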

Machine identity is by no means the only roadblock to effective service mesh deployment, but it’s an important step. With the right professional guidance and a carefully planned approach, organizations can take giant strides in their digital transformation projects.