Provisioning Infrastructure using an Infrastructure Operator and GitOps

Business problem

Organizations with growing or complex platforms often find it challenging to manage the underlying infrastructure securely and reliably while also enabling application teams to work efficiently. Application teams often must use resources and consume services created and maintained by other teams, and it can be difficult to understand how to deploy, configure, and manage those dependencies. Such work often requires coordination across teams and manual steps, leading to communication gaps, timing issues, mistakes, bugs, and quality issues. Automated orchestration is a good way to reduce the complexity of deploying and managing infrastructure. Without it, provisioning infrastructure can become onerous and resource-intensive.

Solution

An infrastructure operator allows you to provision and manage infrastructure using Kubernetes as the control plane. It's based on the Kubernetes Operator model, which extends the functionality of the Kubernetes API with custom resources and controllers. Combined with GitOps, this solution lets you abstract complex tasks and automate resource allocation and maintenance, with capabilities such as automated deployments, continuous delivery, and easy rollbacks.

Value proposition

Consistency and reliability with declarative infrastructure management - A Kubernetes infrastructure operator allows you to define your infrastructure declaratively using YAML manifests. The operator does the work to provision and configure the infrastructure described in the manifest. Reliability is increased because the infrastructure is defined declaratively in Git-managed files, which provides a full audit history of changes and easy rollbacks. This approach ensures that the desired state of your infrastructure is defined explicitly, making it easier to understand and manage changes. Advanced enterprises may store their Kubernetes configurations, including the infrastructure definition, in dedicated GitOps repos, which allows cluster provisioning and configuration to be managed independently of the hosted applications.

Automation and abstraction - The Kubernetes infrastructure operator pattern encapsulates domain-specific knowledge and best practices, allowing you to implement abstractions for complex infrastructure management tasks as custom resources. Such abstractions can accelerate application development teams by removing the requirement for platform-specific knowledge. With GitOps, changes to infrastructure configurations are automatically applied to the Kubernetes cluster based on the commits made to the Git repository containing the infrastructure manifests.

Infrastructure visibility and auditing - GitOps provides a centralized location to view and manage infrastructure configurations. A GitOps repo allows all team members to view the current configuration, and PRs are used to propose, review, and commit changes. A repo provides a historical record of all configuration changes, enabling a full audit history.

Multi-cloud - The Kubernetes infrastructure operator pattern allows you to customize and extend Kubernetes to fit your specific infrastructure needs in different clouds and hybrid deployments.

Multi-cluster - GitOps allows enterprises to manage multiple Kubernetes clusters in a single version-controlled Git repository, simplifying multi-cluster management.

Ecosystem and community - Kubernetes operators have gained popularity, and as a result, there is a growing ecosystem of pre-built infrastructure-related operators available for various applications and services. You can use these existing operators to deploy complex infrastructure with minimal effort.

Logical architecture

The following logical architecture diagram illustrates the use of an Infrastructure Operator and other components required to build a Kubernetes-based control plane for provisioning and orchestrating infrastructure.

Infrastructure Control Plane - logical diagram

Management cluster

A fundamental component of this solution is a dedicated Kubernetes cluster that serves as the control plane for managing infrastructure. The custom resources within this cluster define the desired state of the infrastructure it controls.

GitOps

A Git repository serves as the version-controlled central source of truth for the desired infrastructure state. The GitOps repository stores YAML manifests and configuration files that define the infrastructure components, their desired states, and any configuration changes. Changes committed to this repository drive the provisioning of the infrastructure.

A GitOps Operator in the management cluster watches the GitOps repository for changes. When changes are detected, the operator reconciles the state in the cluster with the desired state specified in the repository. The outcome is that the custom resources in the management cluster are updated to match the desired state specified in the GitOps repository. The state store in the management cluster now matches the desired state of the infrastructure that was defined in the GitOps repository.
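The reconciliation step described above can be illustrated with a minimal sketch. It is not how any particular GitOps operator is implemented: `ClusterState` and `reconcile` are hypothetical names, the cluster's state store is modeled as an in-memory dictionary, and the desired state is assumed to have already been read from the repository into a dictionary keyed by resource name.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical in-memory stand-in for the management cluster's state store."""
    resources: dict[str, dict] = field(default_factory=dict)


def reconcile(desired: dict[str, dict], cluster: ClusterState) -> list[str]:
    """Make the cluster's custom resources match the desired state.

    Returns a log of the actions taken, in a deterministic order.
    """
    actions = []
    # Create or update every resource declared in the repository.
    for name in sorted(desired):
        spec = desired[name]
        if name not in cluster.resources:
            cluster.resources[name] = dict(spec)
            actions.append(f"create {name}")
        elif cluster.resources[name] != spec:
            cluster.resources[name] = dict(spec)
            actions.append(f"update {name}")
    # Prune resources that are no longer declared in the repository.
    for name in sorted(set(cluster.resources) - set(desired)):
        del cluster.resources[name]
        actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    cluster = ClusterState({"rg-demo": {"location": "westus"}, "old-cluster": {"nodes": 1}})
    desired = {"rg-demo": {"location": "eastus"}, "aks-demo": {"nodes": 3}}
    print(reconcile(desired, cluster))  # ['create aks-demo', 'update rg-demo', 'delete old-cluster']
    print(reconcile(desired, cluster))  # [] (nothing left to do)
```

Running the function a second time with the same desired state produces no actions. That idempotence is what allows an operator to re-run reconciliation on a schedule or on every detected change.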

Infrastructure

Infrastructure operators in the management cluster watch the custom resources that represent the infrastructure components. When an operator detects changes that have been propagated from the GitOps repository, it communicates with the appropriate infrastructure provider to carry out the requested infrastructure changes.

Infrastructure operators may support a single cloud (e.g., Azure) or multiple environments (e.g., hybrid, Azure, AWS, Google, edge).
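As a rough sketch of the behavior described in this section, the example below shows an operator handling one changed custom resource and dispatching it to a provider. Everything in it is an assumption made for illustration: the `Provider` interface, the `FakeProvider` test double, the `handle_change` function, and the resource fields (`provider`, `appliedSpec`, `externalId`) are hypothetical and do not correspond to the API of Cluster API, Crossplane, or any cloud SDK.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical interface to an infrastructure provider's API."""

    def apply(self, kind: str, name: str, spec: dict) -> str: ...


class FakeProvider:
    """Test double that records calls instead of creating real infrastructure."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def apply(self, kind: str, name: str, spec: dict) -> str:
        self.calls.append((kind, name))
        return f"/fake/{kind}/{name}"


def handle_change(resource: dict, providers: dict[str, Provider]) -> dict:
    """React to one changed custom resource and return it with updated status."""
    spec = resource["spec"]
    status = resource.get("status", {})
    # Skip resources whose current spec has already been applied.
    if status.get("appliedSpec") == spec:
        return resource
    # Pick the provider named by the resource; one entry means single-cloud,
    # several entries mean multi-cloud.
    provider = providers[resource["provider"]]
    external_id = provider.apply(resource["kind"], resource["name"], spec)
    return {**resource, "status": {"appliedSpec": dict(spec), "externalId": external_id}}


if __name__ == "__main__":
    azure = FakeProvider()
    resource = {
        "kind": "ResourceGroup",
        "name": "rg-demo",
        "provider": "azure",
        "spec": {"location": "eastus"},
    }
    applied = handle_change(resource, {"azure": azure})
    print(applied["status"]["externalId"])  # /fake/ResourceGroup/rg-demo
```

Keeping the provider behind a small interface is one way to support either a single cloud or several: the operator logic stays the same and only the registry of providers changes.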

Implementations

Consider the following when implementing a solution that uses an infrastructure operator with GitOps:

  • Simplicity of infrastructure: If the infrastructure is relatively simple, custom Operators might introduce unnecessary complexity.
  • Network connectivity: Complex network topologies may present more challenges and will likely require extra components for the necessary orchestration, such as proxies to connect clusters to the control plane.
  • Expertise: Developing and maintaining Kubernetes Operators or using 3rd-party operators may require specialized knowledge and expertise. If a team lacks the necessary expertise, it might be more practical to rely on other infrastructure management approaches that better align with the team's skill set.
  • Established infrastructure management tools: If an organization already has well-established infrastructure management tools or systems in place, introducing Kubernetes operators might add unnecessary overhead and complexity. In such cases, it's essential to evaluate the trade-offs and potential benefits before adopting the operator model.
  • Limited long-term support or community adoption: When considering 3rd-party operators, it's crucial to evaluate their support and community adoption. An operator that isn't actively maintained or lacks community support could lead to issues and challenges over time.

Provision Azure Kubernetes Service clusters with Cluster API

This implementation uses the Cluster API infrastructure operator, a specialized operator that deploys only Kubernetes clusters, using infrastructure providers to target different environments.

AKS Management Cluster with Cluster API

This implementation uses AKS as the management cluster. The management cluster is configured to use the Cluster API (CAPI) infrastructure operator and Cluster API Provider Azure (CAPZ) infrastructure provider to provision additional AKS clusters. Flux is a GitOps operator that is used within the provisioned AKS clusters to further configure them.

View GitHub repo

Provision Azure resources with Crossplane

This implementation uses the Crossplane Infrastructure Operator and the Flux GitOps Operator. Crossplane is a generalized operator that allows for deploying infrastructure resources across all the major cloud providers.

AKS Management Cluster with Crossplane

This implementation uses AKS as the management cluster. This cluster is configured to use Crossplane and the Azure Provider to provision an Azure resource group, an AKS cluster, and Azure policies. Flux is used as the GitOps operator.

View GitHub repo

Learn more