Automate permissions with Azure data services

One common challenge when automating the deployment of Azure data services is the separation between the Control Plane and the Data Plane. This separation exists in various Azure data services such as Azure Synapse, Microsoft Purview, Azure Databricks, and Azure Cosmos DB.

Because the data plane and control plane are separate, the Role-based Access Control (RBAC) roles available in the control plane only cover Create, Read, Update, Delete (CRUD) operations on the service itself. These operations include actions such as deleting the entire service, enabling or disabling features, and changing the SKU.

Data plane operations, in contrast, require handling that is specific to each service. For example, Azure Databricks uses access control lists to configure permissions on workspace objects. As a result, automating role assignments during deployment within CI/CD pipelines requires additional development effort whenever Azure CLI, Bicep, or Terraform don't directly support the operation.

Automate role assignments for Azure data services

This article explores various approaches to automate role assignments for Azure data services.

Managing roles in Azure Databricks and AKS using CLI

Services like Azure Databricks and Azure Kubernetes Service (AKS) each have a dedicated CLI to handle role management. For Databricks, the friction point is the need to use the Databricks CLI in Azure Pipelines to automate access. While there is currently an effort to remove this friction for AKS via Azure CLI and Bicep, the problem is still unsolved for Databricks.

The Databricks Data Plane controls access to resources such as clusters, jobs, secrets, and Unity Catalog, to name a few. Bicep only supports the Azure Control Plane (management.azure.com) because it is a layer on top of Azure Resource Manager. Because Unity Catalog isn't exposed as a resource in the Azure Control Plane, there's no easy path to apply Unity Catalog configurations from Bicep. This can only be done by using the Databricks CLI or the Terraform provider for Databricks. When using Bicep for platform deployment, extra Bash scripts must be executed on self-hosted agents to configure anything at the Databricks Data Plane level. Only the Databricks workspace itself can be created and configured using ARM/Bicep.
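To make the kind of data plane call that Bicep cannot express more concrete, here is a minimal Python sketch that prepares an access control list update for a cluster through the Databricks workspace Permissions REST API, which the Databricks CLI and the Terraform provider both build on. The function name `build_cluster_acl_request` and all sample values (workspace URL, cluster ID, group name) are hypothetical, and the sketch assumes a valid access token is already available to the pipeline. It only builds the request and does not send it.

```python
import json
import urllib.request


def build_cluster_acl_request(workspace_url, token, cluster_id, group_name, permission_level):
    """Build (but do not send) a PATCH request that adds one ACL entry to a cluster.

    PATCH adds to or updates the existing access control list rather than replacing it.
    """
    body = {
        "access_control_list": [
            {"group_name": group_name, "permission_level": permission_level}
        ]
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/permissions/clusters/{cluster_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical values; in a pipeline these would come from variables or secrets.
    request = build_cluster_acl_request(
        "https://adb-1234567890123456.7.azuredatabricks.net",
        "<token>",
        "0123-456789-abcdefgh",
        "data-engineers",
        "CAN_RESTART",
    )
    print(request.get_method(), request.full_url)
    # To apply the change, pass the request to urllib.request.urlopen(request).
```

Keeping the request construction separate from the network call makes the payload easy to review and unit test before it runs against a real workspace.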

Managing data plane roles in Azure Cosmos DB using Azure CLI

For the Azure Cosmos DB NoSQL API, data plane roles can be managed via Azure CLI. Azure CLI support offers a positive developer experience for permission automation in deployment pipelines, since role assignments can be made using AzureCLI@2 tasks.
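As a rough illustration, the following Python sketch assembles the `az cosmosdb sql role assignment create` command for one of the two built-in data plane roles. The helper names (`cosmos_role_assignment_command`, `assign_cosmos_role`) are hypothetical, and the sketch assumes it runs somewhere `az` is already installed and signed in, such as inside an AzureCLI@2 task.

```python
import shlex
import subprocess

# Role definition IDs of the built-in Cosmos DB NoSQL data plane roles.
BUILT_IN_ROLES = {
    "data-reader": "00000000-0000-0000-0000-000000000001",
    "data-contributor": "00000000-0000-0000-0000-000000000002",
}


def cosmos_role_assignment_command(account, resource_group, principal_id,
                                   role="data-reader", scope="/"):
    """Return the Azure CLI argument list for one data plane role assignment.

    A scope of "/" covers the whole account; narrower scopes target a
    database or container.
    """
    return [
        "az", "cosmosdb", "sql", "role", "assignment", "create",
        "--account-name", account,
        "--resource-group", resource_group,
        "--scope", scope,
        "--principal-id", principal_id,
        "--role-definition-id", BUILT_IN_ROLES[role],
    ]


def assign_cosmos_role(*args, **kwargs):
    """Run the command; raises CalledProcessError if the CLI call fails."""
    subprocess.run(cosmos_role_assignment_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical account, resource group, and principal object ID.
    command = cosmos_role_assignment_command(
        "my-cosmos-account", "my-resource-group",
        "11111111-2222-3333-4444-555555555555",
    )
    print(shlex.join(command))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when names or scopes contain special characters.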

Managing Azure Synapse Analytics roles

Azure Synapse uses several Data Plane RBAC roles to enforce fine-grained access control on resources in workspaces. Synapse role assignments can be automated through Azure CLI, using AzureCLI@2 tasks in deployment pipelines.
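One possible way to keep these assignments declarative is to describe them as a mapping from Synapse role name to assignees and expand that mapping into `az synapse role assignment create` calls. The sketch below is illustrative only: the function name `synapse_role_assignment_commands`, the workspace name, and the assignee IDs are all hypothetical, and it assumes the commands are later executed where `az` is already signed in.

```python
import shlex


def synapse_role_assignment_commands(workspace, assignments):
    """Expand {role name: [assignee, ...]} into one Azure CLI command per pair."""
    commands = []
    for role, assignees in assignments.items():
        for assignee in assignees:
            commands.append([
                "az", "synapse", "role", "assignment", "create",
                "--workspace-name", workspace,
                "--role", role,
                "--assignee", assignee,
            ])
    return commands


if __name__ == "__main__":
    # Hypothetical workspace and principal object IDs.
    desired = {
        "Synapse Contributor": ["11111111-2222-3333-4444-555555555555"],
        "Synapse Artifact User": [
            "66666666-7777-8888-9999-000000000000",
            "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
        ],
    }
    for command in synapse_role_assignment_commands("my-synapse-workspace", desired):
        # Printed for review; subprocess.run(command, check=True) would execute it.
        print(shlex.join(command))
```

Keeping the desired state in a single mapping makes role changes reviewable in a pull request, which fits the pipeline-driven workflow described here.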

Automating Microsoft Purview role assignments

Automating role assignments for Microsoft Purview currently requires extra development effort. Consider the following use case.

A common use case for permission automation within CI/CD pipelines is assigning an Azure AD group or a managed identity (MSI) to the Data Reader role, with a scope limited to a specific Data Collection. This role is required to view assets in the Data Catalog. For a full list of roles, refer to Access Control in Microsoft Purview.

Currently, automating role assignments requires the Metadata Policy API. Achieving a use case like the one above with this API takes a good amount of extra development work. To overcome this problem, see this code sample that uses a GitHub Action to set Purview permissions.
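To give a sense of where that development effort goes, the following Python sketch shows only the in-memory step of adding a group to the Data Reader role in a collection's metadata policy. The policy shape is an assumption about the API's response format (attribute rules whose IDs look like `purviewmetadatarole_builtin_data-reader:<collection>`, each holding a `dnfCondition` list of clauses) and should be verified against a policy fetched from a real account. The function name `add_principal_to_role` is hypothetical, and the HTTP calls that fetch the policy and write it back are left out.

```python
import copy


def add_principal_to_role(policy, collection_name, principal_id,
                          role="data-reader",
                          attribute="principal.microsoft.groups"):
    """Return a copy of the policy with principal_id added to the role's rule.

    ASSUMPTION: the policy JSON has properties.attributeRules, and the rule for
    a role in a collection has the ID "purviewmetadatarole_builtin_<role>:<collection>".
    Use attribute="principal.microsoft.id" for users and service principals.
    """
    updated = copy.deepcopy(policy)  # leave the caller's copy untouched
    rule_id = f"purviewmetadatarole_builtin_{role}:{collection_name}"
    for rule in updated["properties"]["attributeRules"]:
        if rule.get("id") != rule_id:
            continue
        for clause in rule.get("dnfCondition", []):
            for condition in clause:
                if condition.get("attributeName") == attribute:
                    members = condition.setdefault("attributeValueIncludedIn", [])
                    if principal_id not in members:  # keep the update idempotent
                        members.append(principal_id)
                    return updated
    # A fuller implementation would create the missing clause instead of failing.
    raise LookupError(f"No '{attribute}' condition found for rule '{rule_id}'")
```

A pipeline step would typically fetch the collection's policy, pass it through a function like this, and send the result back to the service. Making the update idempotent lets the step run safely on every deployment.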

For more information