Audit Your Azure Resources With Resource Graph
Use the Resource Graph to get a sense of your Azure environment before implementing controls needlessly.
I recently delivered a session at Microsoft Ignite The Tour in London around governance in Azure. One of the critical points in this session is that before you try and implement any controls around resources, cost, or security, you need to have a good understanding of what your Azure estate currently looks like and what resources you are making use of, so you know where to focus your effort.
There is no point spending lots of time implementing policies to restrict which size web apps you can deploy if no one is using them. Similarly, if you go ahead and lock down the allowed sizes of VMs to D2-16 V3, but 70% of your deployments are still using the V2 SKU, you are going to have some angry users. Before you implement any of these changes, it is essential that you have a good grasp of how your company is using Azure today. A great tool to help with this is Resource Graph.
Resource Graph is a command line tool that allows you to quickly and easily query your whole Azure estate using the familiar Kusto query language that is used in Log Analytics and App Insights. If you have never used Resource Graph before, I have written a separate primer on getting set up and running your first queries. For the rest of this article, we will look at how to use Resource Graph as part of our audit and data collection process, along with some useful queries to help you get a handle on your current Azure usage.
First, let's get a simple overview of which resources we use the most in Azure by listing resource types, ordered by count:
az graph query -q "summarize count() by type| project resource=type , total=count_ | order by total desc" --output table
Search-AzureRmGraph -Query "summarize count() by type| project resource=type , total=count_ | order by total desc" | Format-Table
Which results in a summarized table like this:
Resource                                          Total
------------------------------------------------  -----
(resource type missing)                           38
(resource type missing)                           29
microsoft.automation/automationaccounts/runbooks  26
microsoft.compute/disks                           15
microsoft.web/sites                               13
microsoft.compute/virtualmachines/extensions      13
microsoft.operationsmanagement/solutions          10
(resource type missing)                           10
microsoft.web/serverfarms                         9
microsoft.insights/components                     9
microsoft.keyvault/vaults                         8
(resource type missing)                           7
microsoft.compute/virtualmachines                 7
microsoft.compute/images                          7
microsoft.sql/servers/databases                   6
(resource type missing)                           6
(resource type missing)                           6
microsoft.insights/alertrules                     6
microsoft.insights/actiongroups                   4
microsoft.web/connections                         4
microsoft.automation/automationaccounts           4
(resource type missing)                           2
microsoft.managedidentity/userassignedidentities  2
microsoft.logic/workflows                         2
microsoft.eventhub/namespaces                     2
microsoft.sql/servers                             2
microsoft.documentdb/databaseaccounts             2
microsoft.streamanalytics/streamingjobs           2
microsoft.operationalinsights/workspaces          2
microsoft.recoveryservices/vaults                 2
(resource type missing)                           2
microsoft.containerservice/managedclusters        1
microsoft.containerregistry/registries            1
microsoft.devtestlab/schedules                    1
microsoft.portal/dashboards                       1
microsoft.sql/servers/jobagents                   1
(resource type missing)                           1
microsoft.compute/availabilitysets                1
(resource type missing)                           1
microsoft.aad/domainservices                      1
microsoft.containerinstance/containergroups       1
microsoft.compute/restorepointcollections         1
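To make the pipeline in that query concrete, here is a minimal Python sketch of what `summarize count() by type` followed by `order by total desc` computes. The records are invented for illustration and are not real Resource Graph output, and `count_by` is a hypothetical helper rather than part of any Azure tooling.

```python
from collections import Counter

# Hypothetical sample records; real Resource Graph rows carry many more fields.
resources = [
    {"name": "vm1", "type": "microsoft.compute/virtualmachines"},
    {"name": "vm2", "type": "microsoft.compute/virtualmachines"},
    {"name": "disk1", "type": "microsoft.compute/disks"},
    {"name": "disk2", "type": "microsoft.compute/disks"},
    {"name": "disk3", "type": "microsoft.compute/disks"},
    {"name": "site1", "type": "microsoft.web/sites"},
]


def count_by(rows, key):
    """Group rows by `key`, count each group, and sort by count descending."""
    counts = Counter(row[key] for row in rows)
    # Ties are broken by the key value so the ordering is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


summary = count_by(resources, "type")
for resource_type, total in summary:
    print(f"{resource_type:40} {total}")
```

Passing `"location"` instead of `"type"` (with a matching field in the records) would give the per-region breakdown in the same way.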
Now we have a better idea of what resources are in use in our organization. The next question is, where are these resources being used? Do we have some specific regions where perhaps we ought to focus our deployments and look at things like reserved instances?
az graph query -q "summarize count() by location | project total=count_, location | order by total desc " --output table
Search-AzureRmGraph -Query "summarize count() by location | project total=count_, location | order by total desc " | Format-Table
This query breaks our resource count down per region. In the example below, we can see that our focus is clearly in West Europe.
Location Total
------------------ -------
westeurope 219
global 6
uksouth 6
eastus 5
northeurope 5
westus2 2
eastasia 2
westindia 1
koreasouth 1
westus 1
ukwest 1
southeastasia 1
koreacentral 1
eastus2 1
japanwest 1
centraluseuap 1
australiasoutheast 1
francecentral 1
centralus 1
eastus2euap 1
centralindia 1
canadacentral 1
canadaeast 1
australiaeast 1
southindia 1
northcentralus 1
brazilsouth 1
westcentralus 1
southcentralus 1
japaneast 1
Whilst we can view the types and locations of resources in use easily, one thing that is much harder is grouping these resources by owner or purpose. One way to do this is to implement tags in your environment that denote this sort of thing, which then makes it much easier to query this information. For example, if you implement an "owner" tag for all resources, then you can query by that to see who owns the most resources:
az graph query -q "summarize count() by tostring(tags.owner) " --output table
Search-AzureRmGraph -Query "summarize count() by tostring(tags.owner) " | Format-Table
As you can see, I'm not very good at applying this tag!
Count_    Tags_owner
--------  ------------
252
16        sam
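A natural follow-up is to list the resources that have no owner tag at all, so you know what needs tagging. The sketch below builds such a query and wraps the same `az graph query` command used above. The helper names are hypothetical, the `isempty(tostring(...))` filter is my assumption about how to catch a missing tag, and `run_graph_query` assumes the Azure CLI with the Resource Graph extension is installed and logged in.

```python
import json
import shutil
import subprocess


def missing_tag_query(tag: str) -> str:
    """Build a query listing resources where the given tag is absent or empty."""
    # tostring() turns a missing tag into an empty string, which isempty() catches.
    return (
        f"where isempty(tostring(tags['{tag}'])) "
        "| project name, type, resourceGroup"
    )


def run_graph_query(query: str):
    """Run a query through the Azure CLI and return the parsed JSON output."""
    az = shutil.which("az")  # resolves az or az.cmd depending on the platform
    if az is None:
        raise RuntimeError("Azure CLI (az) not found on PATH")
    result = subprocess.run(
        [az, "graph", "query", "-q", query, "--output", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Only build and print the query here; call run_graph_query(query) to execute it.
query = missing_tag_query("owner")
print(query)
```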
Virtual Machines
We've looked at overall resources, so let's drop down and look at some resource-specific queries now, starting with VMs. We already mentioned that we want to get a better understanding of which VM sizes are in use, so we can do a quick query to get that information.
az graph query -q "where type =~ 'Microsoft.Compute/virtualMachines' | project SKU = tostring(properties.hardwareProfile.vmSize)| summarize count() by SKU" --output table
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Compute/virtualMachines' | project SKU = tostring(properties.hardwareProfile.vmSize)| summarize count() by SKU" | Format-Table
Which gives us a table grouped by VM type:
SKU Count_
--------------- --------
Standard_DS1_v2 4
standard_D1_v2 2
Standard_DS2_v2 1
OS Type
We can also query the VMs by OS to see how many Windows and Linux machines we have running:
az graph query -q "where type =~ 'Microsoft.Compute/virtualMachines' | summarize count() by tostring(properties.storageProfile.osDisk.osType)" --output table
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Compute/virtualMachines' | summarize count() by tostring(properties.storageProfile.osDisk.osType)" | Format-Table
OS Count_
------- --------
Linux 3
Windows 4
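If we are only interested in VMs running one OS, we can add a filter on the OS type to the SKU query above. This example restricts it to Linux VMs; swap in 'windows' to see Windows VMs instead.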
az graph query -q "where type =~ 'Microsoft.Compute/virtualMachines' | where tostring(properties.storageProfile.osDisk.osType) =~ 'linux' | project SKU = tostring(properties.hardwareProfile.vmSize)| summarize count() by SKU" --output table
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Compute/virtualMachines' | where tostring(properties.storageProfile.osDisk.osType) =~ 'linux' | project SKU = tostring(properties.hardwareProfile.vmSize)| summarize count() by SKU" | Format-Table
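If you run this check for both operating systems, it can help to generate the query text rather than editing it by hand. This is a small sketch under that assumption; `sku_count_query` is a hypothetical helper that reuses the property paths from the queries above.

```python
VM_TYPE = "Microsoft.Compute/virtualMachines"


def sku_count_query(os_type=None):
    """Return the VM SKU count query, optionally limited to one OS type."""
    parts = [f"where type =~ '{VM_TYPE}'"]
    if os_type is not None:
        # Guard against typos, since an unknown value would silently match nothing.
        if os_type.lower() not in ("windows", "linux"):
            raise ValueError(f"unexpected OS type: {os_type!r}")
        parts.append(
            "where tostring(properties.storageProfile.osDisk.osType) "
            f"=~ '{os_type}'"
        )
    parts.append("project SKU = tostring(properties.hardwareProfile.vmSize)")
    parts.append("summarize count() by SKU")
    return " | ".join(parts)


windows_query = sku_count_query("windows")
print(windows_query)
```

The printed string can be passed to `az graph query -q` or `Search-AzureRmGraph -Query` exactly as in the commands above.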
Managed Disk
Another useful query is to get a list of machines that are not using managed disks. This allows us to see what impact enforcing a policy of only allowing managed disks will have, and perhaps warn the people still using unmanaged disks that it is coming.
az graph query -q "where type =~ 'Microsoft.Compute/virtualMachines' | where isnull(properties.storageProfile.osDisk.managedDisk) "
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Compute/virtualMachines' | where isnull(properties.storageProfile.osDisk.managedDisk) "
A similar query can be used to query other properties of VMs, such as whether they are using Managed Identity or have boot diagnostics enabled.
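As a rough sketch of what those similar queries might look like, the snippet below follows the same summarize pattern as the OS type query. The property paths (`properties.diagnosticsProfile.bootDiagnostics.enabled` and `identity.type`) are my assumptions based on the VM resource schema, so check them against the output of a plain VM query before relying on them.

```python
VM_FILTER = "where type =~ 'Microsoft.Compute/virtualMachines'"

# Assumed property paths; an empty group in the results means the property is not set.
vm_property_queries = {
    "boot_diagnostics": (
        f"{VM_FILTER} | summarize count() by "
        "tostring(properties.diagnosticsProfile.bootDiagnostics.enabled)"
    ),
    "managed_identity": (
        f"{VM_FILTER} | summarize count() by tostring(identity.type)"
    ),
}

# Print ready-to-run CLI commands for each check.
for name, query in vm_property_queries.items():
    print(f'{name}: az graph query -q "{query}" --output table')
```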
Account Type
Another common resource we can look at is storage accounts. First off, let's see how many of each storage account type we have (V1, Blob Only, V2). This allows us to assess the impact of forcing V2 storage.
az graph query -q "where type =~ 'Microsoft.Storage/storageAccounts' | summarize count() by kind" --output table
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Storage/storageAccounts' | summarize count() by kind" | Format-Table
As you can see, I have a mixture of all three:
Count_ Kind
-------- -----------
18 StorageV2
19 Storage
1 BlobStorage
We can also have a look at some of the security features of our resources and see if they are enabled, or whether we need a policy to enforce them. For example, do our storage accounts have encryption enabled? This query lists all storage accounts that do not have Storage Service Encryption enabled.
az graph query -q "where type =~ 'Microsoft.Storage/storageAccounts' | where aliases['Microsoft.Storage/storageAccounts/enableBlobEncryption'] =='false'| project name"
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Storage/storageAccounts' | where aliases['Microsoft.Storage/storageAccounts/enableBlobEncryption'] =='false'| project name"
Similarly, we can look to see if HTTPS-only traffic is enabled:
az graph query -q "where type =~ 'Microsoft.Storage/storageAccounts' | where aliases['Microsoft.Storage/storageAccounts/supportsHttpsTrafficOnly'] =='false'| project name"
Search-AzureRmGraph -Query "where type =~ 'Microsoft.Storage/storageAccounts' | where aliases['Microsoft.Storage/storageAccounts/supportsHttpsTrafficOnly'] =='false'| project name"
I could go on for much longer with more examples of queries that help you audit your subscriptions, but hopefully this gives you an idea of the power of using Resource Graph to get a good picture of the current state before you try and improve it through the governance tools available in Azure. Undertaking this sort of audit allows you to understand the impact your changes are going to have, and also sets a baseline to assess your improvements against. If you don't know what you started with, how will you know if it is getting better?
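One simple way to use such a baseline is to save the counts from a query and compare them with a later run. The sketch below does that comparison in Python. The baseline figures are the storage account counts shown earlier, the "current" figures are made up, and `diff_counts` is a hypothetical helper.

```python
def diff_counts(baseline, current):
    """Return {key: change} for every key whose count differs between snapshots."""
    changes = {}
    for key in sorted(set(baseline) | set(current)):
        # A key missing from one snapshot counts as zero in that snapshot.
        delta = current.get(key, 0) - baseline.get(key, 0)
        if delta != 0:
            changes[key] = delta
    return changes


# Baseline: storage account kinds from the audit. Current: invented later snapshot.
baseline = {"StorageV2": 18, "Storage": 19, "BlobStorage": 1}
current = {"StorageV2": 30, "Storage": 8}

changes = diff_counts(baseline, current)
for kind, delta in changes.items():
    print(f"{kind}: {delta:+d}")
```

A shrinking count for the older account kinds and a growing one for StorageV2 is the kind of trend you would hope to see after introducing a policy.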
If you have any useful queries you're using in Resource Graph, please do post them in the comments. I would love to start building a collection of these.
Published at DZone with permission of Sam Cogan, DZone MVB. See the original article here.