Getting Started With vCheck
If you use a vSphere infrastructure, then vCheck can help. The vCheck framework is designed to run checks against your infrastructure to determine operational issues.
If you use vSphere, and particularly vCenter, you’re probably at least passingly familiar with PowerCLI, a package of snap-ins and modules for PowerShell. This is my preferred language for interacting with the vSphere/vCenter APIs since it has (in my opinion) the best documentation of the available languages and API SDKs. If you haven’t used it, I recommend downloading it and playing with it; it can really help you automate many of your repetitive tasks with less Flash and less right-clicking.
One of the most popular tools built with PowerCLI is vCheck. It’s a framework for running a number of checks against your vSphere infrastructure and determining what operational issues are present, something every Ops team needs. It won’t replace a monitoring system such as vROps or even Nagios, but it augments such systems very well. For example, it can report on VMs that have ISOs attached, or where snapshots have been present for more than 7 days: issues that probably aren’t worth paging anyone out for, but need to be dealt with eventually. Many of us have built some homegrown solutions for this, maybe even with PowerCLI, but it is difficult to beat a tool that is designed to meet the needs of a large percentage of vSphere users, is actively developed by VMware employees, and can be extended with instance-specific checks. You can always run your tools and vCheck together, too.
Let’s take a look at vCheck and how to get started with it today. We’ll download it, configure it, schedule it as a daily task, review how to enable and disable checks, and store your configuration in version control. This provides a solid base that you can tweak until it fits your specific needs just right.
Download and Configuration
The GitHub home page for vCheck includes a very lengthy README. I suggest you take a few moments to skim that, then come back here and we can dive into it in detail. I also have my own fork of vCheck, which we’ll reference a little later. We can clone the repo using git, if you’re familiar with that, or download a ZIP file (under Clone or Download, Download ZIP). Either way, get a copy onto a Windows machine that has PowerCLI installed. Our workstation should suffice for now, though eventually we’ll want it on a server so it can run as a scheduled task. I’ve downloaded it to C:\PowerShell\vCheck.
Open up a PowerCLI window, or a PowerShell (ISE) window with a profile that loads the PowerCLI snap-ins and modules, and cd to where we downloaded vCheck. Let’s start by configuring vCheck with the command .\vCheck.ps1 -Config. It’s going to ask us a LOT of questions. Don’t worry! The really important ones are up front: whether and where to email the report, and the vCenter hostname it will run against. For now, we can just hit Enter all the way through to the end so the configuration completes.
When configuration is complete, vCheck will start running all its checks (currently 109), beginning with a request for vCenter credentials.
The report can take a while to run if we have a sizeable setup. Let it run the first time and a browser window with our report should pop up, along with an email explaining all the ways our infrastructure is messed up, even if it isn’t. Don’t worry: we haven’t done much configuration yet, we just wanted to make sure it would run first! If it takes too long (it took a few minutes at home but over an hour at work), just hit Ctrl-C and it will finish the current task and end, though we won’t get a report.
Once the initial configuration is done, we can call vCheck anytime with .\vCheck.ps1, without any flags at the end, and it will use the current settings.
Now we want to actually configure it. First, identify which plugins (checks) we do not need at all. For instance, if SRM or VSAN are not in use, disable all of those plugins. Unfortunately, the section headers in the report may not match the name of the corresponding plugin. That’s okay, vCheck includes tools for that as well. The file vCheckUtils.ps1 contains the functions Add-vCheckPlugin, Get-vCheckPlugin, and Remove-vCheckPlugin. Run . .\vCheckUtils.ps1 (mind the leading period and space) to import those functions into our session. Next, run Get-vCheckPlugin -Installed to see what plugins exist. Then Remove-vCheckPlugin “<name>” disables the named plugin.
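Because report section headers and plugin names don’t always line up, a keyword search over the plugin names can narrow things down. Here is a minimal sketch of that idea. It filters a hard-coded sample of names (borrowed from the list later in this article) rather than live output, and Find-PluginName is a hypothetical helper, not part of vCheck. Against a real install, the same filter could presumably be applied to the output of Get-vCheckPlugin -Installed, assuming its objects expose a name property, which I have not verified here.

```powershell
# Sample plugin names standing in for the output of Get-vCheckPlugin -Installed.
$SamplePluginNames = @(
    'No VM Tools',
    'Site Recovery Manager - RPO Violation Report',
    'VSAN Datastore Capacity',
    'VSAN Configuration Maximum VMs Per Host Report'
)

# Hypothetical helper (not part of vCheck): return every name containing the keyword.
function Find-PluginName {
    param(
        [Parameter(Mandatory)] [string[]] $Name,
        [Parameter(Mandatory)] [string]   $Keyword
    )
    # -like is case-insensitive by default; * matches any run of characters.
    $Name | Where-Object { $_ -like "*$Keyword*" }
}

# Returns the two VSAN entries from the sample list.
$VsanPlugins = Find-PluginName -Name $SamplePluginNames -Keyword 'vsan'
$VsanPlugins
```

The matching names can then be copied into the disabled-plugins list or passed to Remove-vCheckPlugin one at a time.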
After we identify the plugins we do not need, we’ll want to keep track of that list and update it in the future, especially if we want to run vCheck with the same settings on another node, or have to rebuild the node where it runs. To do so, we’ll create a small script that holds an array of checks to disable and, for each one, determines whether it is still enabled, then either disables it or tells us it is already disabled. Here’s what the script looks like, keeping in mind that we want to customize the array of names. We should add the script to our version control; then, when we need to disable another plugin, it’s as easy as adding another plugin name to the list and running the script.
# Move to the vCheck directory and load the plugin helper functions.
Set-Location -Path 'C:\PowerShell\vCheck'
. .\vCheckUtils.ps1

# Plugins we never want to run; customize this list.
$DisabledPlugins = @(
    "vCenter Sessions Age",
    "ESXi hosts which do not have Lockdown mode enabled",
    "VMKernel Warnings",
    "Lost Access to Volume",
    "Checking VM Hardware Version",
    "No VM Tools",
    "Virtual machines with incorrect OS configuration",
    "Mis-named virtual machines",
    "Hardware CPU/MMU virtualization configuration",
    "Site Recovery Manager - RPO Violation Report",
    "Single Storage VMs",
    "VSAN Datastore Capacity",
    "VSAN Configuration Maximum Disk Group Per Host Report",
    "VSAN Configuration Maximum Magnetic Disks Per Disk Group Report",
    "VSAN Configuration Maximum Total Magnetic Disks In All Disk Groups Per Host Report",
    "VSAN Configuration Maximum Components Per Host Report",
    "VSAN Configuration Maximum Hosts Per VSAN Cluster Report",
    "VSAN Configuration Maximum VMs Per Host Report",
    "VSAN Configuration Maximum VMs Per VSAN Cluster Report"
)

# Remove each plugin only if it is still installed, so the script is safe to re-run.
$DisabledPlugins | ForEach-Object -Process {
    $Plugin = $_
    if (Get-vCheckPlugin -Installed -Name $Plugin) {
        Remove-vCheckPlugin -Name $Plugin
        Write-Output "Removed plugin '$Plugin'."
    }
    else {
        Write-Output "Plugin '$Plugin' was not present anyway."
    }
}
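If the list of names changes often, one option is to keep it in a plain text file next to the script, so that version control diffs show exactly which plugin was added or removed. The following is only a sketch of that variation, not something vCheck provides: the helper Get-DisabledPluginList and the file name DisabledPlugins.txt are both hypothetical, and the demo writes to a temporary file so it can run anywhere.

```powershell
# Hypothetical helper (not part of vCheck): read plugin names from a text file,
# one per line, skipping blank lines and lines that start with '#'.
function Get-DisabledPluginList {
    param([Parameter(Mandatory)] [string] $Path)
    Get-Content -Path $Path |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ -and -not $_.StartsWith('#') }
}

# Demo with a temporary file; in practice the file would live in the repo.
$ListFile = Join-Path ([System.IO.Path]::GetTempPath()) 'DisabledPlugins.txt'
Set-Content -Path $ListFile -Value @(
    '# Plugins we do not need',
    'No VM Tools',
    '',
    'VSAN Datastore Capacity'
)

# @() guarantees an array even when the file holds a single name.
$DisabledPlugins = @(Get-DisabledPluginList -Path $ListFile)
$DisabledPlugins
```

The resulting $DisabledPlugins array can feed the same ForEach-Object loop used in the script above.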
Next, we need to configure the remaining plugins. Run .\vCheck.ps1 -Config again. We only have to answer questions where we want to make a change; hitting Enter accepts the current value. Each plugin has its own specifics and we will need to do some investigation at this step. The config questions are usually helpful with this (such as how many VMs per datastore or vCPUs per VM are allowed) and we can open and review the plugin source code if in doubt. One thing that is common across many plugins is a question about exclusions. In my report, my cluster received some warnings under HA configuration issues. If we pretend that I had multiple clusters, I might have reason for one of those clusters to not have HA configured and want HomeLab to be excluded from this plugin. During the config, look for this section and question:
WARNING:
HA configuration issues
# HA Configuration Issues, do not report on any Clusters that are defined here
# [Example_Cluster_*|Test_Cluster_*]:
The exclusion values are typically applied with the PowerShell -match operator, so they are regular expressions (as in plugin 10, HA Configuration Issues). The default value matches any cluster whose name contains “Example_Cluster” or “Test_Cluster”; because this is a regular expression and not a wildcard, the trailing “_*” means zero or more underscores, not “any other characters”. In my case, I could use “Lab”, “HomeLab”, or even “meLa”, as those would all match the cluster name. I’ll settle for “Lab” in this case.
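To see how such a pattern behaves before committing it to the config, it can be tried directly at the prompt. This is a small sketch with made-up cluster names (only HomeLab comes from my setup), and it assumes the plugin filters with -notmatch or something equivalent; check the plugin source if the exact behavior matters.

```powershell
# The default exclusion pattern from the HA configuration question.
$Pattern = 'Example_Cluster_*|Test_Cluster_*'

# Sample cluster names; all are invented for illustration except HomeLab.
$Clusters = 'Example_Cluster_01', 'Test_Cluster', 'Production', 'HomeLab'

# -notmatch keeps only the names the pattern does NOT match,
# i.e. the clusters that would still be reported on.
$Reported = $Clusters | Where-Object { $_ -notmatch $Pattern }
$Reported                   # Production, HomeLab

# An unanchored pattern matches anywhere in the name, case-insensitively.
'HomeLab' -match 'Lab'      # True
'HomeLab' -match 'meLa'     # True
'HomeLab' -match '^Lab'     # False: ^ anchors the match to the start of the name
```

Note that “Test_Cluster” is excluded even though nothing follows it, which is the zero-or-more behavior of “_*” described above.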
Answer all the configuration questions and vCheck will run once more, with our streamlined list of plugins and updated settings. Keep tweaking the settings until the report only alerts us to valid issues, but isn’t hiding anything either. This probably won’t happen in a single day, so just come back to the disabled plugin list and config settings as needed.
Bonus Configuration: Credentials
I mentioned earlier that I have a fork of vCheck. The current instructions for vCheck include steps to add our credentials so that we can run this as a non-interactive, scheduled task, under the heading Adjusting connection information. This seems a little clunky to me and there’s no agreed-upon right way to do this. To address this, I created an automation branch and submitted a PR with my solution: store the credentials in an .xml file using SecureStrings. If we download this branch, then when we run vCheck with the -Config option, it will do two things differently. 1) It will not run vCheck after configuration. We either run vCheck or configure it, not both. 2) It asks us for a credentials storage location, then at the end asks for credentials, which are then stored securely in that location. I, of course, am a fan, but as the PR has not been reviewed yet, it may not be the best solution. It is, however, automated!
Note: SecureStrings are stored per-principal, per-machine, which is part of what makes them secure. The “Run As” elevation of Windows does NOT affect this! That is, if my user, rnelson0, opens PowerShell with Run As Administrator, the administrator user can NOT read the SecureStrings that rnelson0 created; only the user rnelson0 can! You will need to actually log into the machine as the user who will retrieve the credentials. This is important when we schedule the task, as that user will need to read the credentials. This applies to any use of SecureStrings, even if you do not use my automation branch.
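For readers who have not stored credentials this way before, here is a minimal sketch of the general technique using Export-Clixml and Import-Clixml. It is not the code from my branch, and the file name, user name, and password are all placeholders. On Windows, the password inside the exported file is encrypted for the current user on the current machine, which is why the note above matters.

```powershell
# Hypothetical location; in practice pick a path only the scheduled-task user can read.
$CredentialPath = Join-Path ([System.IO.Path]::GetTempPath()) 'vCheckCredential.xml'

# Interactive use would call Get-Credential instead. A throwaway credential is built
# here only so the sketch runs unattended; never hard-code a real password like this.
$Password   = ConvertTo-SecureString -String 'NotARealPassword' -AsPlainText -Force
$Credential = New-Object System.Management.Automation.PSCredential ('vcheck-demo', $Password)

# Serialize the credential to XML. On Windows the password is protected with
# the current user's key on this machine.
$Credential | Export-Clixml -Path $CredentialPath

# Later, in the unattended run (same user, same machine), read it back.
$Restored = Import-Clixml -Path $CredentialPath
$Restored.UserName
```

The restored object is a regular PSCredential, so it can be passed to anything that takes a -Credential parameter, such as Connect-VIServer.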
If you don’t use my branch, follow the vCheck instructions and otherwise ensure that vCheck can run with stored credentials without requiring human interaction before continuing.
Scheduling a Task
A reporting tool isn’t very useful if someone has to remember to run it. We humans are kinda bad at that: we forget, we go on vacation, our machines break, etc. We want this report to run on a regular basis, probably daily or weekly. There are a number of ways to add this to the Windows Task Scheduler, but we want to automate everything, right? Good news! We can do that with the *-ScheduledJob cmdlets in PowerShell. Here’s a repeatable script to do this (it’s in the dev branch of vCheck now and will show up in vCheckUtils.ps1 soon!). Load this function in an elevated (Administrator) PowerShell instance and then run Schedule-vCheck:
function Schedule-vCheck {
    # Prompt for the job name, the daily run time, and the full path to vCheck.ps1.
    $vCheckJobName = Read-Host -Prompt "Enter the name of the vCheck job to create"
    $TriggerTime = Read-Host -Prompt "Enter the time $vCheckJobName should run at, in the format 'H:MM AM/PM' (e.g. '2:00 AM')"
    $Location = Read-Host -Prompt "Enter the fully qualified location where the vCheck script resides"
    # Quote the path and invoke it with & so that locations containing spaces still run.
    $sb = [scriptblock]::Create("& '$Location'")
    $dailyTrigger = New-JobTrigger -Daily -At $TriggerTime
    $option = New-ScheduledJobOption -StartIfOnBattery -StartIfIdle
    Register-ScheduledJob -Name $vCheckJobName -Trigger $dailyTrigger -ScheduledJobOption $option `
        -ScriptBlock $sb
}
Hooray, vCheck will now run once a day and email us a report when it’s done! We can use the *-ScheduledJob cmdlets to modify the job or, now that it exists, use Task Scheduler to modify it, under Task Scheduler Library, Microsoft, Windows, PowerShell, ScheduledJobs. Output from the jobs, including errors, will be found in %USERPROFILE%\AppData\Local\Microsoft\Windows\PowerShell\ScheduledJobs\<Job Name>. It is, unfortunately, in XML; if we find the task is not completing successfully and have to review the logs, it can be a little painful.
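Reading the XML by hand is not the only option. Results of scheduled jobs can also be pulled back with the regular job cmdlets once the PSScheduledJob module is loaded. The function below is a rough sketch of that approach: Get-vCheckJobResult is a hypothetical helper name, and it assumes Windows PowerShell (where PSScheduledJob ships) and a session running as the same user who registered the job.

```powershell
# Hypothetical helper (not part of vCheck): show the most recent runs of a scheduled job.
function Get-vCheckJobResult {
    param(
        [Parameter(Mandatory)] [string] $JobName,
        [int] $Last = 3
    )
    # Scheduled job instances only show up in Get-Job once this module is loaded.
    Import-Module -Name PSScheduledJob -ErrorAction Stop

    Get-Job -Name $JobName |
        Sort-Object -Property PSBeginTime -Descending |
        Select-Object -First $Last |
        ForEach-Object {
            [pscustomobject]@{
                Started = $_.PSBeginTime
                State   = $_.State
                # -Keep leaves the stored output in place so it can be read again later.
                Output  = Receive-Job -Job $_ -Keep
            }
        }
}

# Example call, with 'vCheck Daily' standing in for whatever job name was entered:
# Get-vCheckJobResult -JobName 'vCheck Daily'
```

Any errors the job recorded are written to the error stream when Receive-Job runs, so a failing vCheck run should be visible straight away.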
Save Our Settings
Finally, we need to save our settings. vCheckUtils.ps1 comes to the rescue again with the *-vCheckSettings functions. Export-vCheckSettings creates a CSV file with our settings and Import-vCheckSettings imports them. It’s important to note that the import will struggle if our CSV file doesn’t have settings for an enabled plugin, so we want to keep this file with our disabled plugins script and run that script first. Also note that I ran these functions in a regular PowerShell console, not PowerShell ISE, because they use some color tags that aren’t present in ISE. Hopefully this will be fixed soon.
If we’ve configured vCheck on our workstation, we can use this to deploy vCheck on a server and synchronize settings between the two. Make sure the settings and disabled checks are maintained in version control, to track changes and avoid losing highly customized settings.
We now have a functioning and slightly tuned vCheck report that runs daily. Take the time to get the settings right and then make sure the entire Ops team has access to the reports. vCheck can help to quickly identify significant problems or regressions in the infrastructure, as well as some capacity issues that other systems may not pick up. Enjoy!
Published at DZone with permission of Rob Nelson, DZone MVB. See the original article here.