PowerShell quickie: space used per database by table objects

Automation can take many forms

Sometimes, the most useful piece of automation you can write is the one that gives you the most useful info, quickly. It’s true that with PowerShell, you do have a powerful SQL Server automation framework at your fingertips. That doesn’t have to mean that your automation solutions need to be clever or complex. They just need to do something useful and that saves you time. One such task can be something as “mundane” as collecting information about your databases and instances.

Here’s a quick scenario: you recently received a disk space alert from your monitoring software, and one of your data drives is running low on free space. That might not be surprising; we’re always storing more and more data. However, what you’re unsure of is exactly where that space is being used. You might fire up management studio and take a look in object explorer and inspect your space used column when you look at your list of databases. Maybe one looks larger than it ought to be? So you move on and keep checking tables in other databases… but maybe you need to add columns to object explorer.

Maybe Management Studio freezes for a few seconds. Then you keep flipping back and forth between databases, trying to aggregate information. Maybe you get frustrated with all the clicking and start to write a script that outputs usage information from some system DMVs. After that, you decide you want to compare all the databases, so you write a cursor. Now you feel like you need a shower. If only there were a better way.

When some people hear the word automation, they feel like it’s reserved for some mythical, “devops-y”, spin-up-configure-tear-down scenario. The best kind automation, though, is the kind that eliminates busy work, not just automates a complex task. Why shouldn’t getting instance-level information be any different? Not all automation needs to “do” something; it can just return information that can be otherwise hard to aggregate.

The objects of my desire

Of course, PowerShell excels at this. By using the SQL Server module, it’s really easy to:

  • Connect to an instance and collect every user database, and
  • From each database, collect every table, and
  • For each table, collect row counts and space used, and
  • If there are any indexes, group them, and sum their usage and report that as well

Here’s the script. Note that I have the server name hard-coded in there as localhost (more on that in a coming paragraph). Go ahead and take a look before we break it down.

Import-Module sqlserver

$Databases = Get-ChildItem -Path SQLSERVER:\SQL\LOCALHOST\DEFAULT\Databases

$Output = @()

ForEach ($d in $Databases) {

    ForEach ($t in $d.tables) {
        $Object = [PSCustomObject] @{
            ServerName = 'LOCALHOST'
            DatabaseName = $d.Name
            DatabaseSize = $d.Size
            TableSchema = $t.Schema
            TableName = $t.Name
            RowCount = $t.RowCount
            DataSpaceUsed = $t.DataSpaceUsed
            IndexSize = ($t.Indexes.SpaceUsed | Measure-Object -Sum).Sum
        }

        $Output += $Object
    }
}

The first thing we need to do is make sure we import our SQL Server PowerShell module. That’s right, this script requires it. If you haven’t downloaded it yet, what are you waiting for?

Once the module is imported, we’re going to return an array of objects that represent the databases on our instance. We do that with a good ol’ Get-ChildItem  cmdlet here, and use the SQLSERVER:\ path that the loading of the module created for us. We’re going to store that collection of objects in the $Databases  variable. We’re also creating an empty array object called $Output  by setting the variable’s initial value to @() .

Next, we’re going to start a For-Each  loop to iterate over each database object. A database object returned from the provider path has many properties. Some are simple integer or strings, and some are other collections of objects. Just like in Management Studio how tables are “children” of a database in the object explorer tree view, a database object in PowerShell has table objects returned as well. We can access these inside our ForEach loop with the .Tables  property. That’s what we’re doing when we open up another ForEach inside of there. We’re going to loop over each database, and each table in each database.

Next, we’re going to declare a hash table to hold our results. We’re declaring a variable named $Object  that is essentially a user-created object type to hold our results. We’re going to store the server name that we’re working on (again, hard coded on purpose… for now), the name of the database we’re working, the size of the database, and then some information about the current table we’re looking at. All of these are object properties, which is why we access them with things like .Name  and .Schema  and .Size .

The last thing we collect is information about indexes. Since a table can have multiple indexes, we need to “roll-up” the space used by each index into one value. To do that, we’re using an expression with pipes and then collecting a property. Let’s take a close look at this line:

IndexSize = ($t.Indexes.SpaceUsed | Measure-Object -Sum).Sum

We’re going to work this expression from the inside out. Since there are parenthesis around part of this statement, we’re going to start there. First, we select the current table, indicated by $t  from our ForEach loop, and select the .Indexes  property. This returns a collection of indexes for the table. For this task, we only want the total size of all our indexes. So we call .SpaceUsed  on that property too. When you’re working with objects, you can keep calling properties all the way out as far as you can.

If we stopped there, we’d be left with a collection of index sizes. To roll these up, we’re going to pipe the collection of index sizes to the Measure-Object  cmdlet. We want a sum of their sizes, so we’ll use -Sum  parameter to tell PowerShell we want a sum of what we passed it. Since we’re doing all of this in a set of parenthesis, we can call .Sum  on the entire expression to get the value returned by Measure-Object .

And finally, before we start closing our loops, we take the object we just created and tack it onto our array we created at the start. When the script finishes running, we can get a report by typing $Output  in our PowerShell window to see a quick report on the space used. Pretty efficient!

And now… a challenge for you

So there you go. Feel free to take that script and try it out (in non-production first, of course) and see what you get. This script is a decent start and gets us the results we wanted, but it still could use some work. Wouldn’t it be great if:

  • You didn’t have to hard code the server name into the script, and
  • You could standardize the “size” values returned (one is in KB and one is in MB… why not make them the same?), and
  • You could export your results to a file

I could write that part into this script, but… I’d like you to try it.

I have plenty of other examples in other blog posts you can read up on to add parameter support to a script like this to turn it into a function, and exporting results (since we already have them in an object collection) is handy too. Think you can do it? If so, leave your results in the comments below.