I started a new job last year as a Technical Consultant at Perficient, mainly focusing on Lync & Skype for Business deployments. As a result, I’ve been blogging more over there than here, which means that anything about Lync, Skype, or Office 365 will probably be posted there in the future. However, I did some PowerShell stuff recently that I figured I could write about, so here it is!

Last week I picked up the latest book from Don Jones and Jeff Hicks titled “The PowerShell Scripting and Toolmaking Book” (get it here at LeanPub). It is a great book for people who have moved past the basics of writing scripts and want to move into making advanced functions. I haven’t worked through all the examples yet, but I can see where many of the topics will be incorporated into my future scripts. I also like the LeanPub model of buying a book once and never paying for new editions in the future. That is always the challenge with technical books: content changes and requires updating, which requires a new edition, which requires a new purchase for the consumer. I have no problem paying a premium for the book if I get new versions for free.

Anyway, moving on. I like efficiency. I will go through a script over and over trying to optimize it (in my mind) as much as possible: reducing duplicate code, creating more functions, etc. I always wonder if I’m choosing the correct logic or using the best commands to make my script run as quickly as possible. In the “Scripting at Scale” & “Measuring Tool Performance” chapters, I unfortunately learned that I have probably been hindering some of my scripts’ performance.

I am horribly indecisive when trying to name things, be it a script, variable, new cat, you name it. I usually end up changing my mind and have to find/replace my way through a script to change a variable name. When it comes to iterating through an array, list, or file contents, I always liked to use the “ForEach-Object” command because I could refer to the current item as “$_”. That’s right, I liked it because I didn’t have to name the variable, as I would with the “foreach ($item in $items)” statement. In the aforementioned chapters, the authors explain that it is probably better to use “foreach” and give a few examples of why it performs better when writing scripts for large-scale use. I wondered how much better it could be, so I had to find out.
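For anyone who hasn’t compared the two forms side by side, here is a minimal sketch. The sample array and variable names are made up purely for illustration; the point is only that both forms produce the same output while handling the current item differently.

```powershell
# Hypothetical sample data, used only for illustration.
$items = 'alpha', 'beta', 'gamma'

# ForEach-Object cmdlet: items stream through the pipeline one at a time,
# and the current item is available as $_ (no variable to name).
$viaCmdlet = $items | ForEach-Object { $_.ToUpper() }

# foreach statement: iterates a collection that is already in memory,
# and the current item gets an explicit name ($item here).
$viaStatement = foreach ($item in $items) { $item.ToUpper() }

# Both variables now hold the same three upper-cased strings.
$viaCmdlet
$viaStatement
```

The trade-off, as I understand it, is that the cmdlet pays pipeline overhead for every item, while the statement needs the whole collection loaded before it starts.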

I wrote a script to gather all the files under the C: drive and output the file names in three different ways:

  • ForEach-Object
  • foreach
  • pipeline (Select-Object)

Here is the code (note: “SilentlyContinue” is only used to suppress errors when trying to access system folders I did not have permission to read):

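# Collect every file and folder once so each test iterates the same in-memory array
$files = Get-ChildItem C:\ -Recurse -ErrorAction SilentlyContinue

# Run the comparison 5 times
for ($i = 1; $i -lt 6; $i++)
{
    # Test 1: ForEach-Object cmdlet, current item referenced as $_
    $results1 = Measure-Command -Expression {
        $files | ForEach-Object {
            $_.Name
        }
    }

    # Test 2: foreach statement with a named loop variable
    $results2 = Measure-Command -Expression {
        foreach ($file in $files)
        {
            $file.Name
        }
    }

    # Test 3: pipe the collection straight to Select-Object
    $results3 = Measure-Command -Expression {
        $files | Select-Object Name
    }

    # One result row per iteration, times in seconds
    $output = [PSCustomObject][ordered]@{
        Iteration = "Test $i"
        ForEachObject = $results1.TotalSeconds
        foreach = $results2.TotalSeconds
        pipeline = $results3.TotalSeconds
    }

    Write-Output $output
}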
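One thing worth keeping in mind when reading the third test: “Select-Object Name” returns objects that have a Name property, while the first two tests emit plain strings. The sketch below uses made-up stand-in objects (not real files) to show the difference, and how “-ExpandProperty” would give strings instead. It is only an illustration, not a change to the test above.

```powershell
# Hypothetical stand-ins for file objects, for illustration only.
$sampleFiles = @(
    [PSCustomObject]@{ Name = 'a.txt'; Length = 10 }
    [PSCustomObject]@{ Name = 'b.txt'; Length = 20 }
)

# Select-Object Name keeps a wrapper object with a single Name property.
$asObjects = $sampleFiles | Select-Object Name

# -ExpandProperty unwraps the property and returns the values themselves.
$asStrings = $sampleFiles | Select-Object -ExpandProperty Name

# Check what each form actually produced.
$asObjects[0] -is [string]   # False: still an object with a Name property
$asStrings[0] -is [string]   # True: a plain string
```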

The tests run 5 times. The system I ran this on had approximately 330,000 files. Here are the results (times are in seconds):

To my great surprise, “foreach” was significantly faster than the other options. I now feel bad for all my past scripts where I have forced them to run sub-optimally. If you want to do more thorough testing, Jeff Hicks has a module published in the PowerShell Gallery titled Test-Expression for running multiple tests. Happy coding!
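If you just want a quick average without installing anything, a small wrapper around Measure-Command is enough. This is a rough sketch of my own: the function name “Measure-Average” and its parameters are hypothetical, and it is not how the Test-Expression module works.

```powershell
# Hypothetical helper (not part of any module): runs a script block several
# times with Measure-Command and summarizes the timings in milliseconds.
function Measure-Average {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $ScriptBlock,

        [ValidateRange(1, 1000)]
        [int] $Count = 5
    )

    # Collect one timing per run; Measure-Command discards the block's output.
    $times = for ($i = 0; $i -lt $Count; $i++) {
        (Measure-Command -Expression $ScriptBlock).TotalMilliseconds
    }

    $stats = $times | Measure-Object -Average -Minimum -Maximum

    [PSCustomObject]@{
        Runs      = $Count
        AverageMs = [math]::Round($stats.Average, 2)
        MinimumMs = [math]::Round($stats.Minimum, 2)
        MaximumMs = [math]::Round($stats.Maximum, 2)
    }
}

# Example usage with made-up data: compare the two loop styles over 3 runs each.
$numbers = 1..100000
Measure-Average -ScriptBlock { foreach ($n in $numbers) { $n * 2 } } -Count 3
Measure-Average -ScriptBlock { $numbers | ForEach-Object { $_ * 2 } } -Count 3
```

Looking at the minimum and maximum alongside the average helps spot a run that was skewed by something else happening on the machine.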
