PowerShell stands out as a versatile tool in scripting and automation, offering a wide array of functionalities to manage systems and process data. One PowerShell feature is the ability to use Regular Expressions (RegEx), a method for searching, matching, and manipulating strings based on specific patterns.
RegEx can be daunting at first glance, but once mastered, it opens up a world of possibilities for text parsing, data validation, and string manipulation. This article explores the fundamentals of RegEx and provides practical examples of how it can be used effectively within PowerShell to enhance your scripts and streamline your workflows.
What is RegEx?
RegEx (Regular Expressions) defines a search pattern using specific character sequences. You can use RegEx to search, replace, or split strings. It is a powerful tool for matching complex patterns within strings, making it indispensable in scripting, data validation, and text parsing.
RegEx uses literal characters and special characters (known as metacharacters). Literal characters match themselves, while metacharacters are used for more complex matching operations. The following is a list of common metacharacters:
- . (period) – matches any single character except a newline.
- ^ (caret) – asserts the position at the start of a line.
- $ (dollar sign) – asserts the position at the end of a line.
- * (asterisk) – matches 0 or more occurrences of the preceding element.
- + (plus sign) – matches 1 or more occurrences of the preceding element.
- ? (question mark) – matches 0 or 1 occurrence of the preceding element, making it optional.
- [] (square brackets) – matches any one character within the brackets.
- | (pipe) – acts as a logical OR between patterns.
- () (parentheses) – groups patterns together.
Here are a few examples of using metacharacters in a regex pattern. The patterns below begin with a forward slash (/) as a delimiter, meaning the slash marks the beginning of the RegEx pattern but is not part of the pattern.
- The pattern /c.t matches cat, cot, and c4t because the period can be any character.
- The pattern /.txt$ matches myfile.txt because the string ends with “.txt”, but it does not match myfile.txt.old.
- The pattern /[bcr]at matches bat, cat, and rat because “b”, “c”, or “r” can precede the string “at”.
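The patterns above can be tried directly in PowerShell with the -match operator. The following is a minimal sketch: the slash delimiter is dropped because it is not part of the pattern, and the sample words hat, color, and colour are assumptions added for illustration.

```powershell
# The period matches any single character, so all three words match.
$dotResults = 'cat', 'cot', 'c4t' | ForEach-Object { $_ -match 'c.t' }

# $ anchors the pattern to the end of the string.
$endsWithTxt = 'myfile.txt' -match '.txt$'
$oldFile     = 'myfile.txt.old' -match '.txt$'

# Against an array, -match returns the elements that match.
$words   = @('bat', 'cat', 'rat', 'hat')
$bcrHits = $words -match '[bcr]at'

# ? makes the preceding element optional (hypothetical sample words).
$american = 'color'  -match '^colou?r$'
$british  = 'colour' -match '^colou?r$'

$dotResults    # True, True, True
$endsWithTxt   # True
$oldFile       # False
$bcrHits       # bat, cat, rat
```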
RegEx can have complex matching patterns. To learn more about using RegEx and to test RegEx patterns, check out https://regex101.com.
Using RegEx with PowerShell
PowerShell contains several options for incorporating RegEx in your scripts or functions, like the Select-String cmdlet, the -match and -replace operators, and parameter validation. The following sections outline examples of using each of these in your scripts or functions.
Using Select-String
The cmdlet Select-String uses RegEx matching to find text patterns in strings and files. It is similar to grep in Unix or findstr.exe in Windows. Select-String is based on lines of text and finds the first match in each line, then displays the file name, line number, and all text in the line. You can also find multiple matches per line, display text before or after the match, or display True or False indicating whether a match was found.
Let’s look at an example of searching a file. Here is a log file with entries starting with “INFO”, “WARN”, or “ERROR”.
INFO: The app did something.
WARN: The app encountered a warning.
INFO: The app did something.
INFO: The app did something.
WARN: The app encountered a warning.
INFO: The app did something.
INFO: The app did something.
ERROR: The app encountered an error.
INFO: The app did something.
INFO: The app did something.
WARN: The app encountered a warning.
INFO: The app did something.
INFO: The app did something.
ERROR: The app encountered an error.
INFO: The app did something.
WARN: The app encountered a warning.
You want to find each line in the file that begins with “ERROR”. First, read the file contents using Get-Content, then use Select-String and the pattern ^ERROR. The caret symbol (^) means the pattern should only match the beginning of the line. PowerShell finds the matches and outputs them to the console.
Get-Content -Path logfile.txt | Select-String -Pattern "^ERROR"
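Select-String can also read a file on its own through its -Path parameter, and its -Quiet and -AllMatches switches cover the True/False and multiple-matches-per-line cases. The sketch below is illustrative only: it writes a shortened copy of the log to a temporary file whose name (regex-demo-logfile.txt) is an assumption, and the sentence used with -AllMatches is made up for the example.

```powershell
# Write a shortened sample log to a temporary file (hypothetical file name).
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'regex-demo-logfile.txt'
@(
    'INFO: The app did something.'
    'WARN: The app encountered a warning.'
    'ERROR: The app encountered an error.'
    'INFO: The app did something.'
) | Set-Content -Path $logPath

# Select-String reads the file itself; each result exposes LineNumber and Line.
$errors = Select-String -Path $logPath -Pattern '^ERROR'
$errors | ForEach-Object { '{0}: {1}' -f $_.LineNumber, $_.Line }

# -Quiet returns True when a match is found instead of the matching lines.
$hasWarnings = Select-String -Path $logPath -Pattern '^WARN' -Quiet

# -AllMatches collects every match on a line, not only the first one.
$appHits = 'The app called another app.' | Select-String -Pattern 'app' -AllMatches
$appHits.Matches.Count   # 2

# Clean up the temporary file.
Remove-Item -Path $logPath
```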
Using -match and $Matches
The -match operator finds patterns within strings. If PowerShell finds a match, it returns True; otherwise, it returns False. The matched text is stored in the automatic variable $Matches.
For example, you want to find email addresses that match a specific domain name. An example RegEx to match for contoso.com would be ^(.*)@(contoso.com)$:
- ^ asserts that the match should start at the beginning of a line.
- (.*) is the first capture group, which matches any character an unlimited number of times.
- @ is the literal @ sign character.
- (contoso.com)$ is the second capture group, which contains the domain string and must occur at the end of the line.
Strictly speaking, the unescaped period in contoso.com matches any character; write contoso\.com to match only a literal period.
Testing two different strings, one that matches and one that does not, outputs True and False, respectively.
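As a minimal sketch of that test, the two sample addresses below are borrowed from the array used later in this article; any matching and non-matching strings would do.

```powershell
$pattern = '^(.*)@(contoso.com)$'

# A single string on the left side makes -match return a Boolean.
$matching    = 'jeff@contoso.com'  -match $pattern
$notMatching = 'john@fabrikam.com' -match $pattern

$matching      # True
$notMatching   # False
```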
Whenever a -match operation is successful, PowerShell populates the $Matches automatic variable. The $Matches variable is a hashtable containing the results of the most recent successful match operation. Outputting $Matches to the console shows the entire matched string under key 0, followed by the text captured by each capture group. In this case, the username and the domain name are under keys 1 and 2, respectively.
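The following sketch reads the individual entries of $Matches after a successful match. The second half uses named capture groups, which .NET regular expressions support; the group names Username and Domain are assumptions chosen for illustration, not something the pattern above requires.

```powershell
# Numbered groups: key 0 is the whole match, 1 and 2 are the capture groups.
if ('jeff@contoso.com' -match '^(.*)@(contoso.com)$') {
    $full     = $Matches[0]   # jeff@contoso.com
    $username = $Matches[1]   # jeff
    $domain   = $Matches[2]   # contoso.com
}

# Named groups (hypothetical names) are stored under their names instead.
if ('angela@contoso.com' -match '^(?<Username>.*)@(?<Domain>contoso\.com)$') {
    $namedUser   = $Matches['Username']   # angela
    $namedDomain = $Matches['Domain']     # contoso.com
}
```

Named groups can make longer patterns easier to read because the code no longer depends on the position of each group.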
You can use the -match operator against an array of strings. In the example below, $emailAddresses contains various email addresses with different domains. Applied to an array, -match returns the elements that match the pattern instead of True or False, so the code below outputs only jeff@contoso.com and angela@contoso.com.
$emailAddresses = @(
    "jeff@contoso.com",
    "john@fabrikam.com",
    "phyllis@partsunlimited.com",
    "michael@fabrikam.com",
    "angela@contoso.com",
    "andy@partsunlimited.com"
)
$pattern = "^(.*)@(contoso.com)$"
$emailAddresses -match $pattern
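One caveat with this pattern is that the period in contoso.com is unescaped, so it matches any character. A stricter variant can be sketched with the .NET method [regex]::Escape, which escapes metacharacters in a literal string. The address bob@contosoXcom is a made-up value used only to show the difference.

```powershell
$domain = 'contoso.com'

# Original pattern: the period matches any character.
$loosePattern = '^(.*)@(contoso.com)$'

# Stricter pattern: [regex]::Escape turns the domain into contoso\.com.
$strictPattern = '^(.*)@(' + [regex]::Escape($domain) + ')$'

$looseResult  = 'bob@contosoXcom'  -match $loosePattern    # True
$strictResult = 'bob@contosoXcom'  -match $strictPattern   # False
$validResult  = 'jeff@contoso.com' -match $strictPattern   # True
```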
PowerShell overwrites the contents of $Matches each time a -match operation against a single string succeeds; matching against a whole array does not populate it. Here’s an example of using a foreach loop to test each item in the array and then output the different matching components.
foreach ($email in $emailAddresses) {
    if ($email -match $pattern) {
        $output = [PSCustomObject]@{
            "Email"    = $Matches[0]
            "Username" = $Matches[1]
            "Domain"   = $Matches[2]
        }
        $output
    }
}
Replace text with -replace
Use the -replace operator with a RegEx pattern to replace text in a string. For example, suppose you want to remove any non-digit characters from a phone number. The RegEx pattern \D matches any character that is not a digit (equivalent to the [^0-9] pattern for ASCII text). The empty string ("") specifies what to replace the matching text with, in this case, nothing.
$phoneNumber = "(555) 456-7890"
$phoneNumber -replace "\D", "" # Outputs 5554567890
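The -replace operator can also reuse captured text in the replacement string through $1, $2, and so on. As a small sketch going the other way, the example below formats a ten-digit string back into the original layout. The replacement string is in single quotes so that PowerShell does not try to expand $1 as a variable.

```powershell
$digits = "5554567890"

# Three capture groups: area code, prefix, and line number.
$formatted = $digits -replace '^(\d{3})(\d{3})(\d{4})$', '($1) $2-$3'

$formatted   # Outputs (555) 456-7890
```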
Parameter validation
You can validate a parameter value by using the ValidatePattern() attribute on a function or script parameter. Validating parameter values this way helps ensure that data is formatted correctly so that the function or script can execute successfully.
The example below shows how to validate that the $EmailAddress parameter contains a properly formatted email address.
function Send-Message {
    param(
        [Parameter(Mandatory)]
        [ValidatePattern("^[\w\.-]+@[\w\.-]+\.\w+$")]
        [string]
        $EmailAddress
    )
    "Sending message to $EmailAddress."
}
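To see the validation in action, the sketch below repeats the function so the block runs on its own and then calls it with one valid and one invalid value. The value not-an-email is a made-up example, and catching the failure with try/catch is just one way a caller might handle it.

```powershell
function Send-Message {
    param(
        [Parameter(Mandatory)]
        [ValidatePattern("^[\w\.-]+@[\w\.-]+\.\w+$")]
        [string]
        $EmailAddress
    )
    "Sending message to $EmailAddress."
}

# A value that satisfies the pattern is bound and the function body runs.
$ok = Send-Message -EmailAddress 'jeff@contoso.com'
$ok

# A value that fails the pattern raises an error before the body runs.
$rejected = $false
try {
    Send-Message -EmailAddress 'not-an-email'
}
catch {
    $rejected = $true
    "Validation failed: $($_.Exception.Message)"
}
```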
PowerShell RegEx Summary
RegEx is a powerful tool in PowerShell that can accomplish many different tasks. This article went over just a few of these capabilities. How do you use RegEx inside PowerShell? Leave a comment below!