Introduction to PowerShell Regex - Regular Expression
Typical jobs for Regex are to match patterns in text, and to replace
individual characters or even whole words. It's often when numbers mix
with text that confusion occurs, and that's when you need a PowerShell
script to solve the problem. For example, telephone numbers or bank
sort codes can be tricky to process because they contain dashes, or a
specific grouping of numbers.
Keep in mind that you would rarely use regex in
isolation; therefore, my examples are designed to help you master this one technique so
that you can incorporate pattern recognition into a bigger script.
It all started back in the days when DOS was king. How we loved
typing the wildcard * if we wanted to display all files. What PowerShell's
regex does is refine 'all' so that you can filter a subset of data
into the output. It's defining the subset that makes regex so potent,
yet so difficult to control unless you are an expert in its logic and its
syntax.
The problem for beginners is that regex has a bewildering array of wacky
syntactic symbols.
As a newbie you may find that other people's examples do not make sense; furthermore,
when you experiment you realize that one wrong character is enough to stop the command
producing the desired results. My mission is to give you a grounding in
the basic structure of regex; from there it will be over to you to employ regex to solve your particular problem.
Start with PowerShell Regex -Match
Before investigating Regex, it helps if you gain experience by testing
comparison operators such as -Match, -Like or -Contains. Please note
that my examples contain variables; while they aren't strictly necessary,
$Variables help me to identify different sections of the expression.
# Example of PowerShell $Matches
$Name = "Alan Thomas 1949"
$Name -Match "Alan"
$Matches # Built-In variable
Expected PowerShell result: True
Note 1: In passing, PowerShell creates the $Matches variable
automatically after a successful -Match; I use it
for troubleshooting unexpected results.
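To get a feel for how the three comparison operators differ, here is a minimal sketch. The $Names array is a hypothetical addition of mine, and the values are only there to show the behaviour.

```powershell
# $Name comes from the example above; $Names is a hypothetical collection.
$Name  = "Alan Thomas 1949"
$Names = "Alan", "Thomas"

$Name -Match "Thom"         # True:  a regex can match anywhere in the string
$Name -Like "Thom"          # False: a wildcard pattern must cover the whole string
$Name -Like "*Thom*"        # True:  the * wildcards allow text on either side
$Names -Contains "Thomas"   # True:  the collection holds this exact item
$Name -Contains "Thomas"    # False: -Contains compares whole items, not substrings

# After a successful -Match, $Matches[0] holds the text that matched.
if ($Name -Match "\d+") { $Matches[0] }   # 1949
```

The point to take away is that only -Match speaks regex; -Like uses wildcards, and -Contains is for collections rather than for text inside a string.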
[Regex]::IsMatch() can be considered the same as -cMatch (case
sensitive match). When developers work with regular expressions they often prefer [Regex]::IsMatch();
professionals say that this method is nearer the underlying .NET class System.Text.RegularExpressions.Regex.
However, Guy favours -Match or -cMatch as they are shorter. Bear in mind that -Match ignores case
by default, so it is -cMatch that produces exactly the same results as IsMatch().
$Name = "Alan Thomas 1949"
[Regex]::IsMatch($Name,"Alan")
The expected result is True
In the above example [Regex] calls the method IsMatch(). Then
it's up to us to supply two values separated by a comma: the input
string ($Name) and the pattern to match ("Alan"). While I have chosen
to control the input
via a $Variable, you could simplify the expression thus:
[Regex]::IsMatch("Alan Thomas 1949","Alan")
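Case sensitivity is where the operator and the method part company. The following sketch uses a lower case pattern, "alan", which is my own variation on the example; RegexOptions.IgnoreCase is a standard .NET option.

```powershell
$Name = "Alan Thomas 1949"

$Name -Match  "alan"              # True:  -Match ignores case by default
$Name -cMatch "alan"              # False: -cMatch is case sensitive
[Regex]::IsMatch($Name, "alan")   # False: IsMatch() is case sensitive by default

# Passing the IgnoreCase option makes IsMatch() behave like -Match.
$Options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
[Regex]::IsMatch($Name, "alan", $Options)   # True
```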
Quotes, or "speech marks", play a key role with
Regex. For a simple pattern such as this one, it does not
matter whether you use single or double quotes; the only difference between
the example above and the example below is the type of quotes.
However, double quotes
come into their own when the PowerShell expression contains a $variable that needs
expanding; single quotes would treat the $variable as a literal.
$Name = 'Alan Thomas 1949'
[Regex]::IsMatch($Name,'Alan')
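Here is a small sketch of where the type of quote does make a difference. The $First variable is hypothetical; I introduce it purely to hold part of the pattern.

```powershell
$Name  = 'Alan Thomas 1949'
$First = 'Alan'   # hypothetical variable holding part of the pattern

# Double quotes: $First expands, so the pattern becomes "Alan Thomas".
[Regex]::IsMatch($Name, "$First Thomas")   # True

# Single quotes: the pattern is the literal text '$First Thomas'.
# Regex reads $ as the end-of-string anchor, so this cannot match.
[Regex]::IsMatch($Name, '$First Thomas')   # False
```

As a rule of thumb, single quotes are the safer choice when the pattern itself contains a $, and double quotes are for when you want PowerShell to build the pattern from variables.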
The style of brackets is always significant in PowerShell. IsMatch()
requires round brackets (parentheses), whereas a character set within the
search string, such as [a-z], needs a pair of square brackets. If you see
pattern matching code with {curly} brackets, they usually denote
quantifiers. Once again, follow the correct bracket syntax, or else you will
get unexpected results.
Let us suppose we wanted to test the data for either the name Alan or Alun.
Here is a simple pattern matching example where the third letter can be either
'a' or 'u'. Another method would be to employ the period '.', which matches
any single character; however, that would require "Al.n",
and not "Al[.]n", because inside square brackets the period is just a literal period.
Another use of square brackets is to check for a digit, for example [0-9].
In this case the dash tells regex to expect a contiguous range of digits from zero to nine.
$Name = "Alan Thomas 1949"
[Regex]::IsMatch($Name,"Al[au]n")
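To see how the square brackets and the period differ, here is a short sketch. The names "Alen" and "Al.n" are test values of my own and are not part of the original example.

```powershell
[Regex]::IsMatch("Alun", "Al[au]n")   # True:  u is in the set [au]
[Regex]::IsMatch("Alen", "Al[au]n")   # False: e is not in the set
[Regex]::IsMatch("Alen", "Al.n")      # True:  the period accepts any character
[Regex]::IsMatch("Alen", "Al[.]n")    # False: [.] accepts only a literal period
[Regex]::IsMatch("Al.n", "Al[.]n")    # True

# A range inside square brackets: does the text contain a digit?
[Regex]::IsMatch("Alan Thomas 1949", "[0-9]")   # True
[Regex]::IsMatch("Alan Thomas", "[0-9]")        # False
```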
Use of +
At first, I could not see the point of incorporating the + symbol in regex
expressions, but then I had a particular problem: some people spell their
name Allan. How could I cope with this double ll? The answer was
to insert a plus into the pattern, thus:
$Name = "Allan Thomas 1949"
[Regex]::IsMatch($Name,"Al+[au]n")
In summary, I now have a pattern that finds Alun, Alan and
Allan. Without the + it's particularly difficult to find Allan as it
contains 5 letters whereas Alan and Alun only have 4 letters. The + means
one or more of the preceding character (whereas * means zero or more). To see the power, and point, of this
symbol, try removing the plus.
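The difference between +, * and no quantifier at all can be sketched as follows. The value "Aan" is a hypothetical misspelling that I have added to show what * lets through.

```powershell
$Names = "Alan", "Allan", "Aan"   # "Aan" is a hypothetical misspelling

foreach ($Name in $Names) {
    $Plus = [Regex]::IsMatch($Name, "Al+[au]n")   # one or more l
    $Star = [Regex]::IsMatch($Name, "Al*[au]n")   # zero or more l
    $None = [Regex]::IsMatch($Name, "Al[au]n")    # exactly one l
    "{0,-6} plus={1,-6} star={2,-6} none={3}" -f $Name, $Plus, $Star, $None
}
```

Alan passes all three tests. Allan fails only the pattern without a quantifier, while Aan passes only the * pattern, which is a reminder that * can be more generous than you intended.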
Backslash has several roles in regex. Firstly, it introduces
special constructs such as the anchor \b (word boundary). Secondly, it
introduces shorthand character classes; for example, \s means whitespace
and not the letter 's'. Thirdly, it acts as an
escape character that turns a special symbol into a literal, for example the period '.' in
the IP example below. Here is an example which employs \d (digit) to match the basic format
of an IP address; however, it does not reject numbers bigger than 255.
# Example of PowerShell Regex Match
$ipaddress = "192.168.10.10"
$ipaddress -Match "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
\d tests for a digit (as opposed to a letter), while the curly
brackets {1,3} mean 1, 2, or 3 of those digits.
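If you do need to reject out-of-range numbers, one possible approach is to spell out the legal values of each octet. This is only a sketch; the $Octet, $Strict and $Loose variables are names of my own, and the stricter pattern is one of several ways to tackle the problem.

```powershell
# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
$Octet = '(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)'

# Hypothetical stricter pattern: four octets, anchored to the whole string.
$Strict = '^(' + $Octet + '\.){3}' + $Octet + '$'

# The basic pattern from the example above.
$Loose = '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'

"192.168.10.10"  -Match $Strict   # True
"999.168.10.10"  -Match $Loose    # True:  the basic pattern accepts 999
"999.168.10.10"  -Match $Strict   # False
"192.168.10.256" -Match $Strict   # False: 256 is out of range
```

The single quotes matter here: they stop PowerShell from trying to expand the $ anchor, and they keep the backslashes exactly as typed.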
Matching the Beginning and End of Strings - Anchors
In many countries Thomas could be a first name or a surname. If we wanted
to search for only a last name of "Thomas" then we would append the $ anchor, thus
we need Thomas$. Naturally, we have to assume that the surname
would be at the very end of the data input.
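A brief sketch of the two anchors in action follows; "Thomas Jones" is a sample name that I have invented for contrast. I use single quotes so that PowerShell leaves the $ alone.

```powershell
"Alan Thomas"      -Match 'Thomas$'   # True:  Thomas is at the end
"Thomas Jones"     -Match 'Thomas$'   # False: Thomas is at the start
"Thomas Jones"     -Match '^Thomas'   # True:  ^ anchors to the beginning
"Alan Thomas"      -Match '^Thomas'   # False
"Alan Thomas 1949" -Match 'Thomas$'   # False: the year follows the surname
```

The last line shows why the assumption about the surname being at the very end of the input matters.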
# Example of PowerShell Regex Match
$strText = "The man called Mr Grey wore the big red coat."
$Pattern = "the"
$matched = [regex]::matches($strText, $pattern)
"Result of using the match method, we get the following:"
$matched | format-table index, length, value -auto
Note: This example will only find one instance of 'the', because [regex]::Matches is case sensitive. To make the
search case insensitive, try introducing (?i) before 'the', thus: $Pattern = "(?i)the". As a result you should find two
instances of 'the'.
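Here is a quick way to compare the two searches side by side. The variable names $CaseSensitive and $CaseInsensitive are my own.

```powershell
$strText = "The man called Mr Grey wore the big red coat."

$CaseSensitive   = [regex]::Matches($strText, "the")
$CaseInsensitive = [regex]::Matches($strText, "(?i)the")

$CaseSensitive.Count     # 1
$CaseInsensitive.Count   # 2

# Show where each match was found.
$CaseInsensitive | Format-Table Index, Length, Value -AutoSize
```

The table should list 'The' at index 0 and 'the' at index 28.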
Alternative Match Techniques
The purpose is to check that you have a single block of text with no spaces.
# Example of PowerShell Regex Match
$Block = [regex]"^[A-Za-z0-9]*$"
$Block.match("GuyThomas") |
    format-table Value, success, Length -auto
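The block [A-Za-z0-9] caters for upper case [A-Z] and lower case [a-z] letters,
and for digits [0-9]. The anchors ^ and $ force the pattern to cover the entire
string, so a space anywhere causes the match to fail.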
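To check that the pattern really does reject spaces, try a few inputs of your own. The sketch below uses test strings that I made up, plus a hypothetical $NonEmpty variant in case you do not want an empty string to pass.

```powershell
$Block = [regex]'^[A-Za-z0-9]*$'

$Block.IsMatch("GuyThomas")     # True
$Block.IsMatch("Guy Thomas")    # False: the space is not in the set
$Block.IsMatch("Guy_Thomas1")   # False: neither is the underscore
$Block.IsMatch("")              # True:  * allows zero characters

# Hypothetical variant: swap * for + to insist on at least one character.
$NonEmpty = [regex]'^[A-Za-z0-9]+$'
$NonEmpty.IsMatch("")           # False
$NonEmpty.IsMatch("Guy2024")    # True
```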
# Example of PowerShell Regex Replace
$strText = "The man wearing the gray overcoat"
$Pattern = "gray" # [regex]::replace is case sensitive, so use the same case as $strText
$New = "grey"
$strReplace = [regex]::replace($strText, $pattern, $New)
"We will now replace $Pattern with $New :"
$strReplace
The key command here is replace, as in [Regex]::replace. Observe
how replace has three arguments: the input text, the pattern to search for
and finally, the replacement text.
Notice in passing that because the message line employs double quotes, PowerShell
expands the variables $Pattern and $New.
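PowerShell also has a -replace operator, which is shorter to type and, like -Match, ignores case by default. The sketch below is my own illustration; the "Thomas, Alan" string is a made-up value to show capture groups.

```powershell
$strText = "The man wearing the gray overcoat"

$strText -replace  "gray", "grey"   # The man wearing the grey overcoat
$strText -replace  "GRAY", "grey"   # Same result: -replace ignores case
$strText -creplace "GRAY", "grey"   # Unchanged: -creplace is case sensitive

# $1 and $2 refer to the two bracketed groups in the pattern.
# Single quotes stop PowerShell from treating them as variables.
"Thomas, Alan" -replace '(\w+), (\w+)', '$2 $1'   # Alan Thomas
```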
Review of PowerShell's Regular Expressions
Regular expressions are different from DOS's wildcards. Even where
they contain the same symbols as command-line syntax, PowerShell's [Regex]
doesn't mean the same thing. They’re just similar enough to cause confusion.
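The * symbol is a good case in point. Here is a minimal sketch with -Like (wildcards) and -Match (regex); the name "Aaron" is a test value of my own.

```powershell
"Alan"  -Like  "Al*"     # True:  wildcard * means any run of characters
"Aaron" -Like  "Al*"     # False: the name must begin with Al

"Alan"  -Match "Al*"     # True
"Aaron" -Match "Al*"     # True:  regex * means zero or more of the letter l

# A regex that behaves like the wildcard pattern Al*
"Aaron" -Match "^Al.*"   # False
"Alan"  -Match "^Al.*"   # True
```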
Regular expressions are in a netherworld; while they aren't easy, neither
are they the hardest part of PowerShell. With a little effort you can
master [Regex].
Think of regular expressions as a programming language of their
own. One analogy I find useful is that of a foreign language, such as
a French quotation inside an English text. Thus don’t expect the rules
from PowerShell to have any relation to [Regex], any more than you’d expect
English grammar to apply inside that French quote.
Regex is bigger and better than the old DOS * wildcard. The
only problem is that the increased ability to control regular expressions
brings greater complexity for the beginner. As ever, my advice is to
start slowly, choose a simple example and then build on success. The
key to mastering regex is to understand the syntax.