Ezine 188 - PowerShell's Regex

Introduction to PowerShell Regex - Regular Expression

Typical jobs for regex are to find patterns in text, and to replace characters or even whole words. Confusion often occurs when numbers mix with text, and that's when you need a PowerShell script to solve the problem. For example, telephone numbers and bank sort codes can be tricky to process because they contain dashes, or a specific grouping of numbers.

Keep in mind that it's rare that you would use regex in isolation; therefore, my examples are designed to help you master this one technique so that you can incorporate pattern recognition in a bigger script.

Topics for PowerShell Regex - Regular Expression
Introducing Regex

It all started back in the days when DOS was king. How we loved typing the wildcard * if we wanted to display all files. What PowerShell's regex does is refine 'all' so that you can filter a sub-set of data into the output. It's defining the subset that makes regex so potent, yet so difficult to control unless you are an expert in its logic and its syntax.

The problem for beginners is that regex has a bewildering array of wacky syntactic symbols. As a newbie you may find that other people's examples do not make sense. Furthermore, you soon realize that if you experiment and get just one character wrong, the command will not produce the desired results. My mission is to give you a grounding in the basic structure of regex; from there it will be over to you to employ regex to solve your particular problem.

Start with -match

Before investigating regex, it helps if you gain experience by testing comparison operators such as -match, -like or -contains. Please note that my examples contain variables. While they aren't strictly necessary, $Variables help me to identify different sections of the expression.

$Name = "Alan Thomas 1949"
$Matches # Command for built-in variable

Expected PowerShell result:

Note in passing that PowerShell also creates a $Matches variable, which I use for troubleshooting unexpected results.

PowerShell Regular Expression Examples

Using Regex

[Regex]::IsMatch() can be considered the same as -cmatch (case sensitive match). When developers work with regular expressions they prefer to work with [Regex]::IsMatch(); professionals say that this method is nearer the underlying .NET class, System.Text.RegularExpressions.Regex. However, Guy favours -match or -cmatch as they are shorter and seem to produce exactly the same results. Bear in mind that plain -match ignores case, whereas -cmatch and IsMatch() do not.

$Name = "Alan Thomas 1949"
[Regex]::IsMatch($Name,"Alan")

The expected result is True.

In the above example [Regex] calls for the method IsMatch(). Then it's up to us to supply two values: the input string ($Name) followed by a comma, and the pattern to match ("Alan").
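As a rough side-by-side of the operator and the method described above, here is a minimal sketch. The lower-case pattern "alan" is my own addition, chosen only to show where case sensitivity makes the results differ; the variable names other than $Name are hypothetical.

```powershell
# Sample data from the article
$Name = "Alan Thomas 1949"

# -match ignores case by default and fills $Matches when it succeeds
$OperatorResult = $Name -match "alan"              # True
$Found = $Matches[0]                               # the text that matched: Alan

# -cmatch is the case-sensitive variant of the operator
$CaseResult = $Name -cmatch "alan"                 # False

# [Regex]::IsMatch(input, pattern) is case sensitive, like -cmatch
$MethodResult = [Regex]::IsMatch($Name, "Alan")    # True
$MethodLower  = [Regex]::IsMatch($Name, "alan")    # False

"-match: $OperatorResult, -cmatch: $CaseResult, IsMatch: $MethodResult"
```

$Found is captured straight after the successful -match because a later failed match leaves $Matches holding its old contents, which can mislead you when troubleshooting.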
While I have chosen to control the input via a $Variable, you could simplify the expression thus:

[Regex]::IsMatch("Alan Thomas 1949","Alan")

Basic Regex Punctuation

Quotes, or "speech marks", play a key role with regex. In PowerShell's regular expression constructions it does not matter if you use single or double quotes. The only difference between the example above and the example below is the type of quotes. However, double quotes come into their own when the PowerShell expression contains a $variable that needs expanding; single quotes would treat the $variable as a literal.

$Name = 'Alan Thomas 1949'

The style of brackets is always significant in PowerShell. IsMatch() requires round brackets (parentheses), whereas a character class in the search string, such as [a-z], needs a pair of square brackets. If you see pattern matching code with {curly} brackets, they usually refer to quantifiers. Once again, follow the correct bracket syntax, or else you will get unexpected results.

Let us suppose we wanted to test the data for either the name Alan or Alun. Here is a simple pattern matching example where the third letter can be either 'a' or 'u'. Another method would be to employ the period '.', which matches any single character; however, that would require "Al.n", and not "Al[.]n", because inside square brackets the period is just a literal full stop. Another use of square brackets is to check for a number, for example [0-9]. In this case we use the dash to tell PowerShell to expect a contiguous range of all digits from zero to nine.

$Name = "Alan Thomas 1949"

Use of +

At first I could not see the point of incorporating the + symbol in regex expressions, but then I had a particular problem: some people spell their name Allan. How could I cope with this double ll? The answer was to insert a plus into the pattern, thus:

$Name = "Allan Thomas 1949"

In summary, I now have a pattern that finds Alun, Alan and Allan. Without the + it's particularly difficult to find Allan as it contains 5 letters whereas Alan and Alun only have 4 letters. + means one or more of the preceding character (* means zero or more).
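The square bracket and plus techniques above can be sketched as follows. This is my reading of the text, not necessarily the exact code from the original page: the patterns "Al[au]n" and "Al+[au]n", the list of test names (including the deliberately wrong "Alen") and the variable names are all assumptions for illustration.

```powershell
# Hypothetical test data: three spellings we want, one we do not
$Names = "Alan Thomas 1949", "Alun Thomas 1949", "Allan Thomas 1949", "Alen Thomas 1949"

# [au] accepts 'a' or 'u' as the letter before 'n'; l+ accepts one or more 'l'
$WithPlus    = @($Names | Where-Object { $_ -match "Al+[au]n" })
$WithoutPlus = @($Names | Where-Object { $_ -match "Al[au]n" })

# With no $variable in the pattern, single and double quotes give the same answer
$SingleQuote = "Alan Thomas 1949" -match 'Al[au]n'
$DoubleQuote = "Alan Thomas 1949" -match "Al[au]n"

# [0-9] matches any one digit from zero to nine
$HasDigit = "Alan Thomas 1949" -match "[0-9]"

"With +    : $($WithPlus.Count) names"      # Alan, Alun and Allan
"Without + : $($WithoutPlus.Count) names"   # Alan and Alun only
```

Running the same list through both patterns makes the job of the plus visible: only the version with l+ copes with the double ll in Allan.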
To see the power, and the point, of this symbol, try removing the plus.

The backslash has several roles in regex. Firstly, it introduces special constructs such as the anchor \b (word boundary). Secondly, it acts as an escape character that turns a metacharacter into a literal, for example the period '.' in the IP example below. Thirdly, it gives an ordinary letter a special meaning, for example \s means whitespace and not the letter 's'. Here is an example which employs \d (decimal digit) to match the basic format of an IP address; however, it does not reject numbers bigger than 255.

$ipaddress = "192.168.10.10"

\d tests for a digit (as opposed to a letter), while the curly brackets {1,3} mean containing 1, 2, or 3 digits.
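Here is a minimal sketch of the IP address test described above. The exact pattern from the original page is not shown, so the one below is an assumption built from the pieces the text names: \d, {1,3} and an escaped period. The ^ and $ anchors, the extra test strings and the variable names other than $ipaddress are my own additions.

```powershell
$ipaddress = "192.168.10.10"

# \d{1,3} = one to three digits, \. = a literal period,
# ^ and $ (an addition here) force the whole string to fit the pattern
$Pattern = '^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$'

$LooksLikeIP = $ipaddress -match $Pattern          # True
$ThreeGroups = "192.168.10" -match $Pattern        # False, only three groups
$TooBig      = "999.168.10.10" -match $Pattern     # True, the format is fine even though 999 is not a valid octet

# A follow-up check on the numeric range, which the pattern alone cannot do
$OctetsInRange = @("999.168.10.10" -split '\.' | Where-Object { [int]$_ -gt 255 }).Count -eq 0   # False

"Format ok: $LooksLikeIP, 999 accepted by pattern: $TooBig, 999 in range: $OctetsInRange"
```

The pattern is in single quotes so that PowerShell does not try to treat the trailing $ as the start of a variable.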
Copyright © 1999-2012 Computer Performance LTD. All rights reserved.