Regular Expressions for Passphrases

As a follow-up to our Regular Expressions for password complexity blog post, we’ll look at some of the more interesting things you can do with regular expressions. These are typically (but not necessarily) used in organizations that use passphrases and therefore don’t leverage the full set of rules you might use for shorter passwords.

More Building Blocks

For the examples that follow, we’ll need some additional building blocks not covered in part 1 of this post:

Look-Ahead

Look-aheads are operators that check what comes next in the string without consuming any characters.  There is a positive look-ahead, which succeeds only if the string “abc” appears immediately to the right of the current position in the string:

(?=abc)

There is also a negative look-ahead, which succeeds only if the string “abc” does not appear immediately to the right of the current position:

(?!abc)

Note that in both examples the look-ahead is enclosed in parentheses.  In other regex engines you’ll also see similar look-behind operators that work the same way but check to the left of the current position; these are not available in the regex engine used by Specops Password Policy.
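To see how the two look-aheads behave, here is a minimal sketch using Python’s built-in re module. Python is only a stand-in here and is not the regex engine used by Specops Password Policy; the helper name lookahead_demo and the leading “x” in the patterns are made up for illustration.

```python
import re

# Positive look-ahead: "x" matches only when "abc" comes right after it.
positive = re.compile(r"x(?=abc)")
# Negative look-ahead: "x" matches only when "abc" does NOT come right after it.
negative = re.compile(r"x(?!abc)")


def lookahead_demo(text):
    """Hypothetical helper: report whether each pattern finds a match in text."""
    return (positive.search(text) is not None,
            negative.search(text) is not None)


print(lookahead_demo("xabc"))  # "x" is followed by "abc"
print(lookahead_demo("xabd"))  # "x" is followed by something else

# The look-ahead only checks; the matched text is just the "x".
print(positive.search("xabc").group())
```

The last line shows why look-aheads are useful as building blocks: the check happens without moving the current position, so the rest of the regex still gets to match the same characters.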

Repetition

The repetition operators (also called backreferences) are used to match again something that was already matched earlier in the regex.  As a simple example, let’s just match on a single character using a wildcard:

.

Not terribly interesting on its own.  But what if I wanted to identify where the character that the wildcard found was used more than once in a row:

(.)\1

By enclosing the wildcard in parentheses and then using \1 to reference whatever the parentheses just matched, I can now match on any two consecutive identical characters.
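As a quick illustration of the (.)\1 pattern, the sketch below uses Python’s re module, which is an assumption on our part and not the Specops engine. The helper name first_repeat is hypothetical.

```python
import re

# (.) captures any single character; \1 requires the same character again.
repeated_pair = re.compile(r"(.)\1")


def first_repeat(text):
    """Hypothetical helper: return the first doubled character pair, or None."""
    match = repeated_pair.search(text)
    return match.group() if match else None


print(first_repeat("letter"))  # the doubled "t" is found
print(first_repeat("abcabc"))  # no character is repeated back to back
```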

Blocking Dictionary Words using Regular Expressions

One of the more common customer requests we see is the ability to exclude certain words or strings from passphrases.  The Specops passphrase rules by design do not include a full dictionary engine, but many organizations still wish to exclude a few key words such as the company name or, as we’ll show here, the word “password”:

^(?!.*password).*$

^          Begin
(?!        Negative look-ahead
.*         Anything (or nothing)
password   Matches the string ‘password’
)          Closing the (?! from earlier
.*         Again, anything or nothing
$          End

Pay careful attention to how and where we placed the .* wildcards.  The .* is known as a “greedy” operator, which means it’ll match as many characters as it can and only give some back if the rest of the pattern needs them.  In this regex we use the .* operator twice.  The first is inside the look-ahead and accounts for any characters that come before the “password” string, so the look-ahead finds the word wherever it appears and not just at the start.  Because a look-ahead only checks and does not consume any characters, the second .* then starts again from the beginning and matches everything through to the end of the string $.  If we leave out the first .*, the regex only blocks passphrases that begin with “password”; if we leave out the second, there is nothing between ^ and $ to match the passphrase itself, causing unexpected results.

The next step in building our regex is to make this case insensitive, because we want to block password and Password and PASSWORD and so on.  We don’t support the /i operator you might see in some regex engines; however, we can accomplish the same effect with some creative use of character sets:

[pP][aA][sS][sS][wW][oO][rR][dD]

Once we’ve built this, we can also throw in some common leet-speak substitutions, because we really don’t want P@ssw0rd or Pa$$word in our passphrases either:

[pP][aA@][sS$][sS$][wW][oO0][rR][dD]

So now that we’ve built this block, it’s just a matter of plugging it into the case-sensitive regex we built earlier:

^(?!.*[pP][aA@][sS$][sS$][wW][oO0][rR][dD]).*$
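One way to sanity-check the finished regex before deploying it is to run it against a few sample passphrases. The sketch below does this with Python’s re module, which is an assumption for testing purposes and not the engine Specops Password Policy uses; the helper name is_allowed and the sample passphrases are hypothetical.

```python
import re

# The regex from the article: reject any passphrase containing "password"
# in any mix of case, including the listed leet-speak substitutions.
BLOCK_PASSWORD = re.compile(
    r"^(?!.*[pP][aA@][sS$][sS$][wW][oO0][rR][dD]).*$"
)


def is_allowed(passphrase):
    """Hypothetical helper: True if the passphrase passes the regex."""
    return BLOCK_PASSWORD.match(passphrase) is not None


samples = [
    "correct horse battery staple",  # no blocked word
    "my P@ssw0rd is long",           # leet-speak variant in the middle
    "Pa$$word123",                   # leet-speak variant at the start
    "my pa55word is long",           # "5" is not in the character sets
]
for sample in samples:
    print(sample, "->", "allowed" if is_allowed(sample) else "blocked")
```

Note that only the substitutions listed in the character sets are caught. A variant such as “pa55word” still passes, so each set needs to be extended with any further substitutions you want to block.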

Blocking Consecutive Identical Characters

Here we recreate the ‘block identical consecutive characters’ rule from the password rules tab so that it applies to passphrases.  Again the basic pieces are the same as the dictionary regex, but instead of a word (or the series of char-sets) we use the \1 operator to identify repeating characters.  With \1 used twice, as below, the regex blocks three or more identical characters in a row; use a single \1 to block two:

^(?!.*(.)\1\1).*$
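The sketch below checks which passphrases this pattern accepts, next to a variant with a single \1. It uses Python’s re module as a stand-in for testing, which is an assumption and not the Specops engine; the helper names are hypothetical.

```python
import re

# The regex from the article: (.)\1\1 is one character plus two copies of it.
NO_TRIPLES = re.compile(r"^(?!.*(.)\1\1).*$")
# Variant with a single \1: one character plus one copy of it.
NO_DOUBLES = re.compile(r"^(?!.*(.)\1).*$")


def allows_with_triple_rule(passphrase):
    """Hypothetical helper: True if NO_TRIPLES accepts the passphrase."""
    return NO_TRIPLES.match(passphrase) is not None


def allows_with_double_rule(passphrase):
    """Hypothetical helper: True if NO_DOUBLES accepts the passphrase."""
    return NO_DOUBLES.match(passphrase) is not None


for sample in ["bookkeeper", "baaad idea", "plain words"]:
    print(sample,
          "| triple rule:", allows_with_triple_rule(sample),
          "| double rule:", allows_with_double_rule(sample))
```

The stricter single-\1 variant rejects ordinary words with double letters such as “bookkeeper”, which is worth keeping in mind for passphrases made of dictionary words.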

Passphrase Formatting

A customer wanted to use regex to mandate that passphrases be a series of 3 words, each at least 6 characters long, separated by spaces.

First, let’s find a way to match each 6+ character word:

\w{6,}

Remember from part 1, \w represents a word character (lowercase, uppercase, digits, or underscore), and the {6,} says “6 or more of those.”  We could’ve used [a-zA-Z] instead of \w, but the customer didn’t want to be too restrictive or block a passphrase if a user thought of a reason to include a number.  Plus, the \w looks cleaner.

So now, we need a 6+ character word followed by at least one space:

\w{6,}\s+

With \s representing any whitespace and the + saying “at least one of those.”  So now, just stack up our words and spaces until we get the word-space-word-space-word pattern we’re after:

^\w{6,}\s+\w{6,}\s+\w{6,}$

This will only allow letters, digits, and underscores within the three words. If you also want to allow special characters in your passphrase, all you need to do is swap the \w for \S, which matches any non-whitespace character, i.e.

^\S{6,}\s+\S{6,}\s+\S{6,}$
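To compare the two formatting regexes side by side, here is a small sketch in Python’s re module. Python is an assumption made for testing and is not the Specops engine; the re.ASCII flag is added so that \w covers only the letters, digits, and underscore described above, and the helper name check_format is hypothetical.

```python
import re

# Three words of 6+ word characters, separated by whitespace.
WORDS_ONLY = re.compile(r"^\w{6,}\s+\w{6,}\s+\w{6,}$", re.ASCII)
# Same shape, but each word may contain any non-whitespace character.
ANY_NONSPACE = re.compile(r"^\S{6,}\s+\S{6,}\s+\S{6,}$", re.ASCII)


def check_format(passphrase):
    """Hypothetical helper: return (passes \\w version, passes \\S version)."""
    return (WORDS_ONLY.match(passphrase) is not None,
            ANY_NONSPACE.match(passphrase) is not None)


samples = [
    "purple monkey dishwasher",   # three long words
    "purple! monkey dishwasher",  # special character in the first word
    "purple monkey dish",         # last word is too short
    "purple monkey",              # only two words
]
for sample in samples:
    print(sample, "->", check_format(sample))
```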

(Last updated on December 6, 2018)

Written by

Darren Siegel

Darren Siegel is a cyber security expert at Specops Software. He works as a lead IT engineer, helping organizations solve complex challenges within IT security. Darren has more than 15 years’ experience within Active Directory, IT security, servers, storage, virtualization, cloud, and identity and access management.