To create a regular expression, you must use specific syntax—that is, special characters and construction rules. For example, the following is a simple regular expression that matches any 10-digit telephone number, in the pattern nnn-nnn-nnnn:
\d{3}-\d{3}-\d{4}
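
As a rough illustration, the expression above can be tried out in Python. This is only a sketch: Python's `re` module is used here as a stand-in and may differ in detail from the engine that evaluates your rules, and the helper name `find_phone_numbers` is hypothetical.

```python
import re

# Pattern from the text: three digits, a dash, three digits, a dash, four digits.
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

def find_phone_numbers(text):
    """Return every nnn-nnn-nnnn substring found in text."""
    return PHONE.findall(text)

print(find_phone_numbers("Call 555-123-4567 or 555-987-6543 today."))
```

Because the expression has no anchors, it finds the number anywhere in the text rather than requiring the whole text to be a phone number.
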
The following table describes some of the most common special characters for use in regular expressions, grouped by category:
| Characters | Description |
|---|---|
| Anchors | |
| ^ | (caret) Matches the start of the line or string of text that the regular expression is searching. For example, a content rule with a location Subject line and the following regular expression: ^abc captures any email message that has a subject line beginning with the letters abc |
| $ | (dollar) Matches the end of the line or string of text that the regular expression is searching. For example, a content rule with a location Subject line and the following regular expression: xyz$ captures any email message that has a subject line ending with the letters xyz |
| Metacharacters | |
| . | (dot) Matches any single character, except a new line. |
| \| | (pipe) Indicates alternation, that is, an "or." For example: cat\|dog matches the word cat or dog |
| \ | Indicates that the next character is a literal rather than a special character. For example: \. matches a literal period, rather than any character (dot character) |
| Character Classes | |
| [...] | Matches any single character from a set of characters. To specify a range, separate its first and last characters with a dash. For example: [123] matches the digit 1, 2, or 3; [a-f] matches any letter from a to f. Note: Regular expressions in Content Compliance policies are case sensitive. |
| [^...] | Matches any single character not in the set of characters. For example: [^a-f] matches any character that's not a letter from a to f. Note: Regular expressions in Content Compliance policies are case sensitive. |
| [:alnum:] | Matches alphanumeric characters (letters or digits): a-z, A-Z, or 0-9 Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:alnum:]]. |
| [:alpha:] | Matches alphabetic characters (letters): a-z or A-Z Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:alpha:]]. |
| [:digit:] | Matches digits: 0-9 Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:digit:]]. |
| [:graph:] | Matches visible characters only—that is, any characters except spaces, control characters, and so on. Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:graph:]]. |
| [:punct:] | Matches punctuation characters and symbols: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { \| } ~ Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:punct:]]. |
| [:print:] | Matches visible characters and spaces. Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:print:]]. |
| [:space:] | Matches all whitespace characters, including spaces, tabs, and line breaks. Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:space:]]. |
| [:word:] | Matches any word character—that is, any letter, digit, or underscore: a-z, A-Z, 0-9, or _ Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:word:]]. |
| Shorthand Character Classes | |
| \w | Matches any word character, that is, any letter, digit, or underscore: a-z, A-Z, 0-9, or _ Equivalent to [[:word:]] |
| \W | Matches any non-word character—that is, any character that's not a letter, digit, or underscore. Equivalent to [^[:word:]] |
| \s | Matches any whitespace character. For example, use this character to specify a space between words in a phrase: stock\stips matches the phrase stock tips. Equivalent to [[:space:]] |
| \S | Matches any character that's not a whitespace. Equivalent to [^[:space:]] |
| \d | Matches any digit from 0-9. Equivalent to [[:digit:]] |
| \D | Matches any character that's not a digit from 0-9. Equivalent to [^[:digit:]] |
| Group | |
| (...) | Groups parts of an expression. Use grouping to apply a quantifier to a group or to match a character class before or after a group. |
| Quantifiers | |
| {n} | Matches the preceding expression exactly n times. For example: [a-c]{2} matches exactly two letters from a to c in a row. Thus, the expression matches ab and bc as a whole, but not abc or aabbc as a whole. |
| {n,m} | Matches the preceding expression a minimum of n times and a maximum of m times. For example: [a-c]{2,4} matches letters from a to c that occur a minimum of 2 times and a maximum of 4 times in a row. Thus, the expression matches ab and abc as a whole, but not aabbc as a whole. |
| ? | Indicates that the preceding character or expression can match 0 or 1 times. Equivalent to the range {0,1}. For example, the following regular expression: colou?r matches either colour or color, because the ? makes the letter u optional. |
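
The examples in the table can be checked with a short script. The following is a minimal sketch, assuming Python's `re` module as a stand-in for the engine that evaluates your rules; the two may differ in detail. In particular, Python's `re` does not support bracketed classes such as [[:alnum:]], so the sketch uses ranges and shorthand classes only. The helper names `found` and `whole` and the `CASES` list are hypothetical and exist only for this illustration.

```python
import re

def found(pattern, text):
    """True if pattern matches anywhere in text (unanchored search)."""
    return re.search(pattern, text) is not None

def whole(pattern, text):
    """True only if pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# (category, check function, pattern, sample text), drawn from the table's examples.
CASES = [
    ("anchor", found, r"^abc", "abc in the subject"),
    ("anchor", found, r"xyz$", "subject ends with xyz"),
    ("alternation", found, r"cat|dog", "hot dog"),
    ("escape", found, r"\.", "v1.2"),
    ("class", found, r"[a-f]", "e"),
    ("negated class", found, r"[^a-f]", "z"),
    ("shorthand", found, r"stock\stips", "hot stock tips"),
    ("quantifier", whole, r"[a-c]{2}", "bc"),
    ("quantifier", whole, r"[a-c]{2,4}", "abc"),
    ("optional", whole, r"colou?r", "color"),
]

for category, check, pattern, text in CASES:
    print(f"{category:14} {pattern:14} {text!r:26} -> {check(pattern, text)}")
```

The split between `found` and `whole` matters for quantifiers: `whole(r"[a-c]{2}", "abc")` is false because the text is three letters long, while `found(r"[a-c]{2}", "abc")` is true because an unanchored search finds ab inside it. Add the ^ and $ anchors when a rule should apply only to the complete line or string.
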