Regular Expression

Anchors:
^            Start of string, or start of line in multi-line mode
\A           Start of string
$            End of string, or end of line in multi-line mode
\Z           End of string
\b           Word boundary
\B           Not word boundary
\<           Start of word
\>           End of word
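
As a rough illustration, the sketch below tries several of these anchors
with Python's re module. The sample text and the starts() helper are made up
for the example. Python's re has no \< or \>, so \b stands in for both, and
its \Z matches only at the very end of the string.

```python
import re

# Made-up sample text: two lines, with "cat" at various positions.
text = "cat concat\ncatalog"

def starts(pattern, flags=0):
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text, flags)]

line_starts_default = starts(r"^cat")                   # ^ = start of string only
line_starts_multiline = starts(r"^cat", re.MULTILINE)   # ^ = start of each line
string_start_only = starts(r"\Acat", re.MULTILINE)      # \A ignores multi-line mode
whole_words = starts(r"\bcat\b")                        # boundary on both sides
inside_words = starts(r"\Bcat")                         # no boundary before "cat"
ends_with_log = re.search(r"log\Z", text) is not None   # \Z = end of string

print(line_starts_default, line_starts_multiline, string_start_only)
print(whole_words, inside_words, ends_with_log)
```

Offsets are used instead of the matched text because every match here is the
same three letters; only the position shows which anchor applied.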

Character Classes:
\c           Control character
\s           White space
\S           Not white space
\d           Digit
\D           Not digit
\w           Word
\W           Not word
\x           Hexadecimal digit (Vim; most flavors use [0-9A-Fa-f])
\o           Octal digit (Vim; most flavors use [0-7])
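
A minimal sketch of the common shorthand classes, using Python's re module.
The sample string is invented. Python has no shorthand for hexadecimal or
octal digits, so the example assumes an explicit bracket class for those.

```python
import re

# Invented sample: words, digits, a tab and a double space.
sample = "Order 66:\t3 items  shipped"

digits = re.findall(r"\d+", sample)        # runs of digits
words = re.findall(r"\w+", sample)         # runs of word characters
chunks = re.findall(r"\S+", sample)        # runs of non-white-space
collapsed = re.sub(r"\s+", " ", sample)    # squeeze each white space run
only_digits = re.sub(r"\D", "", sample)    # delete every non-digit

# Explicit class standing in for a hexadecimal-digit shorthand.
hex_runs = re.findall(r"[0-9A-Fa-f]+", "ff 1A zz")

print(digits, words, chunks)
print(collapsed, only_digits, hex_runs)
```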

POSIX:
[:upper:]    Upper case letters
[:lower:]    Lower case letters
[:alpha:]    All letters
[:alnum:]    Digits and letters
[:digit:]    Digits
[:xdigit:]   Hexadecimal digits
[:punct:]    Punctuation
[:blank:]    Space and tab
[:space:]    All white space characters, including new lines
[:cntrl:]    Control characters
[:graph:]    Visible characters (excludes spaces)
[:print:]    Visible characters and spaces
[:word:]     Digits, letters and underscore
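
POSIX classes are only valid inside a bracket expression, e.g. [[:upper:]],
and not every engine supports them. Python's re module does not, so the
sketch below assumes a small, hypothetical lookup table (POSIX_ASCII) and
helper (expand_posix) that rewrite a few classes into ASCII-only ranges
before compiling. The ranges are approximations for ASCII text, not a
locale-aware implementation.

```python
import re

# Hypothetical ASCII-only approximations of some POSIX classes.
POSIX_ASCII = {
    "upper": "A-Z",
    "lower": "a-z",
    "alpha": "A-Za-z",
    "alnum": "A-Za-z0-9",
    "digit": "0-9",
    "xdigit": "0-9A-Fa-f",
    "blank": r" \t",
    "space": r" \t\n\r\f\v",
    "word": "A-Za-z0-9_",
}

def expand_posix(pattern):
    """Replace each [:name:] with its ASCII range (KeyError if unknown)."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_ASCII[m.group(1)], pattern)

# One upper case letter followed by lower case letters or digits.
pattern = expand_posix(r"[[:upper:]][[:lower:][:digit:]]+")
matches = re.findall(pattern, "Abc123 xyz Q9")

print(pattern, matches)
```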

Assertions:
?=          Lookahead assertion
?!          Negative lookahead
?<=         Lookbehind assertion
?<!         Negative lookbehind
?>          Once-only Subexpression
?()         Condition [if then]
?()|        Condition [if then else]
?#          Comment
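
Each assertion above is written inside parentheses, e.g. (?=...). The
following is a small sketch in Python's re module with invented sample
strings. Once-only subexpressions (?>...) are left out because Python only
accepts them from version 3.11 onwards.

```python
import re

# Lookahead: a name only if ".txt" follows it (the extension is not consumed).
txt_names = re.findall(r"\w+(?=\.txt)", "a.txt b.csv notes.txt")

# Negative lookahead: offsets of each "q" that is not followed by "u".
q_without_u = [m.start() for m in re.finditer(r"q(?!u)", "Iraq quit qatar")]

# Lookbehind: digits immediately preceded by "USD".
usd_amounts = re.findall(r"(?<=USD)\d+", "USD100 EUR200 300")

# Negative lookbehind: digit runs not preceded by a capital letter or digit.
bare_amounts = re.findall(r"(?<![A-Z\d])\d+", "USD100 EUR200 300")

# Condition [if then]: require ">" only if the optional "<" (group 1) matched.
tag = re.compile(r"(<)?\w+(?(1)>)")
tag_results = [tag.fullmatch(s) is not None for s in ("<b>", "b", "<b", "b>")]

# Comment: (?#...) is ignored by the engine.
commented = re.fullmatch(r"\d{4}(?# four-digit year)", "2006") is not None

print(txt_names, q_without_u, usd_amounts, bare_amounts)
print(tag_results, commented)
```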

Quantifiers:
*           0 or more
+           1 or more
?           0 or 1
{3}         Exactly 3
{3,}        3 or more
{3,5}       3, 4 or 5
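
A quick way to check the counting quantifiers is to test whole strings
against them. The full() helper below is made up for this sketch and wraps
Python's re.fullmatch.

```python
import re

def full(pattern, s):
    """True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

star = [full(r"ab*c", s) for s in ("ac", "abc", "abbbc")]        # b, 0 or more
plus = [full(r"ab+c", s) for s in ("ac", "abc", "abbbc")]        # b, 1 or more
exactly = [full(r"a{3}", s) for s in ("aa", "aaa", "aaaa")]      # exactly 3
at_least = [full(r"a{3,}", s) for s in ("aa", "aaa", "aaaaaa")]  # 3 or more
between = [full(r"a{3,5}", s) for s in ("aa", "aaaa", "aaaaaa")] # 3, 4 or 5

print(star, plus, exactly, at_least, between)
```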

Quantifier Modifiers:
"x" below represents a quantifier
x?        Ungreedy (lazy) version of "x", e.g. *? or +?
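
The difference is easiest to see on markup-like text. This sketch uses
Python's re module and an invented HTML fragment: the greedy form takes as
much as it can, the ungreedy form as little as it can.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy = re.findall(r"<.+>", html)    # one match, first "<" to last ">"
lazy = re.findall(r"<.+?>", html)     # one match per tag

greedy_count = re.match(r"a{2,4}", "aaaa").group()    # takes all four
lazy_count = re.match(r"a{2,4}?", "aaaa").group()     # stops at the minimum

print(greedy, lazy, greedy_count, lazy_count)
```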

Escape Character:
\         Escape Character

Metacharacters (must be escaped):
^     $      (        )       <
.     *      +        ?       [
{     \      |        >

Special Characters:
\n        New line
\r        Carriage return
\t        Tab
\v        Vertical tab
\f        Form feed
\xxx      Octal character xxx
\xhh      Hex character hh
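
A short sketch of these escapes in Python's re module, with invented input
strings. Python reads a backslash followed by three octal digits as an octal
character and \x followed by two hex digits as a hex character; other engines
may differ in the details.

```python
import re

# Split on any of the usual line endings (\r\n is tried first).
lines = re.split(r"\r\n|\r|\n", "one\r\ntwo\nthree\rfour")

# Split tab-separated fields.
fields = re.split(r"\t", "id\tname\temail")

# "A" is character 0x41 in hex and 101 in octal.
hex_a = re.fullmatch(r"\x41", "A") is not None
octal_a = re.fullmatch(r"\101", "A") is not None

# Vertical tab and form feed inside a bracket class.
odd_spaces = re.findall(r"[\v\f]", "a\vb\fc")

print(lines, fields, hex_a, octal_a, odd_spaces)
```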

Groups and Ranges:
.         Any character except new line (\n)
(a|b)     a or b
(...)     Group
(?:...)   Passive Group
[abc]     Range (a or b or c)
[^abc]    Not a or b or c
[a-m]     Letter between a and m
[A-M]     Upper case letter between A and M
[0-5]     Digit between 0 and 5
\n        nth group/subpattern (n is a number, e.g. \1)
Note: Ranges are inclusive.
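
The entries above can be tried out as follows. This is a sketch in Python's
re module with made-up sample strings. Note that re.findall returns the
captured group rather than the whole match when a pattern has exactly one
group.

```python
import re

any_char = re.findall(r".", "a\nb")                  # "." skips the new line
either = re.findall(r"gr(a|e)y", "gray grey groy")   # returns group 1 only
passive = re.findall(r"(?:ab)+(\d)", "abab1 ab2")    # (?:...) is not captured
in_set = re.findall(r"[abc]", "cabbage")
not_in_set = re.findall(r"[^abc]", "cabbage")
a_to_m = re.findall(r"[a-m]", "hello")               # "o" is outside a-m
zero_to_five = re.findall(r"[0-5]", "2468")

# \1 refers back to whatever group 1 matched: find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

print(any_char, either, passive, in_set, not_in_set)
print(a_to_m, zero_to_five, doubled)
```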

Pattern Modifiers:
g         Global match
i         Case-insensitive
m         Multi-line mode (^ and $ match at each line)
s         Treat string as single line (. also matches new line)
x         Allow comments and white space in pattern
e         Evaluate replacement
U         Ungreedy pattern
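
These letters follow Perl/PCRE conventions and map only partly onto other
engines. The sketch below shows rough Python equivalents, under the
assumption that re.I, re.M, re.S and re.X stand in for i, m, s and x. Python
has no g, e or U flag: findall and sub already act globally, a replacement
function plays the role of e, and U has no direct counterpart.

```python
import re

text = "Foo\nfoo\nBAR"

first_only = re.search(r"foo", text, re.I).group()      # no "g": first match
all_matches = re.findall(r"foo", text, re.I)            # "g" and "i"
whole_lines = re.findall(r"^\w+$", text, re.M)          # "m"
whole_lines_default = re.findall(r"^\w+$", text)        # without "m"
across_lines = re.search(r"Foo.foo", text, re.S) is not None   # "s"
across_default = re.search(r"Foo.foo", text) is not None       # without "s"

# "x": white space and # comments inside the pattern are ignored.
date = re.compile(r"""
    (\d{1,2}) /   # day
    (\d{1,2}) /   # month
    (\d{4})       # year
""", re.X)
date_parts = date.fullmatch("21/3/2006").groups()

# "e": a function computes each replacement from its match.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples, 10 pears")

print(first_only, all_matches, whole_lines, whole_lines_default)
print(across_lines, across_default, date_parts, doubled)
```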

String Replacement (Backreferences):
$n        nth non-passive group
$2        "xyz" in /^(abc(xyz))$/
$1        "xyz" in /^(?:abc)(xyz)$/
$`        Before matched string
$'        After matched string
$+        Last matched group
$&        Entire matched string
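
The $ notation above is the Perl-style form. As a sketch of the same ideas in
Python's re module, where a replacement string uses \n or \g<n> instead of
$n, the two group-numbering examples and a few replacements look like this.
The date and sample strings are invented.

```python
import re

# Groups are numbered by their opening parenthesis; passive groups are skipped.
nested = re.match(r"^(abc(xyz))$", "abcxyz")
passive = re.match(r"^(?:abc)(xyz)$", "abcxyz")
group_2 = nested.group(2)
group_1 = passive.group(1)

# $3-$2-$1 becomes \3-\2-\1.
iso_date = re.sub(r"(\d{1,2})/(\d{1,2})/(\d{4})", r"\3-\2-\1", "21/3/2006")

# $& (entire match) becomes \g<0>.
bracketed = re.sub(r"\d+", r"[\g<0>]", "a1b22")

# $` and $' have no replacement-string form here; slice the match instead.
m = re.search(r"\d+", "abc123def")
before, matched, after = m.string[:m.start()], m.group(), m.string[m.end():]

print(group_2, group_1, iso_date, bracketed, before, matched, after)
```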

Sample Patterns:
Pattern                                     Will Match
([A-Za-z0-9-]+)                             Letters, numbers and hyphens
(\d{1,2}\/\d{1,2}\/\d{4})                   Date (e.g. 21/3/2006)
([^\s]+(?=\.(jpg|gif|png))\.\2)             jpg, gif or png image
(^[1-9]{1}$|^[1-4]{1}[0-9]{1}$|^50$)        Any number from 1 to 50 inclusive
(#?([A-Fa-f0-9]){3}(([A-Fa-f0-9]){3})?)     Valid hexadecimal colour code
((?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,15})     String of 8 to 15 characters with at least one upper
                                            case letter, one lower case letter, and one digit
                                            (useful for passwords).
(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})            Email addresses
(\<(/?[^\>]+)\>)                            HTML Tags

Note: These patterns are intended for reference purposes and have not been extensively tested. 
Please use with caution and test thoroughly before use.
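
One way to follow that advice is a small table of expected results. The
sketch below copies five of the patterns verbatim and checks them against
invented inputs with Python's re.fullmatch. The accepts() helper and the
test strings are assumptions for illustration; other engines may treat some
escapes differently.

```python
import re

# Patterns copied verbatim from the table above.
DATE = r"(\d{1,2}\/\d{1,2}\/\d{4})"
IMAGE = r"([^\s]+(?=\.(jpg|gif|png))\.\2)"
ONE_TO_FIFTY = r"(^[1-9]{1}$|^[1-4]{1}[0-9]{1}$|^50$)"
HEX_COLOUR = r"(#?([A-Fa-f0-9]){3}(([A-Fa-f0-9]){3})?)"
PASSWORD = r"((?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,15})"

def accepts(pattern, s):
    """True if pattern matches the whole of s."""
    return re.fullmatch(pattern, s) is not None

checks = {
    "date": accepts(DATE, "21/3/2006"),
    "date shape only": accepts(DATE, "99/99/0000"),  # values are not validated
    "image": accepts(IMAGE, "photo.jpg"),
    "not image": accepts(IMAGE, "photo.bmp"),
    "numbers in range": [accepts(ONE_TO_FIFTY, s) for s in ("1", "49", "50")],
    "numbers out of range": [accepts(ONE_TO_FIFTY, s) for s in ("0", "51", "07")],
    "colours": [accepts(HEX_COLOUR, s) for s in ("#fff", "#1A2b3C")],
    "not colours": [accepts(HEX_COLOUR, s) for s in ("ffff", "#ggg")],
    "password": accepts(PASSWORD, "Passw0rdX"),
    "weak passwords": [accepts(PASSWORD, s) for s in ("password1", "Sh0rt")],
}

for name, result in checks.items():
    print(name, result)
```

The "date shape only" row shows why testing matters: the date pattern checks
the layout of the digits, not whether the day and month exist.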


