How to understand the search pattern that regex defines?

Summary

Regex stands for Regular Expression. A regex is a sequence of characters that defines a search pattern, which you can use to search through large amounts of text.

Use the following table of contents to learn more about a specific area of regex.

Table of Contents

Components

    Single Characters
              Any character from the keyboard matches itself,
              except 
              `(^ . [ $ ( ) | * + ? { \)`
              
              // these are operator characters; prefix one with \ to match it literally //
              
    Wild Card
              The period by itself: .
              It matches any single character except \n (newline).

     Bracket Expressions
              Represents a character set enclosed by '[' and ']'.
              It matches any one character from that set.

     Control characters
              A backslash, \,
              followed by one of the characters 
              `a, b, f, n, r, t, v`
              matches the corresponding control character,
              for example \n (newline) or \t (tab).

     Escape character sets
              Special sequences that stand for a set of characters.
              \d is a digit character: [0-9]
              \w is a word (program identifier) character: 
              [A-Za-z0-9_]
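
The components above can be tried out with a short sketch. The `sample` string and the variable names are made up for illustration, and the behavior shown is that of JavaScript's built-in RegExp; other regex flavors may differ in details.

```javascript
// Hypothetical sample text, used only for illustration.
const sample = "cat cot c\nt id_42";

// Single characters match themselves.
const literal = sample.match(/cat/)[0];      // "cat"

// The wildcard . matches any character except a newline,
// so "c\nt" is not matched.
const wildcard = sample.match(/c.t/g);       // ["cat", "cot"]

// An operator character must be escaped to match it literally.
const escapedDot = "3.14".match(/\./)[0];    // "."

// \d matches a digit; \w matches a letter, digit, or underscore.
const digits = sample.match(/\d/g);          // ["4", "2"]
const lastWord = sample.match(/\w+$/)[0];    // "id_42"

console.log(literal, wildcard, escapedDot, digits, lastWord);
```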

Anchors

Special sequences that match an empty substring:

      ^ matches at the beginning of the target string
      $ matches at the end of the target string
      \b matches at a word boundary
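
A minimal sketch of the ^ and $ anchors, assuming JavaScript's RegExp without the multiline flag. The `target` string is a made-up example.

```javascript
// Hypothetical target string.
const target = "regex is fun";

const startsWithRegex = /^regex/.test(target);  // true
const startsWithFun = /^fun/.test(target);      // false: "fun" is not at the start
const endsWithFun = /fun$/.test(target);        // true

// An anchor matches a position (an empty substring), not a character,
// so replacing ^ inserts text without removing anything.
const prefixed = target.replace(/^/, "> ");     // "> regex is fun"

console.log(startsWithRegex, startsWithFun, endsWithFun, prefixed);
```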

Recursive

Any regular expression surrounded by parentheses is an atom:

     ( regular_expression )

Quantifiers

Quantifiers specify how many times an atom may repeat, including an unbounded number of times.

An atom can optionally be followed by one of these quantifiers:

     *   represents 0 or more occurrences of the atom
     +   represents 1 or more occurrences of the atom
     ?   represents 0 or 1 occurrences of the atom
     {n} represents exactly n occurrences of the atom
     {m,n} represents between m and n occurrences of the atom
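
Each quantifier in the list can be checked against a few inputs. This is a sketch with made-up patterns and test strings; the ^ and $ anchors are added so that the whole string has to match.

```javascript
// Illustrative patterns, one per quantifier.
const zeroOrMore = /^ab*c$/;     // * : zero or more "b"
const oneOrMore = /^ab+c$/;      // + : one or more "b"
const zeroOrOne = /^colou?r$/;   // ? : optional "u"
const exactlyFour = /^\d{4}$/;   // {n} : exactly four digits
const twoToFour = /^\d{2,4}$/;   // {m,n} : two to four digits

const results = {
  star: ["ac", "abc", "abbbc"].map((s) => zeroOrMore.test(s)),            // [true, true, true]
  plus: ["ac", "abc", "abbbc"].map((s) => oneOrMore.test(s)),             // [false, true, true]
  question: ["color", "colour", "colouur"].map((s) => zeroOrOne.test(s)), // [true, true, false]
  exact: ["123", "1234", "12345"].map((s) => exactlyFour.test(s)),        // [false, true, false]
  range: ["1", "12", "1234", "12345"].map((s) => twoToFour.test(s)),      // [false, true, true, false]
};

console.log(results);
```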

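OR Operator

The | operator matches either the expression on its left or the one on its right. It is one of the operator characters: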

     (^ . [ $ ( ) | * + ? { \)
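
As a small illustration of the | operator, the sketch below matches one of two alternatives. The words and variable names are hypothetical examples.

```javascript
// Matches either "cat" or "dog" anywhere in the input.
const pet = /cat|dog/;
const hasPet = pet.test("hotdog");          // true
const noPet = pet.test("bird");             // false

// Parentheses limit how far the alternation reaches:
// only the letter between "gr" and "y" varies.
const greys = "gray grey groy".match(/gr(a|e)y/g);  // ["gray", "grey"]

console.log(hasPet, noPet, greys);
```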

Character Classes

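A character class defines a type of character, any one of which may match at a given position, for example \d for digits or \w for word characters.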

Flags

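A regular expression consists of a pattern and optional flags that change how the search is performed.

     const regexp = new RegExp("pattern", "gi"); // "g" = find all matches, "i" = ignore case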
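
To see what flags change, here is a sketch comparing a search without flags to one with the "g" (global) and "i" (ignore case) flags. The `text` string is a made-up example.

```javascript
// Hypothetical input with three spellings of the same word.
const text = "Regex, regex, REGEX";

// No flags: case-sensitive, and only the first match is returned.
const first = text.match(new RegExp("regex"));
const firstText = first[0];      // "regex"
const firstIndex = first.index;  // 7

// "g" finds every match, "i" ignores case.
const all = text.match(new RegExp("regex", "gi"));  // ["Regex", "regex", "REGEX"]

// The literal form /pattern/flags is equivalent to the constructor.
const literalFlags = /regex/gi.flags;  // "gi"

console.log(firstText, firstIndex, all, literalFlags);
```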

Grouping and Capturing

A way to treat multiple characters as a single unit.

     (regex) 
     // Lets you apply regex operators to the entire group.
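
A short sketch of both uses of parentheses: applying a quantifier to a whole group, and capturing the text each group matched. The date string and variable names are illustrative assumptions.

```javascript
// Grouping: + applies to the whole group "ha", not just to "a".
const grouped = /^(ha)+$/.test("hahaha");    // true
const ungrouped = /^ha+$/.test("hahaha");    // false: only "a" may repeat

// Capturing: each pair of parentheses stores what it matched.
// Index 0 of the result is the full match, so it is skipped here.
const dateMatch = "2024-05-17".match(/(\d{4})-(\d{2})-(\d{2})/);
const [, year, month, day] = dateMatch;      // "2024", "05", "17"

console.log(grouped, ungrouped, year, month, day);
```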

Bracket Expressions

A bracket expression is a list of characters, any one of which may match, enclosed by:

     '[' and ']'
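
The sketch below shows a few common bracket expressions. Ranges such as 0-9 and negation with a leading ^ are not described above; they are standard RegExp features added here for illustration, and the inputs are made up.

```javascript
// Any one character from the list.
const vowels = "regular".match(/[aeiou]/g);      // ["e", "u", "a"]

// A range inside brackets: hexadecimal digits only.
const isHex = /^[0-9a-f]+$/.test("c0ffee");      // true

// A leading ^ inside brackets negates the set.
const nonDigits = "a1b2".match(/[^0-9]/g);       // ["a", "b"]

// Most operator characters are literal inside brackets.
const dotOrPlus = "1.5+2".match(/[.+]/g);        // [".", "+"]

console.log(vowels, isHex, nonDigits, dotOrPlus);
```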

Greedy and Lazy Match

     'Greedy' means match longest possible string. 
     'Lazy' means match shortest possible string.
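
The difference is easiest to see on one input. In this sketch the `html` string is a made-up example; adding ? after a quantifier makes it lazy.

```javascript
// Hypothetical input containing two tags.
const html = "<b>bold</b>";

// Greedy: .+ takes as much as it can, up to the last ">".
const greedy = html.match(/<.+>/)[0];   // "<b>bold</b>"

// Lazy: .+? takes as little as it can, up to the first ">".
const lazy = html.match(/<.+?>/)[0];    // "<b>"

console.log(greedy, lazy);
```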

Boundaries

A word boundary (\b) is a position between

     \w and \W (non-word char)

or at the beginning or the end of a string if it begins or ends (respectively) with a word character.
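
A minimal sketch of how \b narrows a search to whole words. The `sentence` string is a hypothetical example.

```javascript
// Hypothetical input: "cat" appears alone, inside "concat", and inside "cats".
const sentence = "cat concat cats";

// Without boundaries, every occurrence counts.
const anywhere = sentence.match(/cat/g).length;        // 3

// \b on both sides: only the standalone word matches.
const wholeWord = sentence.match(/\bcat\b/g).length;   // 1

// \b on the left only: words that start with "cat".
const wordStart = sentence.match(/\bcat/g).length;     // 2

console.log(anywhere, wholeWord, wordStart);
```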

Back-references

A backreference in regex identifies a previously matched group and looks for exactly the same text again.
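
As an illustration, \1 refers back to whatever the first capturing group matched. The sketch below uses made-up inputs to find a doubled word and a quoted string whose closing quote must equal its opening quote.

```javascript
// (\w+) captures a word; \1 requires the same word again.
// The \b anchors keep it from matching inside longer words.
const doubledWord = /\b(\w+) \1\b/;
const doubled = "this is is a test".match(doubledWord)[0];  // "is is"

// (["']) captures the opening quote; \1 requires the same quote to close.
// The apostrophe inside the text does not end the match.
const quoted = `say "it's" now`.match(/(["']).*?\1/)[0];    // "\"it's\""

console.log(doubled, quoted);
```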

Look-ahead and Look-behind

Also known as “lookaround”

     (?!) - negative lookahead
     (?=) - positive lookahead
     (?<=) - positive lookbehind
     (?<!) - negative lookbehind

     (?>) - atomic group

     Given the target string foobarbarfoo:

     bar(?=bar)  finds the 1st bar ("bar" which has "bar" after it)
     bar(?!bar)  finds the 2nd bar ("bar" which does not have "bar" after it)
     (?<=foo)bar finds the 1st bar ("bar" which has "foo" before it)
     (?<!foo)bar finds the 2nd bar ("bar" which does not have "foo" before it)
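
The four results can be reproduced with a short sketch. It assumes the target string is foobarbarfoo, which is consistent with all four descriptions, and a JavaScript engine with lookbehind support (ES2018 or later). The atomic group (?>) is left out because JavaScript's RegExp does not support it.

```javascript
// Assumed target string: the 1st "bar" is at index 3, the 2nd at index 6.
const subject = "foobarbarfoo";

// search() returns the index of the first match.
const positiveLookahead = subject.search(/bar(?=bar)/);    // 3 (1st bar)
const negativeLookahead = subject.search(/bar(?!bar)/);    // 6 (2nd bar)
const positiveLookbehind = subject.search(/(?<=foo)bar/);  // 3 (1st bar)
const negativeLookbehind = subject.search(/(?<!foo)bar/);  // 6 (2nd bar)

// Lookarounds only check the surrounding text; the match itself is just "bar".
const matched = subject.match(/bar(?=bar)/)[0];            // "bar"

console.log(positiveLookahead, negativeLookahead, positiveLookbehind, negativeLookbehind, matched);
```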