# Advanced
# Shorthand Character Sets
Shorthand | Description |
---|---|
. | Any character except new line |
\w | Matches alphanumeric characters: [a-zA-Z0-9_] |
\W | Matches non-alphanumeric characters: [^\w] |
\d | Matches digits: [0-9] |
\D | Matches non-digits: [^\d] |
\s | Matches whitespace characters: [\t\n\f\r\p{Z}] |
\S | Matches non-whitespace characters: [^\s] |
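The shorthands in the table can be tried out with a small sketch. It assumes Python's `re` module, which these notes do not name, and the sample string is made up for illustration. In Python, `\w`, `\d` and `\s` are Unicode-aware on `str` patterns unless `re.ASCII` is passed, so they are slightly broader than the ASCII classes listed above.

```python
import re

# Hypothetical sample input, chosen only for illustration.
sample = "Room 42, floor_3!"

words = re.findall(r"\w+", sample)      # runs of word characters
digits = re.findall(r"\d+", sample)     # runs of digits
chunks = re.findall(r"\S+", sample)     # runs of non-whitespace characters
non_words = re.findall(r"\W", sample)   # single non-word characters

print(words)      # ['Room', '42', 'floor_3']
print(digits)     # ['42', '3']
print(chunks)     # ['Room', '42,', 'floor_3!']
print(non_words)  # [' ', ',', ' ', '!']
```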
# Lookarounds
keywords: lookbehinds, lookaheads, assertions
Symbol | Description |
---|---|
?= | Positive Lookahead |
?! | Negative Lookahead |
?<= | Positive Lookbehind |
?<! | Negative Lookbehind |
- Lookarounds are specific types of non-capturing groups (used to match a pattern without including it in the matching list).
- Lookarounds are used when a pattern must be preceded or followed by another pattern.
# Positive Lookahead
?=
keywords: match only what is followed by xx: `(?=xx)`
- The positive lookahead asserts that the first part of the expression must be followed by the lookahead expression.
- `(T|t)he(?=\sfat)` means: match either a lowercase `t` or an uppercase `T`, followed by the letters `he`, so that `The` or `the` matches only if it is followed by the word `fat` (`\s` = whitespace).

"(T|t)he(?=\sfat)" => **The** fat cat sat on the mat.
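
The example above can be checked with a short sketch, assuming Python's `re` module (the notes themselves are language-neutral). `finditer` is used instead of `findall` because the pattern contains a capturing group, and `findall` would return only the group's content.

```python
import re

text = "The fat cat sat on the mat."
lookahead = re.compile(r"(T|t)he(?=\sfat)")

# group(0) is the whole match; the looked-ahead " fat" is not part of it.
ahead_matches = [m.group(0) for m in lookahead.finditer(text)]
print(ahead_matches)  # ['The']
```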
# Negative Lookahead
?!
keywords: match only what is not followed by xx: `(?!xx)`
- Negative lookaheads are used when we need to get all matches from an input string that are not followed by a certain pattern.

"(T|t)he(?!\sfat)" => The fat cat sat on **the** mat.
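
As a rough check of the negative lookahead, here is the same sentence run through Python's `re` module (an assumed choice of language). The variable names are illustrative only.

```python
import re

text = "The fat cat sat on the mat."
negative_ahead = re.compile(r"(T|t)he(?!\sfat)")

# Collect (matched text, start index) for every match.
not_followed = [(m.group(0), m.start()) for m in negative_ahead.finditer(text)]
print(not_followed)  # [('the', 19)]
```

Only the second `the` survives, because the first `The` is directly followed by ` fat`.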
# Positive Lookbehind
?<=
keywords: match xx only if it is preceded by condition: `(?<=condition)xx`
- Positive lookbehinds are used to get all the matches that are preceded by a specific pattern.
- `(?<=(T|t)he\s)(fat|mat)` means: get all `fat` or `mat` words from the input string that come after the word `The` or `the`.

"(?<=(T|t)he\s)(fat|mat)" => The **fat** cat sat on the **mat**.
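
A minimal sketch of the positive lookbehind, assuming Python's `re` module. Python requires lookbehind patterns to have a fixed width; `(T|t)he\s` always spans four characters, so it is accepted, but other engines may have different rules.

```python
import re

text = "The fat cat sat on the mat."
lookbehind = re.compile(r"(?<=(T|t)he\s)(fat|mat)")

# The preceding "The " / "the " is required but not included in the match.
preceded = [m.group(0) for m in lookbehind.finditer(text)]
print(preceded)  # ['fat', 'mat']
```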
# Negative Lookbehind
?<!
keywords: match xx only if it is not preceded by condition: `(?<!condition)xx`
- Negative lookbehinds are used to get all the matches that are not preceded by a specific pattern.
- `(?<!(T|t)he\s)(cat)` means: get all `cat` words from the input string that do not come after the word `The` or `the`.

"(?<!(T|t)he\s)(cat)" => The cat sat on **cat**.
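
The negative lookbehind can be sketched the same way, again assuming Python's `re` module. Recording the start index shows which of the two `cat` occurrences matched.

```python
import re

text = "The cat sat on cat."
negative_behind = re.compile(r"(?<!(T|t)he\s)(cat)")

# The first "cat" (index 4) follows "The " and is rejected.
not_preceded = [(m.group(0), m.start()) for m in negative_behind.finditer(text)]
print(not_preceded)  # [('cat', 15)]
```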
# Flags
keywords: modifiers, RegExp
Flag | Description |
---|---|
i | Case insensitive: Match will be case-insensitive |
g | Global Search: Match all instances, not just the first |
m | Multiline: Anchor meta characters work on each line |
# Case Insensitive
/i
"The" => **The** fat cat sat on the mat.
"/The/gi" => **The** fat cat sat on **the** mat.
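
In Python's `re` module (an assumed host language; the `/.../gi` literal syntax above comes from other engines), the `i` flag corresponds to `re.IGNORECASE`. A small sketch:

```python
import re

text = "The fat cat sat on the mat."

# Without the flag only the exact-case "The" is found.
case_sensitive = re.findall(r"The", text)
# re.IGNORECASE plays the role of the i flag.
case_insensitive = re.findall(r"The", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['The']
print(case_insensitive)  # ['The', 'the']
```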
# Global Search
/g
`/.(at)/g` means: any character except a newline, followed by the lowercase characters `at`. With the `g` flag, it finds all matches in the input string, not just the first one (which is the default behavior).

"/.(at)/" => The **fat** cat sat on the mat.
"/.(at)/g" => The **fat** **cat** **sat** on the **mat**.
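
Python's `re` module has no `g` flag; as a rough equivalent, `re.search` returns only the first match while `re.finditer` walks over all of them. The sketch below assumes that mapping.

```python
import re

text = "The fat cat sat on the mat."
at_word = re.compile(r".(at)")

# First match only, comparable to running the pattern without g.
first_only = at_word.search(text).group(0)
# Every match, comparable to running the pattern with g.
all_found = [m.group(0) for m in at_word.finditer(text)]

print(first_only)  # fat
print(all_found)   # ['fat', 'cat', 'sat', 'mat']
```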
# Multiline
/m
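- The `m` flag performs a multi-line match: the anchor `$` matches at the end of each line, not only at the end of the whole input.

"/.at(.)?$/gm" =>
The **fat**
cat **sat**
on the **mat.**

"/.at(.)?$/" =>
The fat
cat sat
on the **mat.**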
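
The difference is easy to see in a short sketch, assuming Python's `re` module, where the `m` flag is spelled `re.MULTILINE`.

```python
import re

lines = "The fat\ncat sat\non the mat."
pattern = r".at(.)?$"

# Without MULTILINE, $ only matches at the very end of the input.
end_of_input = [m.group(0) for m in re.finditer(pattern, lines)]
# With MULTILINE, $ also matches just before each newline.
end_of_each_line = [m.group(0) for m in re.finditer(pattern, lines, flags=re.MULTILINE)]

print(end_of_input)      # ['mat.']
print(end_of_each_line)  # ['fat', 'sat', 'mat.']
```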
# Greedy vs Lazy Matching
?
- By default, a regex will perform a greedy match, which means the match will be as long as possible.

"/(.*at)/" => **The fat cat sat on the mat**.
- Use `?` to match in a lazy way, which means the match should be as short as possible.

"/(.*?at)/" => **The fat** cat sat on the mat.
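
Both behaviors can be compared side by side in a minimal sketch, assuming Python's `re` module:

```python
import re

text = "The fat cat sat on the mat."

# Greedy: .* grabs as much as possible before the final "at".
greedy = re.search(r"(.*at)", text).group(0)
# Lazy: .*? stops at the first "at" it can reach.
lazy = re.search(r"(.*?at)", text).group(0)

print(greedy)  # The fat cat sat on the mat
print(lazy)    # The fat
```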