# Detail
# The Full Stop
`.`

- matches any single character; it does not match return or newline characters
- `.ar` means: any character, followed by the letters `ar`

`".ar"` => The **car** **par**ked in the **gar**age
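
A minimal sketch of the example above, assuming Python's built-in `re` module (the notes do not name an engine, so this choice and the variable names are illustrative). Note that in Python only `\n` is excluded by default; how a carriage return is treated depends on the engine.

```python
import re

text = "The car parked in the garage"

# "." stands for any single character, so ".ar" finds every
# three-character run that ends in "ar".
dot_matches = re.findall(r".ar", text)
print(dot_matches)  # ['car', 'par', 'gar']

# By default "." does not match a newline in Python.
newline_default = re.findall(r".ar", "\nar")
print(newline_default)  # []

# The re.DOTALL flag lifts that restriction.
newline_dotall = re.findall(r".ar", "\nar", re.DOTALL)
print(newline_dotall)  # ['\nar']
```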
# Character Sets
`[]`

- Are also called character classes
- The order of the characters inside the square brackets doesn't matter
- `[Tt]he` means: an uppercase `T` or lowercase `t`, followed by the letters `he`

`"[Tt]he"` => **The** car parked in **the** garage

- `[.]` only means a literal `.` inside `[]`

`"ar[.]"` => A garage is a good place to park a c**ar.**
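
The two character-set examples can be checked with a short sketch, again assuming Python's `re` module; the variable names are only for illustration.

```python
import re

text = "The car parked in the garage"

# [Tt] accepts either an uppercase T or a lowercase t.
set_matches = re.findall(r"[Tt]he", text)
print(set_matches)  # ['The', 'the']

# The order inside the brackets does not change the result.
swapped_matches = re.findall(r"[tT]he", text)
print(swapped_matches == set_matches)  # True

# Inside [] the dot loses its special meaning and is a literal period.
literal_dot = re.findall(r"ar[.]", "A garage is a good place to park a car.")
print(literal_dot)  # ['ar.']
```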
# Negated Character Sets
`[^ ]`

- `[^c]ar` means: any character except `c`, followed by the characters `ar`

`"[^c]ar"` => The car **par**ked in the **gar**age
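
As a quick check of the negated set, here is a small sketch that assumes Python's `re` module:

```python
import re

text = "The car parked in the garage"

# [^c] matches any single character except "c",
# so "car" is skipped while "par" and "gar" are found.
negated_matches = re.findall(r"[^c]ar", text)
print(negated_matches)  # ['par', 'gar']
```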
# Repetitions
`+`, `*` or `?`

- specify how many times a subpattern can occur
# The Star
`*`

keywords: preceding, 0 or more (0 means the preceding item may be absent)

- The `*` symbol matches zero or more repetitions of the preceding matcher
- `a*` means: zero or more repetitions of the preceding lowercase character `a`
- `[a-z]*` means: any number of lowercase letters in a row

`"[a-z]*"` => T**he** **car** **parked** **in** **the** **garage**

- `.*` matches any string of characters
- `\s*` (`\s` is whitespace) matches a string of whitespace characters
- `\s*cat\s*` means: zero or more spaces, followed by a lowercase `cat`, followed by zero or more spaces

`"\s*cat\s*"` => The fat **cat** sat on the con**cat**enation.
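
The star examples can be reproduced with the sketch below, assuming Python's `re` module. Because `*` allows zero repetitions, `[a-z]*` also matches the empty string, so the sketch filters empty results out before printing.

```python
import re

sentence = "The car parked in the garage"

# "[a-z]*" may match nothing at all, so empty hits are dropped here.
lower_runs = [m for m in re.findall(r"[a-z]*", sentence) if m]
print(lower_runs)  # ['he', 'car', 'parked', 'in', 'the', 'garage']

cat_text = "The fat cat sat on the concatenation."

# "\s*" takes surrounding whitespace when present and nothing otherwise,
# which is why the "cat" inside "concatenation" is also found.
cat_matches = re.findall(r"\s*cat\s*", cat_text)
print(cat_matches)  # [' cat ', 'cat']
```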
# The Plus
`+`

keywords: 1 or more, preceding, at least (must occur at least once)

- `c.+t` means: a lowercase `c`, followed by at least one character, followed by a lowercase `t`

`"c.+t"` => The fat **cat**
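
A short sketch, assuming Python's `re` module, that runs the example and contrasts `+` with `*` on the hypothetical input `"ct"`:

```python
import re

plus_matches = re.findall(r"c.+t", "The fat cat")
print(plus_matches)  # ['cat']

# "+" needs at least one character between "c" and "t" ...
plus_on_ct = re.search(r"c.+t", "ct")
print(plus_on_ct)  # None

# ... while "*" is satisfied with zero characters.
star_on_ct = re.search(r"c.*t", "ct")
print(star_on_ct.group(0))  # ct
```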
# The Question Mark
`?`

keywords: optional, 0 or 1

- makes the preceding character optional
- `[T]?he` means: optional uppercase `T`, followed by a lowercase `he`

`"[T]?he"` => **The** car is parked in t**he** garage
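
The optional `T` can be seen in this small sketch, which assumes Python's `re` module:

```python
import re

text = "The car is parked in the garage"

# With the T present the match is "The"; without it only "he" is matched,
# which is why the second hit starts inside the word "the".
optional_matches = re.findall(r"[T]?he", text)
print(optional_matches)  # ['The', 'he']
```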
# Braces
`{n,m}`, `{n}`, `{n,}`

keywords: at least, not more than, exactly n times

- `[0-9]{2,3}` means: match at least 2 digits, but not more than 3, each ranging from 0 to 9

`"[0-9]{2,3}"` => The number was 9.**999**7, **10**.0

`"[0-9]{2,}"` => The number was 9.**9997**, **10**.0

`"[0-9]{3}"` => The number was 9.**999**7, 10.0
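
The three brace forms can be compared side by side. This is a sketch assuming Python's `re` module; the sample string follows the examples above.

```python
import re

text = "The number was 9.9997, 10.0"

# {2,3}: at least two digits, at most three.
two_to_three = re.findall(r"[0-9]{2,3}", text)
print(two_to_three)  # ['999', '10']

# {2,}: two digits or more, with no upper bound.
two_or_more = re.findall(r"[0-9]{2,}", text)
print(two_or_more)  # ['9997', '10']

# {3}: exactly three digits.
exactly_three = re.findall(r"[0-9]{3}", text)
print(exactly_three)  # ['999']
```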
# Capturing and Non-Capturing Groups
`(...)`, `(?:...)`
# Basic
`()\1`

- The number after the backslash refers to a `()` group by its order: `\1` is the first group, `\2` the second, and so on
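
A small sketch of a backreference, assuming Python's `re` module; the input strings are made up for illustration.

```python
import re

# (\w) captures one word character; \1 requires the same character again.
doubled = re.search(r"(\w)\1", "hello")
print(doubled.group(0))  # ll
print(doubled.group(1))  # l

# The same idea with a whole word: \1 must repeat what group 1 captured.
repeated_word = re.search(r"(\w+) \1", "it was was fine")
print(repeated_word.group(0))  # was was
```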
# Capturing
- Note that capturing groups do not only match, but also capture
- if we put a quantifier after a character then it will repeat the preceding character. But if we put a quantifier after a capturing group then it repeats the whole capturing group
- `(ab)*` matches zero or more repetitions of the string "ab"

`"(c|g|p)ar"` => The **car** is **par**ked in the **gar**age
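
The difference between matching and capturing can be sketched as follows, assuming Python's `re` module. Note that `re.findall` returns the captured group rather than the full match when the pattern contains exactly one group.

```python
import re

text = "The car is parked in the garage"

# group(0) is the whole match.
full_matches = [m.group(0) for m in re.finditer(r"(c|g|p)ar", text)]
print(full_matches)  # ['car', 'par', 'gar']

# With one capturing group, findall reports only what the group captured.
captured = re.findall(r"(c|g|p)ar", text)
print(captured)  # ['c', 'p', 'g']

# A quantifier after a group repeats the whole group ...
group_repeat = re.fullmatch(r"(ab)*", "ababab")
print(group_repeat is not None)  # True

# ... while without the group it repeats only the preceding character.
char_repeat = re.fullmatch(r"ab*", "ababab")
print(char_repeat)  # None
```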
# Non-Capturing
- A non-capturing group is a group that matches the characters but does not capture them
- `(?:c|g|p)ar` will not create a capture group
- When efficiency is not a concern, you can skip non-capturing groups to keep the regular expression more readable
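
A brief sketch, assuming Python's `re` module, showing that `(?:...)` matches the same text but creates no group:

```python
import re

text = "The car is parked in the garage"

# No capture group exists, so findall returns the full matches.
noncapturing_matches = re.findall(r"(?:c|g|p)ar", text)
print(noncapturing_matches)  # ['car', 'par', 'gar']

# The compiled pattern reports how many capturing groups it contains.
noncapturing_count = re.compile(r"(?:c|g|p)ar").groups
capturing_count = re.compile(r"(c|g|p)ar").groups
print(noncapturing_count, capturing_count)  # 0 1
```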
# Alternation
`|`

keywords: OR

- character sets work at the character level but alternation works at the expression level
- `(T|t)he|car` means: either (an uppercase `T` or a lowercase `t`, followed by a lowercase `he`) OR (a lowercase `car`)

`"(T|t)he|car"` => **The** **car** is parked in **the** garage
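
The alternation example, sketched with Python's `re` module (an assumption, as before). `re.finditer` is used so that the full match is reported instead of the inner `(T|t)` group.

```python
import re

text = "The car is parked in the garage"

alternation_matches = [m.group(0) for m in re.finditer(r"(T|t)he|car", text)]
print(alternation_matches)  # ['The', 'car', 'the']

# The top-level "|" splits the whole expression into two alternatives,
# so the pattern matches "The" or "car", never the two joined together.
joined = re.fullmatch(r"(T|t)he|car", "Thecar")
print(joined)  # None
```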
# Escaping Special Characters
`\`

keywords: escape

- note: put it before the character you want to match literally, e.g. `\.`
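
A sketch of escaping, assuming Python's `re` module. `re.escape` is a Python helper that escapes a whole string; it is shown only as a convenience and is not mentioned in the notes above.

```python
import re

# Unescaped, "." matches every character.
unescaped = re.findall(r".", "a.b")
print(unescaped)  # ['a', '.', 'b']

# Escaped, "\." matches only a literal period.
escaped = re.findall(r"\.", "a.b")
print(escaped)  # ['.']

# Used as a pattern, "1+1=2" means "one or more 1s, then 1=2",
# so it does not match itself until it is escaped.
literal = "1+1=2"
raw_attempt = re.fullmatch(literal, literal)
escaped_attempt = re.fullmatch(re.escape(literal), literal)
print(raw_attempt)                  # None
print(escaped_attempt is not None)  # True
```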
# Anchors
`^`, `$`

keywords: string, start, end

- check whether the match sits at the start (`^`) or at the end (`$`) of the input string
# The Caret
- `^a` matches the `a` at the start of `abc`
- `^b` does not match anything in `abc`, because `b` is not the first character
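
The caret behaviour on `abc`, as a minimal sketch assuming Python's `re` module:

```python
import re

# "a" is the first character, so the anchored pattern matches.
caret_a = re.search(r"^a", "abc")
print(caret_a.group(0))  # a

# "b" occurs in the string, but not at the start.
caret_b = re.search(r"^b", "abc")
print(caret_b)  # None

# Without the anchor the same "b" is found.
plain_b = re.search(r"b", "abc")
print(plain_b.start())  # 1
```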
# The Dollar Sign
- `(at\.)$` means: a lowercase `at`, followed by a `.` character, and the matcher must be at the end of the string

`"(at\.)$"` => The fat cat. sat. on the m**at.**

`"(at\.)"` => The fat c**at.** s**at.** on the m**at.**
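
The effect of `$` can be checked with the sketch below, assuming Python's `re` module:

```python
import re

text = "The fat cat. sat. on the mat."

# Anchored with "$", only the final "at." qualifies.
anchored = re.findall(r"(at\.)$", text)
print(anchored)  # ['at.']

# Without the anchor, every "at." in the string is found.
unanchored = re.findall(r"(at\.)", text)
print(unanchored)  # ['at.', 'at.', 'at.']

# The anchored match ends exactly where the string ends.
last = re.search(r"(at\.)$", text)
print(last.end() == len(text))  # True
```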