# Detail

# The Full Stop

.

  • matches any single character
  • does not match line-break characters such as newline or carriage return (the exact set depends on the regex engine)
  • .ar means: any character, followed by the letters ar
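A quick way to check the bullets above is Python's built-in `re` module. This is only a sketch: the variable names are illustrative, and the newline behaviour shown is Python's default (other engines may exclude more line-break characters).

```python
import re

sentence = "The car parked in the garage"

# ".ar" = any single character followed by "ar"
dot_matches = re.findall(r".ar", sentence)
print(dot_matches)  # ['car', 'par', 'gar']

# By default "." does not match a newline, so "a.b" fails on "a\nb"
newline_match = re.search(r"a.b", "a\nb")
print(newline_match)  # None
```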

    ".ar" => The car parked in the garage

# Character Sets

[]

  • Are also called character classes
  • The order of the characters inside the square brackets doesn't matter
  • [Tt]he means: an uppercase T or lowercase t, followed by the letters he

    "[Tt]he" => The car parked in the garage

  • Inside [] a period is literal: [.] matches only the . character

    A garage is a good place to park a car.
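Both character-set notes can be tried out with Python's `re` module. A minimal sketch, reusing the two sentences from this section (the variable names are made up for illustration):

```python
import re

sentence = "The car parked in the garage"

# [Tt] and [tT] are the same set, so both patterns find the same words
the_matches = re.findall(r"[Tt]he", sentence)
reordered = re.findall(r"[tT]he", sentence)
print(the_matches, reordered)  # ['The', 'the'] ['The', 'the']

# Inside [] the period is literal, so only the final "." is found
period_matches = re.findall(r"[.]", "A garage is a good place to park a car.")
print(period_matches)  # ['.']
```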

# Negated Character Sets

[^ ]

  • [^c]ar means: any character except c, followed by the letters ar
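As a rough check in Python (`re` module, illustrative variable name), the negated set skips "car" but still finds the other two "ar" words in the section's sentence:

```python
import re

sentence = "The car parked in the garage"

# [^c] accepts any character except "c", so "car" is not matched
negated_matches = re.findall(r"[^c]ar", sentence)
print(negated_matches)  # ['par', 'gar']
```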

"[^c]ar" => The car parked in the garage

# Repetitions

+, * or ?

  • quantifiers that specify how many times the preceding subpattern can occur

# The Star

*
keywords: preceding, 0 or more (0 means the item may be absent)

  • The * symbol matches zero or more repetitions of the preceding matcher

  • a* means: zero or more repetitions of the preceding lowercase character a

  • [a-z]* means: any number of lowercase letters in a row

    "[a-z]*" => The car parked in the garage

  • .* matches any string of characters

  • * can be combined with \s (whitespace) to match a string of whitespace characters

  • \s*cat\s* means: zero or more whitespace characters, followed by a lowercase cat, followed by zero or more whitespace characters

    "\s*cat\s*" => The fat cat sat on the concatenation.
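The star patterns above can be sketched in Python with the `re` module. The variable names are illustrative. Because `*` also allows zero repetitions, `findall` returns empty strings for `[a-z]*`, which are filtered out here.

```python
import re

# a* accepts zero repetitions (empty string) as well as several
zero_ok = re.fullmatch(r"a*", "") is not None
many_ok = re.fullmatch(r"a*", "aaa") is not None
print(zero_ok, many_ok)  # True True

# [a-z]* also matches the empty string, so drop the empty results
lower_runs = [m for m in re.findall(r"[a-z]*", "The car parked in the garage") if m]
print(lower_runs)  # ['he', 'car', 'parked', 'in', 'the', 'garage']

# \s*cat\s* takes surrounding whitespace when present, none otherwise
cat_matches = re.findall(r"\s*cat\s*", "The fat cat sat on the concatenation.")
print(cat_matches)  # [' cat ', 'cat']
```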

# The Plus

+
keywords: 1 or more, preceding, at least (must appear at least once)

  • c.+t means: a lowercase c, followed by at least one character, followed by a lowercase t
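A small Python sketch of the "at least one" rule, using the `re` module. The string "ct" is an extra test input added here for contrast, and the variable names are illustrative.

```python
import re

# c.+t needs at least one character between "c" and "t"
plus_matches = re.findall(r"c.+t", "The fat cat")
print(plus_matches)  # ['cat']

# "ct" has nothing between c and t: + rejects it, * accepts it
plus_on_ct = re.search(r"c.+t", "ct")
star_on_ct = re.search(r"c.*t", "ct")
print(plus_on_ct, star_on_ct is not None)  # None True
```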

    "c.+t" => The fat cat

# The Question Mark

?
keywords: optional, 0 or 1

  • makes the preceding character optional
  • [T]?he means: an optional uppercase T, followed by lowercase he
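In Python this can be checked as follows (a sketch with the `re` module; the variable name is illustrative). Note that the lowercase "the" only contributes "he", because the optional part is an uppercase T.

```python
import re

sentence = "The car is parked in the garage"

# The T is optional: "The" matches with it, "he" (inside "the") without it
optional_matches = re.findall(r"[T]?he", sentence)
print(optional_matches)  # ['The', 'he']
```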

    "[T]?he" => The car is parked in the garage

# Braces

{n,m}, {n}, {n,}
keywords: at least, not more than, exactly n times

  • [0-9]{2,3} means: match at least 2 but not more than 3 digits (characters in the range 0 to 9)
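The three brace forms can be compared side by side in Python. A minimal sketch with the `re` module, run on the sentence used in this section's examples (variable names are illustrative):

```python
import re

text = "The number was 9.9997 , 10.0"

# {2,3}: between 2 and 3 digits; "9997" yields "999" and leaves a lone "7"
two_to_three = re.findall(r"[0-9]{2,3}", text)
# {2,}: 2 or more digits, with no upper limit
two_or_more = re.findall(r"[0-9]{2,}", text)
# {3}: exactly 3 digits
exactly_three = re.findall(r"[0-9]{3}", text)

print(two_to_three)   # ['999', '10']
print(two_or_more)    # ['9997', '10']
print(exactly_three)  # ['999']
```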

    "[0-9]{2,3}" => The number was 9.9997 , 10.0
    "[0-9]{2,}" => The number was 9.9997 , 10.0
    "[0-9]{3}" => The number was 9.9997 , 10.0

# Capturing and Non-Capturing Groups

(...),(?:...)

# Basic

()\1

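  • The number after the backslash refers to a () group by its order: \1 means the text captured by the first group, counting opening parentheses from left to right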
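A backreference can be sketched in Python like this. The pattern and the sample sentence are made up for illustration and do not come from the notes: `(\w+) \1` looks for a word immediately repeated after a space.

```python
import re

# \1 must repeat exactly what the first () group captured
repeated = re.search(r"(\w+) \1", "the fat fat cat")

print(repeated.group())   # 'fat fat' (the whole match)
print(repeated.group(1))  # 'fat'     (text captured by group 1)
```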

# Capturing

  • Note that capturing groups not only match characters but also capture them for later use
    • a quantifier after a single character repeats that character, but a quantifier after a capturing group repeats the whole group
    • (ab)* matches zero or more repetitions of the string "ab"
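The difference between quantifying a character and quantifying a group, and what "capture" means in practice, can be sketched in Python (`re` module, illustrative variable names). One Python-specific detail: when a pattern contains a group, `findall` returns the captured text rather than the whole match.

```python
import re

# (ab)* repeats the whole group; ab* repeats only the "b"
group_repeat = re.fullmatch(r"(ab)*", "ababab") is not None
char_repeat = re.fullmatch(r"ab*", "ababab") is not None
print(group_repeat, char_repeat)  # True False

sentence = "The car is parked in the garage"

# findall returns what the group captured (the first letter only)
captured = re.findall(r"(c|g|p)ar", sentence)
# finditer gives access to the full match via group()
whole = [m.group() for m in re.finditer(r"(c|g|p)ar", sentence)]
print(captured)  # ['c', 'p', 'g']
print(whole)     # ['car', 'par', 'gar']
```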

"(c|g|p)ar" => The car is parked in the garage

# Non-Capturing

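  • A non-capturing group matches the characters like a normal group but does not capture them
  • (?:c|g|p)ar will not create a capture group
  • When efficiency is not a concern, you can skip non-capturing groups and use plain () groups, which keeps the regular expression easier to read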
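To see that (?:...) really creates no group, a short Python sketch (`re` module, illustrative variable names) can compare the group count of the two forms. The `groups` attribute of a compiled pattern reports how many capturing groups it has.

```python
import re

sentence = "The car is parked in the garage"

# With no capturing group, findall returns the whole matches
non_capturing = re.findall(r"(?:c|g|p)ar", sentence)
print(non_capturing)  # ['car', 'par', 'gar']

# Number of capturing groups in each pattern
capturing_count = re.compile(r"(c|g|p)ar").groups
non_capturing_count = re.compile(r"(?:c|g|p)ar").groups
print(capturing_count, non_capturing_count)  # 1 0
```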

# Alternation

|
keywords : OR

  • character sets work at the character level but alternation works at the expression level
  • (T|t)he|car means: either (an uppercase T or a lowercase t, followed by a lowercase he) OR (a lowercase car)
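The alternation above can be checked in Python. This sketch uses `finditer` so that the full matches are shown rather than only the (T|t) group; variable names are illustrative.

```python
import re

sentence = "The car is parked in the garage"

# Either "(T or t) + he" or the literal "car"
alternatives = [m.group() for m in re.finditer(r"(T|t)he|car", sentence)]
print(alternatives)  # ['The', 'car', 'the']
```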

    "(T|t)he|car" => The car is parked in the garage

# Escaping Special Characters

\
keywords: escape

  • note: put \ before a special character to match it literally
  • e.g. \. matches a literal .
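The effect of the backslash is easy to see in Python by running the same pattern with and without it. This is a sketch: the sample sentence and the "1+1=2" string are chosen for illustration. `re.escape` is the standard-library helper that adds the backslashes for you.

```python
import re

text = "The fat cat sat on the mat."

# Escaped: "\." matches only a literal period
escaped = re.findall(r"at\.", text)
# Unescaped: "." matches any character, including the spaces
unescaped = re.findall(r"at.", text)
print(escaped)    # ['at.']
print(unescaped)  # ['at ', 'at ', 'at ', 'at.']

# re.escape builds a pattern that matches its input literally
literal = "1+1=2"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None
print(escaped_ok)  # True
```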

# Anchors

^, $
keywords : string, start, end

  • Anchors check whether the match is at the start (^) or at the end ($) of the input string
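Both anchors can be sketched in Python with the `re` module. The inputs are "abc" and a sentence containing several "at." endings; the variable names are illustrative.

```python
import re

# ^ ties the match to the start of the string
caret_a = re.search(r"^a", "abc")
caret_b = re.search(r"^b", "abc")
print(caret_a.group(), caret_b)  # a None

text = "The fat cat. sat. on the mat."

# With $ only the "at." at the very end matches; without it all three do
at_end = re.findall(r"(at\.)$", text)
anywhere = re.findall(r"(at\.)", text)
print(at_end)    # ['at.']
print(anywhere)  # ['at.', 'at.', 'at.']
```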

# The Caret

  • ^a matches the a at the start of abc
  • ^b does not match anything in abc, because b is not at the start

# The Dollar Sign

  • (at\.)$ means: a lowercase at, followed by a . character, and the match must be at the end of the string

    "(at\.)$" => The fat cat. sat. on the mat.
    "(at\.)" => The fat cat. sat. on the mat.

Last Updated: 2022/1/10 3:55:08 PM