# Advance

# Shorthand Character Sets

Shorthand Description
. Any character except new line
\w Matches alphanumeric characters: [a-zA-Z0-9_]
\W Matches non-alphanumeric characters: [^\w]
\d Matches digits: [0-9]
\D Matches non-digits: [^\d]
\s Matches whitespace characters: [\t\n\f\r\p{Z}]
\S Matches non-whitespace characters: [^\s]

# Lookarounds

keywords : Lookbehinds, lookaheads, 断言

Symbol Description
?= Positive Lookahead
?! Negative Lookahead
?<= Positive Lookbehind
?<! Negative Lookbehind
  • specific types of non-capturing groups (used to match a pattern but without including it in the matching list)
  • Lookarounds are used when a pattern must be preceded or followed by another pattern

# Positive Lookahead

?=
keywords: match后面 有 的 ?=xx

  • The positive lookahead asserts that the first part of the expression must be followed by the lookahead expression
  • (T|t)he(?=\sfat) means: match either a lowercase t or an uppercase T, followed by the letter he, to match The or the only if it's followed by the word fat ( \s = whitespace )

    "(T|t)he(?=\sfat)" => The fat cat sat on the mat.

# Negative Lookahead

?!
keywords: match后面 没有 的 ?!xx

  • Negative lookaheads are used when we need to get all matches from an input string that are not followed by a certain pattern

    "(T|t)he(?!\sfat)" => The fat cat sat on the mat.

# Positive Lookbehind

?<=
keywords: match 前面 有的 (?<=condition)xx

  • Positive lookbehinds are used to get all the matches that are preceded by a specific pattern
  • (?<=(T|t)he\s)(fat|mat) means: get all fat or mat words from the input string that come after the word The or the

    "(?<=(T|t)he\s)(fat|mat)" => The fat cat sat on the mat.

# Negative Lookbehind

?<!
keywords: match 前面 没有的 xx?<!condition

  • Negative lookbehinds are used to get all the matches that are not preceded by a specific pattern
  • (?<!(T|t)he\s)(cat) means: get all cat words from the input string that are not after the word The or the

    "(?<!(T|t)he\s)(cat)" => The cat sat on cat.

# Flags

keywords: modifiers , RegExp

Flag Description
i Case insensitive: Match will be case-insensitive
g Global Search: Match all instances, not just the first
m Multiline: Anchor meta characters work on each line

# Case Insensitive

/i

"The" => The fat cat sat on the mat.
"/The/gi" => The fat cat sat on the mat.

/g

  • /.(at)/g means: any character except a new line, followed by a lowercase at. it will now find all matches in the input string, not just the first one (which is the default behavior)

    "/.(at)/" => The fat cat sat on the mat.
    "/.(at)/g" => The fat cat sat on the mat.

# Multiline

/m

  • To perform a multi-line match

"/.at(.)?$/gm" =>
The fat
cat sat
on the mat.

  • $

"/.at(.)?$/" =>
The fat
cat sat
on the mat.

# Greedy vs Lazy Matching

?

  • By default, a regex will perform a greedy match, which means the match will be as long as possible

"/(.*at)/" => The fat cat sat on the mat.

  • use ? to match in a lazy way, which means the match should be as short as possible

"/(.*?at)/" => The fat cat sat on the mat.

Last Updated: 2022/1/10 下午3:55:08