Regular Expressions (RegEx)
Regular expressions (RegEx) are like blueprints for searching and manipulating text. They allow you to find, replace, and validate data with precision. At its core, a regular expression is a sequence of characters and symbols that forms a search pattern.
RegEx support is built into many command-line tools (such as grep and sed) and into most programming languages.
Basic Syntax and Special Characters
. (dot): Matches any single character except a newline.
* (asterisk): Matches zero or more occurrences of the preceding character or pattern.
+ (plus): Matches one or more occurrences of the preceding character or pattern.
? (question mark): Matches zero or one occurrence of the preceding character or pattern.
^ (caret): Matches the beginning of a line.
$ (dollar sign): Matches the end of a line.
\d: Matches any digit (0-9).
\w: Matches any word character (letters, digits, or underscores).
\s: Matches any whitespace character (space, tab, or newline).
\b: Matches a word boundary (start or end of a word).
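The special characters above can be tried out in a few lines. This is a minimal sketch that assumes Python's built-in re module; the sample strings are invented for illustration. Note that in Python, ^ and $ match at every line boundary only when the re.MULTILINE flag is set.

```python
import re

text = "Order 66 shipped on 2024-05-01.\nTotal: 3 items"

# \d+ : one or more digits in a row
digits = re.findall(r"\d+", text)

# \b\w+\b : whole words made of letters, digits, or underscores
words = re.findall(r"\b\w+\b", "hello_world 42!")

# ^ anchors to the start of each line when re.MULTILINE is set
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# ? makes the preceding character optional
spellings = re.findall(r"colou?r", "color colour")

# . matches any single character except a newline
dot_matches = re.findall(r"c.t", "cat cot c\nt")

# \s+ : split on runs of whitespace (spaces, tabs, newlines)
fields = re.split(r"\s+", "a  b\tc")

print(digits, words, line_starts, spellings, dot_matches, fields)
```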
Grouping in RegEx
Grouping lets you combine multiple characters or patterns into a single unit. Three kinds of brackets are commonly used, each with a different role:
() (Parentheses): Capturing groups. Example: (abc)+ matches "abc", "abcabc", "abcabcabc", and so on.
[] (Square brackets): Character classes. Example: [aeiou] matches any vowel.
{} (Curly brackets): Quantifiers. Example: a{2,4} matches "aa", "aaa", or "aaaa".
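The three bracket types can be compared side by side. The following is a small sketch using Python's re module (an assumed choice of language); the input strings and variable names are made up for the example.

```python
import re

# () groups "abc" so that + repeats the whole sequence
repeated = re.fullmatch(r"(abc)+", "abcabc")

# [] matches any one character from the set
vowels = re.findall(r"[aeiou]", "regular")

# {} bounds the repeat count: between 2 and 4 "a" characters
runs = re.findall(r"a{2,4}", "a aa aaa aaaaa")

# Parentheses also capture the matched text for later use
match = re.search(r"(\d{4})-(\d{2})", "released 2024-05")
year, month = match.groups()

print(repeated is not None, vowels, runs, year, month)
```

Because quantifiers are greedy by default, the run of five "a" characters yields a four-character match, and the leftover single "a" is too short to match again.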
Other Useful Operators in RegEx
OR (|): Matches either the pattern on the left or the pattern on the right of |. Example: cat|dog matches "cat" or "dog".
Negation ([^]): Matches any character except those inside the brackets. Example: [^0-9] matches any non-digit character.
Escape Character (\): Used to escape special characters like . or *. Example: \. matches a literal period.
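To see these operators in action, here is a brief sketch in Python's re module (an assumption; any engine with similar syntax should behave comparably). The sample inputs are illustrative only.

```python
import re

# | matches either alternative
pets = re.findall(r"cat|dog", "cat, dog, bird")

# [^0-9] matches anything that is not a digit; here it is stripped out
digits_only = re.sub(r"[^0-9]", "", "(555) 123-4567")

# \. matches a literal period ...
literal_dots = re.findall(r"\.", "a.b.c")

# ... whereas an unescaped . matches any character
any_chars = re.findall(r".", "a.b")

print(pets, digits_only, literal_dots, any_chars)
```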
Common Use Cases
Validate an email address:
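One possible pattern is sketched below in Python. It is a deliberately simplified check (local part, @, domain, and a top-level domain of at least two letters) and does not cover every address the email standards allow. The names EMAIL_RE and is_valid_email are hypothetical.

```python
import re

# Simplified pattern: local part, "@", domain, dot, 2+ letter TLD.
# An illustrative assumption, not a full standards-compliant validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def is_valid_email(address: str) -> bool:
    """Return True if the whole string looks like an email address."""
    return EMAIL_RE.fullmatch(address) is not None


print(is_valid_email("user@example.com"))
print(is_valid_email("user@example"))
```

Using fullmatch rather than search ensures the entire string must fit the pattern, which is usually what validation needs.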
Extract phone numbers:
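Phone number formats vary widely, so any pattern depends on the format you expect. The sketch below assumes ten-digit North American style numbers such as 555-123-4567 or (555) 987 6543; PHONE_RE and the sample text are hypothetical.

```python
import re

# Optional parentheses around the area code, then 3 + 4 digits,
# with an optional space, dot, or hyphen between the parts.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

text = "Call 555-123-4567 or (555) 987 6543 before 5pm."
phone_numbers = PHONE_RE.findall(text)

print(phone_numbers)
```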
Find words that start with a capital letter:
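A word boundary followed by an uppercase letter is one way to express this. The sketch below uses Python and assumes ASCII capitals only ([A-Z]); the sample sentence is made up.

```python
import re

sentence = "Alice met Bob in Paris last May."

# \b marks the start of a word, [A-Z] requires a capital letter,
# and \w* consumes the rest of the word.
capitalized = re.findall(r"\b[A-Z]\w*", sentence)

print(capitalized)
```

Because \b only matches at the edge of a word, a capital in the middle of a word (as in "iPhone") does not start a match.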