What is a regex (regular expression)?
A regular expression, often abbreviated as "regex," is a powerful and versatile pattern-matching tool used in computer science and programming. It's a sequence of characters that defines a search pattern. This pattern can then be used to match, search, and manipulate text strings within larger texts or datasets.
Regular expressions are commonly used for tasks like:
Text Search and Extraction: Regex allows you to find specific strings or patterns within a larger text, which is useful for tasks like searching for keywords or extracting data.
Text Validation: They are used to validate input data, such as email addresses, phone numbers, or other structured formats, ensuring they meet specific criteria.
Text Manipulation: Regex can be used to replace or modify parts of a text based on a specific pattern.
Data Extraction: They can help in parsing structured data from unstructured text, like extracting information from log files or scraping web pages.
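The four tasks listed above can be sketched in a few lines of Python using the standard `re` module. This is only an illustration: the sample text, the `looks_like_email` helper, and the deliberately loose email pattern are assumptions made for this example, not a recommended validator.

```python
import re

# Hypothetical sample text used for all four tasks.
text = "Contact support@example.com or call 555-0142 before 2024-05-01."

# Search and extraction: find the first date-like token (YYYY-MM-DD).
date_match = re.search(r"\d{4}-\d{2}-\d{2}", text)
found_date = date_match.group(0) if date_match else None

# Validation: check that the WHOLE string has a simple email-like shape.
# The pattern is intentionally loose; real email validation is stricter.
def looks_like_email(value: str) -> bool:
    return re.fullmatch(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", value) is not None

# Manipulation: replace every run of digits with "#".
masked = re.sub(r"\d+", "#", text)

# Data extraction: collect all email-like substrings from free text.
emails = re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)

print(found_date)
print(looks_like_email("support@example.com"))
print(masked)
print(emails)
```

Note the difference between `re.search` (finds the pattern anywhere in the text) and `re.fullmatch` (requires the entire string to fit the pattern), which is why validation uses the latter.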
Most commonly used patterns
Regex patterns consist of a combination of literal characters (like letters and digits) and special characters (like *, ?, ., etc.) that define rules for matching certain strings.
Examples:
^ : Matches the start of a line.
$ : Matches the end of a line.
. : Matches any single character.
* : Matches zero or more occurrences of the previous character or group.
+ : Matches one or more occurrences of the previous character or group.
? : Matches zero or one occurrence of the previous character or group.
[...] : Matches any one character from within the brackets.
\ : Escapes a special character to treat it as a literal character.
\w : Matches a word character (a letter, a digit, or an underscore).
[a-zA-Z] : Matches letters only (lowercase and uppercase).
\d : Matches digits only.
[a-z] : Matches lowercase letters only.
[A-Z] : Matches uppercase letters only.
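To see how these building blocks behave, the short Python sketch below runs each one against a small sample string and records whether a match is found. The sample strings are made up for illustration, and the behaviour shown is that of Python's `re` module; other regex flavors may differ in details.

```python
import re

# Each entry: (pattern, sample text, whether a match is expected).
samples = [
    (r"^cat", "catalog", True),      # ^ anchors the match at the start
    (r"^cat", "bobcat", False),
    (r"cat$", "bobcat", True),       # $ anchors the match at the end
    (r"c.t", "cut", True),           # . stands for any single character
    (r"ab*c", "ac", True),           # * allows zero occurrences of "b"
    (r"ab+c", "ac", False),          # + requires at least one "b"
    (r"colou?r", "color", True),     # ? makes the "u" optional
    (r"[aeiou]", "sky", False),      # no character from the set is present
    (r"1\.5", "125", False),         # an escaped dot matches only a literal "."
    (r"^\w+$", "user_42", True),     # \w covers letters, digits, and underscore
    (r"^\d+$", "2024", True),        # \d covers digits
    (r"^[a-z]+$", "Regex", False),   # uppercase "R" is outside a-z
    (r"^[a-zA-Z]+$", "Regex", True),
]

for pattern, sample, expected in samples:
    found = re.search(pattern, sample) is not None
    print(f"{pattern!r:16} on {sample!r:10} -> {found}")
```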
Regex Support
Regex is supported in many programming languages like Python, JavaScript, Java, and others, as well as in various text editors and tools.
Tools (checkers) and useful links for an introduction to regex
Tools (checkers)
https://regexr.com/
https://regex101.com/
Useful links for an introduction to regex
https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference
https://docs.pexip.com/admin/regex_reference.htm
https://www.rexegg.com/regex-quickstart.html
How to test and use the tools (checkers)
Regex in URLs
Match the full URL:
Let’s use the example of the URL https://monsido.com/
https:\/\/monsido\.com\/
https:\/\/ matches the "https://" part of the URL. A slash is not a regex metacharacter, but many regex tools use it as a pattern delimiter, so it is escaped with a backslash.
monsido\.com matches the domain name "monsido.com". The dot (.) is a special character in regex that matches any character, so it needs to be escaped to match a literal dot.
\/ matches the trailing slash after the domain name.
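The effect of escaping the dot can be checked with a small Python sketch. The look-alike URL is a made-up example, and Python's `re` module is used here only as a stand-in for whichever regex engine the tool uses.

```python
import re

escaped = re.compile(r"https:\/\/monsido\.com\/")
unescaped_dot = re.compile(r"https:\/\/monsido.com\/")  # dot left unescaped

real_url = "https://monsido.com/"
lookalike = "https://monsidoXcom/"  # hypothetical URL, for illustration only

# The escaped pattern accepts only the literal dot.
escaped_hits = [bool(escaped.search(u)) for u in (real_url, lookalike)]

# The unescaped dot matches any character, so the look-alike slips through.
unescaped_hits = [bool(unescaped_dot.search(u)) for u in (real_url, lookalike)]

print(escaped_hits)    # [True, False]
print(unescaped_hits)  # [True, True]
```

In Python itself the slashes do not need a backslash, but escaping them is harmless and keeps the pattern portable to tools that treat `/` as a delimiter.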
In Monsido, when using path constraints and/or excludes, the user can also put § before the full URL:
§https:\/\/monsido\.com\/
To match the end of the full URL string:
§https:\/\/monsido\.com\/$
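The difference the trailing `$` makes can be illustrated with the following Python sketch. The § prefix is specific to Monsido and is left out here, since it is not part of standard regex syntax; the second URL is taken from the path example in this article.

```python
import re

open_ended = re.compile(r"https:\/\/monsido\.com\/")   # no end anchor
anchored = re.compile(r"https:\/\/monsido\.com\/$")    # must end after the slash

urls = [
    "https://monsido.com/",
    "https://monsido.com/platform/web-accessibility",
]

# Without "$", any URL that contains the pattern matches.
open_matches = [u for u in urls if open_ended.search(u)]

# With "$", only URLs that end right after the domain's slash match.
anchored_matches = [u for u in urls if anchored.search(u)]

print(open_matches)
print(anchored_matches)
```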
Match the full URL path:
Let’s use the example of the URL https://monsido.com/platform/web-accessibility
The match should be set on /web-accessibility and not the path before it:
\/web-accessibility
The match should be set on /platform and everything after:
\/platform\/.*
This ensures that the match occurs starting from "/platform" and includes everything that comes after it in the URL path.
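As a rough check of the `\/platform\/.*` pattern, the Python sketch below applies it to a handful of URLs and records which part of each URL is matched. Only the first URL comes from the text; the others are hypothetical and included to show what does and does not match.

```python
import re

platform_and_below = re.compile(r"\/platform\/.*")

urls = [
    "https://monsido.com/platform/web-accessibility",  # from the text
    "https://monsido.com/platform/",                    # hypothetical
    "https://monsido.com/platform",                     # hypothetical, no trailing slash
    "https://monsido.com/pricing",                      # hypothetical
]

# Map each URL to the matched portion, or None when there is no match.
matched_parts = {}
for url in urls:
    m = platform_and_below.search(url)
    matched_parts[url] = m.group(0) if m else None

for url, part in matched_parts.items():
    print(url, "->", part)
```

Because the pattern requires a slash after "platform", a URL that ends in "/platform" with no trailing slash is not matched.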
Match the URL path with special characters:
Let’s use the example of the URL: https://dagsordner.vordingborg.dk/vis?id=03616c35-c876-4b7f-9125-276812cec2f2
The match should be set on /vis and everything after:
\/vis.*
The match should be set on ?id= and everything after. The question mark is a special character in regex, so it needs to be escaped:
\?id=.*
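The match should be set on /vis at the end of the string. This matches only URLs that end in /vis, so it does not match the example URL above, which continues with a query string:
\/vis$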
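The question mark is the special character that matters most in URLs like this one. The Python sketch below, using the example URL from this section, shows what happens when the `?` is escaped and when it is not. Python's `re` module is assumed here as the engine; the variable names are illustrative.

```python
import re

url = "https://dagsordner.vordingborg.dk/vis?id=03616c35-c876-4b7f-9125-276812cec2f2"

# "/vis" followed by everything after it.
vis_and_after = re.search(r"\/vis.*", url)

# Unescaped "?" makes the preceding "s" optional, so this pattern looks for
# "/viid=" or "/visid=" and finds neither in the URL.
unescaped = re.search(r"\/vis?id=", url)

# Escaped "\?" matches the literal question mark of the query string.
escaped = re.search(r"\/vis\?id=", url)

print(vis_and_after.group(0))
print(unescaped)
print(escaped.group(0))
```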