Regular Expressions

Regular expressions (or regex) are advanced search strings that allow you to match complex patterns.

In this documentation, the regular expression elements are classified by category.

All the examples listed are applied to the following two lines:

Comment from happy_user@company.com (04-Apr-2016):

I love working with Talend Data Preparation! It really helps me with all my daily tasks!

Regular Expressions Examples

Regular Expression	Matches
`\bTa`	The "Ta" in "Talend"
`\bw\w*`	working, with
`\w+n\b`	Preparation
`Talend\s\w+\s\w+`	Talend Data Preparation
`task(s?)`	tasks (it would also match "task")
`\w+@\w+\.com`	happy_user@company.com
`\d{2}-.*-\d+`	04-Apr-2016
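
The patterns in this table can be checked with a short script. The following is a minimal sketch in Python's `re` module, not the tool's own engine, so small dialect differences are possible. The helper name `find_all` is hypothetical, and the sample text is assumed to contain "Talend Data Preparation", as the examples in the table imply.

```python
import re

# Sample text; the wording "Talend Data Preparation" is an assumption
# taken from the matches listed in the table.
SAMPLE = (
    "Comment from happy_user@company.com (04-Apr-2016):\n"
    "I love working with Talend Data Preparation! "
    "It really helps me with all my daily tasks!"
)


def find_all(pattern: str, text: str = SAMPLE) -> list:
    """Return the full text of every non-overlapping match of `pattern`."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# Print each pattern from the table next to what it matches in SAMPLE.
for pattern in (
    r"\bTa",
    r"\bw\w*",
    r"\w+n\b",
    r"Talend\s\w+\s\w+",
    r"task(s?)",
    r"\w+@\w+\.com",
    r"\d{2}-.*-\d+",
):
    print(f"{pattern:<20} {find_all(pattern)}")
```

Note that `\bw\w*` is reported three times, because "with" occurs twice in the second line.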

Anchors

Character	Matches	Example
`^`	Start of string, or start of line in a multi-line pattern	`^Comment` matches "Comment" at the beginning of the line. `^C.*` matches the first line.
`$`	End of string, or end of line in a multi-line pattern	`!$` matches the last exclamation mark.
`\b`	Word boundary	`\bwo` matches the "wo" in "working". `\bwo\w+` matches "working". `ng\b` matches the "ng" in "working". `\w+ng\b` matches "working".
`\B`	Not word boundary	`\Bh` matches the final "h" in "with" but not the "h" in "helps" or "happy". `h\B` matches the first "h" in "helps" and "happy" but not the final one in "with".
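
The anchor behavior described above can be sketched as follows. This is an illustration in Python's `re` module, where multi-line mode is switched on with the `re.MULTILINE` flag. The variable names are hypothetical, and the sample text is assumed to read "Talend Data Preparation".

```python
import re

# Sample text; "Talend Data Preparation" is an assumed wording.
SAMPLE = (
    "Comment from happy_user@company.com (04-Apr-2016):\n"
    "I love working with Talend Data Preparation! "
    "It really helps me with all my daily tasks!"
)

# ^ anchors at the start of the string; "." stops at the line break,
# so ^C.* covers the first line only.
first_line = re.search(r"^C.*", SAMPLE).group(0)

# In multi-line mode, ^ also anchors at the start of every line.
line_starts = re.findall(r"^\w+", SAMPLE, flags=re.MULTILINE)

# $ anchors at the end of the string, so !$ finds the last exclamation mark.
last_bang = re.search(r"!$", SAMPLE)

# \b requires a word boundary, \B forbids one.
starts_with_wo = re.findall(r"\bwo\w+", SAMPLE)
ends_with_h = re.findall(r"\w*\Bh", SAMPLE)    # "h" preceded by a word character
starts_with_h = re.findall(r"h\B\w*", SAMPLE)  # "h" followed by a word character

print(first_line)
print(line_starts, last_bang.start(), starts_with_wo)
print(ends_with_h, starts_with_h)
```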

Character Classes

Character	Matches	Example
`.`	Any character, except new line (\n)	`.` matches all the characters in the text, except for the line break.
`\s`	White space	`Talend\sData` matches "Talend Data". `Data\s+Preparation` matches "Data Preparation".
`\S`	Not white space	`\S` matches all the characters in the sentence, except for the spaces.
`\d`	Digit	`\d{4}` matches "2016".
`\D`	Not digit	`\D` matches all the characters in the text, except for the digits.
`\w`	Word character (letter, digit, or underscore)	`T\w+` matches "Talend".
`\W`	Not word	`company\Wcom` matches "company.com".
`\n`	New line	`.*\n.*` matches the whole text.
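
As a rough check of the character classes above, the following sketch uses Python's `re` module, chosen here only for illustration. The variable names are hypothetical, and the sample text is assumed to read "Talend Data Preparation".

```python
import re

# Sample text; "Talend Data Preparation" is an assumed wording.
SAMPLE = (
    "Comment from happy_user@company.com (04-Apr-2016):\n"
    "I love working with Talend Data Preparation! "
    "It really helps me with all my daily tasks!"
)

# "." matches every character except the single line break in SAMPLE.
dot_matches = re.findall(r".", SAMPLE)

# Putting \n between two ".*" runs lets one match span both lines.
whole_text = re.search(r".*\n.*", SAMPLE).group(0)

year = re.findall(r"\d{4}", SAMPLE)             # four consecutive digits
domain = re.findall(r"company\Wcom", SAMPLE)    # \W matches the "."
product = re.search(r"Data\s+Preparation", SAMPLE).group(0)
capital_t_word = re.findall(r"T\w+", SAMPLE)

# \D and \S keep everything except digits and white space, respectively.
non_digits = "".join(re.findall(r"\D", SAMPLE))
non_spaces = "".join(re.findall(r"\S", SAMPLE))

print(len(dot_matches), len(SAMPLE))
print(year, domain, product, capital_t_word)
```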

Escape Characters

Character	Matches
`\.`	.
`\\`	\
`\+`	+
`\*`	*
`\?`	?
`\$`	$
`\[`	[
`\]`	]
`\{`	{
`\}`	}
`\(`	(
`\)`	)
`\|`	|
`\/`	/
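
The following sketch shows why escaping matters, using Python's `re` module as an illustration. The input strings and variable names are made up for the example. `re.escape` is a Python helper that escapes a literal string for you; other regex engines may not offer an equivalent.

```python
import re

# Unescaped, "." is a wildcard, so this pattern also accepts "companyXcom".
loose = re.search(r"company.com", "companyXcom")

# Escaped, "\." only matches a literal dot.
strict = re.search(r"company\.com", "companyXcom")
literal_dot = re.search(r"company\.com", "happy_user@company.com")

# Parentheses must be escaped to match the literal "(" and ")" around a date.
date_in_parens = re.findall(r"\(\d{2}-\w+-\d{4}\)", "(04-Apr-2016):")

# re.escape builds an escaped pattern from arbitrary literal text.
literal = "1+1=2? (yes)"
escaped = re.escape(literal)

print(loose is not None, strict is None, literal_dot.group(0))
print(date_in_parens)
print(re.fullmatch(escaped, literal) is not None)
```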

Groups and Ranges

Character	Matches	Example
`()`	Group	`m(e|y)` matches "me" (in "Comment"), "me" and "my".
`(a|b)`	a or b	`m(e|y)` matches "me" (in "Comment"), "me" and "my".
`[abc]`	Set (a or b or c)	`m[ey]` matches "me" (in "Comment"), "me" and "my".
`[a-q]`	Letter from a to q	`m[a-m]` matches "mm" (in "Comment") and "me" but not "my".
`[0-7]`	Digit from 0 to 7	`201[0-5]` does not match "2016" but would match all years between "2010" and "2015".
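
The alternation, set, and range examples above can be compared side by side. This is a minimal sketch in Python's `re` module. The helper name `find_all` is hypothetical, and the sample text is assumed to read "Talend Data Preparation". Matches are reported left to right without overlapping, which is why `m[a-m]` reports "mm" rather than "me" inside "Comment".

```python
import re

# Sample text; "Talend Data Preparation" is an assumed wording.
SAMPLE = (
    "Comment from happy_user@company.com (04-Apr-2016):\n"
    "I love working with Talend Data Preparation! "
    "It really helps me with all my daily tasks!"
)


def find_all(pattern: str, text: str = SAMPLE) -> list:
    """Return the full text of every non-overlapping match of `pattern`."""
    return [m.group(0) for m in re.finditer(pattern, text)]


alternation = find_all(r"m(e|y)")   # group with two alternatives
char_set = find_all(r"m[ey]")       # same result with a character set
letter_range = find_all(r"m[a-m]")  # "m" followed by a letter from a to m

# A digit range restricts which years are accepted.
years = [y for y in ("2009", "2010", "2015", "2016")
         if re.fullmatch(r"201[0-5]", y)]

print(alternation, char_set, letter_range, years)
```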

The text captured by a group can be reused with the $ symbol. When more than one group is captured, add a number to the $ symbol that corresponds to the order in which the groups were captured: $1 for the first group, $2 for the second, and so on.

For example, suppose you want to reformat the value `Y16Q02`, which is matched by the regular expression `Y(\d{2})Q(\d{2})`. You can build the new value from only the characters you have captured. To obtain `Quarter 02 of year 2016`, use the replacement expression `Quarter $2 of year 20$1`.
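
The same reformatting can be sketched in Python. Python's `re.sub` does not use the `$1` notation: captured groups are referenced as `\g<1>` (or `\1`) in the replacement string, so the replacement below is a translation of the one above, not the literal syntax. The variable names are hypothetical.

```python
import re

pattern = r"Y(\d{2})Q(\d{2})"

# Group 1 captures the two-digit year, group 2 the two-digit quarter.
captured = re.fullmatch(pattern, "Y16Q02").groups()

# \g<2> and \g<1> play the role of $2 and $1 in the replacement expression.
replacement = r"Quarter \g<2> of year 20\g<1>"
result = re.sub(pattern, replacement, "Y16Q02")

print(captured)
print(result)
```

The `\g<1>` form is used instead of `\1` because it stays unambiguous when a group reference is followed by digits.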

Quantifiers

Character	Matches	Examples
`*`	0 or more	`work\w*` matches "working" but also "work" and "works".
`+`	1 or more	`work\w+` matches "working" but also "works". However, it does not match "work".
`?`	0 or 1	`work(s?)` matches "work" and "works". In "working", it matches only the "work" part.
`{3}`	Exactly 3	`20\d{2}` matches "2016" and other numbers between "2000" and "2099".
`{3,}`	3 or more	`20\d{2,}` matches "2016" and any number that starts with "20" and is greater than or equal to "2000".
`{3,5}`	3, 4 or 5	`20\d{1,2}` matches "2016" and any number made of "20" followed by one or two digits, from "200" to "2099".
`[0-7]`	Digit from 0 to 7	`201[0-9]` matches "2016" and all numbers from "2010" to "2019".
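
The quantifier examples above can be tried on a few test words. This sketch uses Python's `re` module. The input strings, the helper name `find_all`, and the `accepted` helper are made up for the illustration.

```python
import re

WORDS = "work works working"


def find_all(pattern: str, text: str = WORDS) -> list:
    """Return the full text of every non-overlapping match of `pattern`."""
    return [m.group(0) for m in re.finditer(pattern, text)]


zero_or_more = find_all(r"work\w*")     # the suffix is optional
one_or_more = find_all(r"work\w+")      # at least one extra character
optional_s = find_all(r"work(s?)")      # also hits the "work" in "working"
whole_words = find_all(r"\bwork(s?)\b") # \b rules out "working"


def accepted(pattern: str, candidates: list) -> list:
    """Keep the candidates that `pattern` matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


numbers = ["20", "200", "2016", "2099", "20160"]
exactly_two = accepted(r"20\d{2}", numbers)
two_or_more = accepted(r"20\d{2,}", numbers)
one_or_two = accepted(r"20\d{1,2}", numbers)

print(zero_or_more, one_or_more, optional_s, whole_words)
print(exactly_two, two_or_more, one_or_two)
```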
