
Ruby Regular Expressions

A regular expression is a special sequence of characters that matches or searches for a set of strings using a pattern with a specific syntax.

Regular expressions are built from predefined special characters and combinations of them, forming a "rule string" that expresses a filtering logic for strings.

Syntax

A regular expression literal is a pattern between slashes, or between arbitrary delimiters following %r, as shown below:

/pattern/
/pattern/im          # Options can be specified
%r!/usr/local!       # Use ! as the delimiter so the slashes need no escaping

Example

#!/usr/bin/ruby

line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"

# =~ returns the match position, or nil when the pattern is not found
if line1 =~ /Cats(.*)/
  puts "Line1 contains Cats"
end
if line2 =~ /Cats(.*)/
  puts "Line2 contains Cats"
end

The output of the above example is:

Line1 contains Cats
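
To make the role of =~ and of the parenthesized group more concrete, here is a minimal sketch that reuses line1 from the example above. The variable names index, no_match and rest are illustrative only and do not come from the original example.

```ruby
line1 = "Cats are smarter than dogs"

# =~ returns the index where the match starts, or nil when nothing matches.
index    = line1 =~ /Cats(.*)/
no_match = line1 =~ /Birds/

# After a successful match, the text captured by (.*) is available as $1.
line1 =~ /Cats(.*)/
rest = $1

puts index.inspect      # 0
puts no_match.inspect   # nil
puts rest.inspect       # " are smarter than dogs"
```

Because nil is falsy and every integer (including 0) is truthy in Ruby, the result of =~ can be used directly in an if condition.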

Regular Expression Modifiers

A regular expression literal may carry optional modifiers that control various aspects of matching. Modifiers are written after the closing slash, as shown in the Syntax section above. The following table lists the possible modifiers:

Modifier      Description
i             Ignore case when matching text.
o             Perform #{} interpolation only once, the first time the regular expression literal is evaluated.
x             Ignore whitespace in the pattern and allow comments to be placed throughout the expression.
m             Match across multiple lines, treating newline characters as normal characters.
u, e, s, n    Interpret the regular expression as Unicode (UTF-8), EUC, SJIS, or ASCII. If no modifier is specified, the regular expression is assumed to use the source encoding.
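
The effect of the i, m and x modifiers can be sketched as follows. The sample strings and variable names are assumptions chosen for illustration.

```ruby
# i: ignore case
case_insensitive = "RUBY" =~ /ruby/i      # 0

# m: let . match a newline as well
without_m = "a\nb" =~ /a.b/               # nil
with_m    = "a\nb" =~ /a.b/m              # 0

# x: whitespace and comments inside the pattern are ignored
phone_pattern = /
  \d{3}   # first group of digits
  -       # literal hyphen
  \d{4}   # second group of digits
/x
extended = "138-3453" =~ phone_pattern    # 0

puts case_insensitive.inspect, without_m.inspect, with_m.inspect, extended.inspect
```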

Just as string literals can be delimited with %Q, Ruby lets you start a regular expression with %r followed by a delimiter of your choice. This is useful when the pattern contains many slashes that you do not want to escape.

# Matches a single forward slash; no escaping is needed
%r|/|

# Modifiers may follow the closing delimiter
%r[</(.*)>]i
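
As a small sketch of why %r helps, the two patterns below describe the same text; only the amount of escaping differs. The path string is an assumed example.

```ruby
path = "/usr/local/bin"

# With slash delimiters, every / in the pattern must be escaped.
escaped = /\/usr\/local/

# With %r and another delimiter, the slashes can be written as they are.
unescaped = %r{/usr/local}

puts path.match?(escaped)     # true
puts path.match?(unescaped)   # true
```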

Regular Expression Patterns

Except for the control characters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a control character by placing a backslash before it.

The table below lists the regular expression syntax available in Ruby.

Pattern        Description
^              Match the beginning of a line.
$              Match the end of a line.
.              Match any single character except a newline. With the m option, it also matches a newline.
[...]          Match any single character listed within the square brackets.
[^...]         Match any single character not listed within the square brackets.
re*            Match the preceding expression zero or more times.
re+            Match the preceding expression one or more times.
re?            Match the preceding expression zero times or once.
re{n}          Match the preceding expression exactly n times.
re{n,}         Match the preceding expression n or more times.
re{n,m}        Match the preceding expression at least n times and at most m times.
a|b            Match either a or b.
(re)           Group the regular expression and remember the matched text.
(?imx)         Temporarily enable the i, m, or x options within the regular expression. If within parentheses, only the part inside the parentheses is affected.
(?-imx)        Temporarily disable the i, m, or x options within the regular expression. If within parentheses, only the part inside the parentheses is affected.
(?:re)         Group the regular expression without remembering the matched text.
(?imx:re)      Temporarily enable the i, m, or x options within the parentheses.
(?-imx:re)     Temporarily disable the i, m, or x options within the parentheses.
(?#...)        Comment.
(?=re)         Positive lookahead: require that re matches at this position, without consuming any characters.
(?!re)         Negative lookahead: require that re does not match at this position, without consuming any characters.
(?>re)         Match an independent pattern without backtracking.
\w             Match a word character.
\W             Match a non-word character.
\s             Match a whitespace character. Equivalent to [ \t\r\n\f\v].
\S             Match a non-whitespace character.
\d             Match a digit. Equivalent to [0-9].
\D             Match a non-digit.
\A             Match the beginning of the string.
\Z             Match the end of the string. If the string ends with a newline, match just before that newline.
\z             Match the end of the string.
\G             Match the position where the previous match ended.
\b             Outside square brackets, match a word boundary; inside square brackets, match a backspace (0x08).
\B             Match a non-word boundary.
\n, \t, etc.   Match newline, tab, carriage return, etc.
\1...\9        Match the text matched by the nth group.
\10            Match the text matched by the nth group if that group has already matched; otherwise, interpret the number as the octal code of a character.
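
The difference between the line anchors (^ and $) and the string anchors (\A, \Z and \z) is easy to miss, so here is a minimal sketch with an assumed two-line string:

```ruby
text = "first\nRuby\n"

line_start   = text =~ /^Ruby/     # 6: ^ also matches right after a newline
string_start = text =~ /\ARuby/    # nil: the string itself starts with "first"

before_final_newline = text =~ /Ruby\Z/   # 6: \Z tolerates one trailing newline
very_end             = text =~ /Ruby\z/   # nil: the string ends with "\n", not "Ruby"

puts line_start.inspect, string_start.inspect
puts before_final_newline.inspect, very_end.inspect
```

When validating a whole string, \A and \z are usually the safer choice, since ^ and $ accept a match on any single line.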

Regular Expression Examples

Literal characters

Example    Description
/ruby/     Match "ruby"
/¥/        Match the Yen sign. Multibyte characters are supported in Ruby 1.9 and Ruby 1.8.

Character class

Example          Description
/[Rr]uby/        Match "Ruby" or "ruby"
/rub[ye]/        Match "ruby" or "rube"
/[aeiou]/        Match any lowercase vowel
/[0-9]/          Match any digit; same as /[0123456789]/
/[a-z]/          Match any lowercase ASCII letter
/[A-Z]/          Match any uppercase ASCII letter
/[a-zA-Z0-9]/    Match any ASCII letter or digit
/[^aeiou]/       Match any character that is not a lowercase vowel
/[^0-9]/         Match any non-digit character
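
A short sketch of character classes in use, assuming a small word list that is not part of the original text:

```ruby
words = %w[Ruby ruby rube rubx]

# [Rr] accepts either case for the first letter, [ye] either ending.
matches = words.grep(/[Rr]ub[ye]/)

# A leading ^ inside the brackets negates the class.
consonants = "regexp".scan(/[^aeiou]/)

puts matches.inspect      # ["Ruby", "ruby", "rube"]
puts consonants.inspect   # ["r", "g", "x", "p"]
```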

Special character class

Example    Description
/./        Match any character except a newline
/./m       In multiline mode, also match a newline
/\d/       Match a digit; equivalent to /[0-9]/
/\D/       Match a non-digit; equivalent to /[^0-9]/
/\s/       Match a whitespace character; equivalent to /[ \t\r\n\f\v]/
/\S/       Match a non-whitespace character; equivalent to /[^ \t\r\n\f\v]/
/\w/       Match a word character; equivalent to /[A-Za-z0-9_]/
/\W/       Match a non-word character; equivalent to /[^A-Za-z0-9_]/
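
The shorthand classes can be tried out on a sample string. The string below is an assumption made for illustration:

```ruby
sample = "Room 42, floor_3"

digits = sample.scan(/\d/)     # every single digit
words  = sample.scan(/\w+/)    # runs of letters, digits and underscores
parts  = sample.split(/\s+/)   # split on runs of whitespace

puts digits.inspect   # ["4", "2", "3"]
puts words.inspect    # ["Room", "42", "floor_3"]
puts parts.inspect    # ["Room", "42,", "floor_3"]
```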

Repetition

Example      Description
/ruby?/      Match "rub" or "ruby"; the y is optional
/ruby*/      Match "rub" followed by zero or more y
/ruby+/      Match "rub" followed by one or more y
/\d{3}/      Match exactly 3 digits
/\d{3,}/     Match 3 or more digits
/\d{3,5}/    Match 3, 4, or 5 digits
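
The quantifiers can be compared side by side. In this sketch the patterns are anchored with \A and \z (and \b for the digits) so that only whole words count; the candidate strings are assumed examples.

```ruby
candidates = %w[rub ruby rubyyy]

optional = candidates.grep(/\Aruby?\z/)   # y may appear zero times or once
one_plus = candidates.grep(/\Aruby+\z/)   # y must appear at least once

# Only whole numbers with 3 to 5 digits are accepted.
three_to_five = "12 123 123456".scan(/\b\d{3,5}\b/)

puts optional.inspect        # ["rub", "ruby"]
puts one_plus.inspect        # ["ruby", "rubyyy"]
puts three_to_five.inspect   # ["123"]
```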

Non-greedy repetition

A non-greedy quantifier matches the smallest number of repetitions that still lets the overall pattern match.

Example     Description
/<.*>/      Greedy repetition: matches "<ruby>perl>"
/<.*?>/     Non-greedy repetition: matches "<ruby>" in "<ruby>perl>"
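
The two rows above can be checked directly with String#[], which returns the matched portion of the string. A minimal sketch:

```ruby
html = "<ruby>perl>"

greedy = html[/<.*>/]    # .* grabs as much as possible
lazy   = html[/<.*?>/]   # .*? stops at the first possible >

puts greedy   # <ruby>perl>
puts lazy     # <ruby>
```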

Grouping by parentheses

Example                Description
/\D\d+/                No grouping: + repeats only \d
/(\D\d)+/              Grouping: + repeats the \D\d pair
/([Rr]uby(, )?)+/      Match "Ruby", "Ruby, ruby, ruby", and so on
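
The effect of grouping on what + repeats can be seen by applying both patterns to the same input. The input strings are assumptions for illustration.

```ruby
input = "a1b2c3"

no_group = input[/\D\d+/]      # + applies to \d only
grouped  = input[/(\D\d)+/]    # + applies to the whole \D\d pair

list = "Ruby, ruby, ruby"[/([Rr]uby(, )?)+/]

puts no_group   # a1
puts grouped    # a1b2c3
puts list       # Ruby, ruby, ruby
```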

Backreference

A backreference matches the same text that an earlier group matched.

Example                     Description
/([Rr])uby&\1ails/          Match ruby&rails or Ruby&Rails
/(['"])(?:(?!\1).)*\1/      Match a single-quoted or double-quoted string. \1 matches whatever the first group matched, \2 matches whatever the second group matched, and so on.
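
Both backreference patterns can be exercised as follows. This is a sketch with assumed input strings; the names same_case and quoted are illustrative.

```ruby
same_case = /([Rr])uby&\1ails/

lower = "ruby&rails".match?(same_case)   # \1 must repeat the lowercase r
mixed = "Ruby&rails".match?(same_case)   # R was captured, so "rails" is rejected

quoted = /(['"])(?:(?!\1).)*\1/

double     = %q(say "hi" now)[quoted]    # opening and closing quote agree
mismatched = %q('oops")[quoted]          # ' is never closed by another '

puts lower, mixed                # true, false
puts double                      # "hi"
puts mismatched.inspect          # nil
```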

Alternation

Example            Description
/ruby|rube/        Match "ruby" or "rube"
/rub(y|le)/        Match "ruby" or "ruble"
/ruby(!+|\?)/      Match "ruby" followed by one or more ! or by a single ?
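
A brief sketch of alternation inside a group, using assumed sample words; the \A and \z anchors are added here so that only whole words are accepted.

```ruby
endings = %w[ruby ruble rubber].grep(/\Arub(y|le)\z/)

excited   = "ruby!!!"[/ruby(!+|\?)/]
question  = "ruby?"[/ruby(!+|\?)/]
statement = "ruby."[/ruby(!+|\?)/]

puts endings.inspect     # ["ruby", "ruble"]
puts excited             # ruby!!!
puts question            # ruby?
puts statement.inspect   # nil
```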

Anchors

Anchors specify the position at which a match must occur.

Example          Description
/^Ruby/          Match "Ruby" at the start of a string or of a line
/Ruby$/          Match "Ruby" at the end of a string or of a line
/\ARuby/         Match "Ruby" at the start of a string
/Ruby\Z/         Match "Ruby" at the end of a string
/\bRuby\b/       Match "Ruby" at word boundaries
/\brub\B/        \B is a non-word boundary: matches "rub" in "rube" and "ruby", but does not match "rub" alone
/Ruby(?=!)/      Match "Ruby" only if it is followed by an exclamation mark
/Ruby(?!!)/      Match "Ruby" only if it is not followed by an exclamation mark
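
Word boundaries and lookaheads from the table above can be sketched like this; the sample strings are assumptions. Note that a lookahead checks what follows without including it in the match.

```ruby
whole_word = "Ruby Rubyist".scan(/\bRuby\b/)    # "Rubyist" is rejected
prefixes   = "rub rube ruby".scan(/\brub\B/)    # "rub" alone is rejected

followed     = "Ruby!"[/Ruby(?=!)/]    # the ! is required but not consumed
not_followed = "Ruby!" =~ /Ruby(?!!)/  # fails because ! follows
other_suffix = "Ruby?" =~ /Ruby(?!!)/  # succeeds because ? is not !

puts whole_word.inspect     # ["Ruby"]
puts prefixes.inspect       # ["rub", "rub"]
puts followed               # Ruby
puts not_followed.inspect   # nil
puts other_suffix.inspect   # 0
```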

Special syntax of parentheses

Example             Description
/R(?#comment)/      Match "R". The text inside (?#...) is a comment.
/R(?i)uby/          Match "R" followed by "uby" in any letter case
/R(?i:uby)/         The same as above
/rub(?:y|le)/       Grouping only; no \1 backreference is created
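
The special parenthesis forms can be compared in a short sketch. The comment text inside (?#...) and the sample strings are assumptions.

```ruby
# (?#...) is ignored by the matcher, so this behaves like /Ruby/.
with_comment = "Ruby" =~ /R(?#the rest is lowercase)uby/

# (?i:...) ignores case for "uby" only; the leading R stays case-sensitive.
mixed_case  = "RUBY" =~ /R(?i:uby)/
wrong_first = "rUBY" =~ /R(?i:uby)/

# (?:...) groups without capturing, so there is nothing for \1 to refer to.
capturing     = "ruble".match(/rub(y|le)/).captures
non_capturing = "ruble".match(/rub(?:y|le)/).captures

puts with_comment.inspect    # 0
puts mixed_case.inspect      # 0
puts wrong_first.inspect     # nil
puts capturing.inspect       # ["le"]
puts non_capturing.inspect   # []
```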

Search and replace

sub and gsub, together with their in-place variants sub! and gsub!, are important string methods when working with regular expressions.

All of these methods perform search and replace using a regular expression pattern. sub and sub! replace the first occurrence of the pattern; gsub and gsub! replace all occurrences.

sub and gsub return a new string and leave the original unmodified. sub! and gsub! modify the string on which they are called.
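
The difference between the four methods can be sketched as follows. The strings and variable names are illustrative assumptions. One detail worth knowing is that the bang variants return nil when nothing was replaced, so their return value should not be assigned blindly.

```ruby
original = "ruby ruby ruby"

first_only = original.sub(/ruby/, "Ruby")    # replaces the first match only
all        = original.gsub(/ruby/, "Ruby")   # replaces every match
# original is still "ruby ruby ruby" at this point.

copy      = original.dup
result    = copy.gsub!(/ruby/, "Ruby")       # modifies copy in place
no_change = copy.gsub!(/perl/, "Perl")       # nothing to replace, returns nil

puts first_only          # Ruby ruby ruby
puts all                 # Ruby Ruby Ruby
puts original            # ruby ruby ruby
puts copy                # Ruby Ruby Ruby
puts no_change.inspect   # nil
```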

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

phone = "138-3453-1111 # This is a phone number"

# Delete the Ruby-style comment (sub! modifies phone in place)
phone.sub!(/#.*$/, "")
puts "Phone Number : #{phone}"

# Remove every character that is not a digit
phone.gsub!(/\D/, "")
puts "Phone Number : #{phone}"

The output of the above example is:

Phone Number : 138-3453-1111 
Phone Number : 13834531111

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

text = "rails is rails, Ruby on Rails is a very good Ruby framework"

# Change every "rails" to "Rails"
text.gsub!("rails", "Rails")

# Capitalize every standalone word "rails" (none are left after the step above)
text.gsub!(/\brails\b/, "Rails")

puts text

The output of the above example is:

Rails is Rails, Ruby on Rails is a very good Ruby framework