Ruby Regular Expressions

A regular expression is a special sequence of characters that matches or searches for a set of strings using a pattern with a specific syntax.

Regular expressions are composed of some predefined special characters and combinations of these characters, forming a "rule string" that is used to express a filtering logic for strings.

Syntax

A regular expression literal is a pattern between slashes, or between any delimiter following %r, as shown below:

/pattern/
/pattern/im          # Options can be specified
%r!/usr/local!       # Use a different delimiter so slashes need no escaping

Example

#!/usr/bin/ruby

line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"

# =~ returns the match position, or nil when the pattern does not match
if line1 =~ /Cats(.*)/
  puts "Line1 contains Cats"
end
if line2 =~ /Cats(.*)/
  puts "Line2 contains Cats"
end

The output of the above example is:

Line1 contains Cats
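
Beyond a yes/no test, =~ also reports where the match starts, and the parenthesized group captures part of the text. The following is a minimal sketch reusing line1 from the example above; the variable names index and rest are illustrative only.

```ruby
line1 = "Cats are smarter than dogs"

# =~ returns the index where the match starts, or nil if there is no match
index = (line1 =~ /Cats(.*)/)

# String#match returns a MatchData object; [1] is the first capture group
rest = line1.match(/Cats(.*)/)[1]

puts index          # 0
puts rest.inspect   # " are smarter than dogs"
```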

Regular Expression Modifiers

A regular expression literal may include optional modifiers that control various aspects of matching. Modifiers are specified after the closing slash, as shown in the syntax above. The following list shows the possible modifiers:

Modifier	Description
i	Ignore case when matching text.
o	Perform #{} interpolation only once, the first time the regular expression literal is evaluated.
x	Ignore whitespace in the pattern, allowing whitespace and comments to be placed throughout the expression.
m	Multiline mode: treat newline characters as normal characters, so that . also matches a newline.
u,e,s,n	Interpret the regular expression as Unicode (UTF-8), EUC, SJIS, or ASCII. If no modifier is specified, the regular expression is assumed to use the source encoding.
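
The effect of the i, m, and x modifiers can be checked with a few one-line matches. This is a small sketch; the sample strings and variable names are assumptions chosen for illustration.

```ruby
# i: ignore case
case_insensitive = ("RUBY" =~ /ruby/i)

# m: let . match a newline as well
without_m = ("a\nb" =~ /a.b/)
with_m    = ("a\nb" =~ /a.b/m)

# x: whitespace and comments inside the pattern are ignored
date = /
  (\d{4})  # year
  -
  (\d{2})  # month
/x
year = "2024-05"[date, 1]

puts case_insensitive    # 0
puts without_m.inspect   # nil
puts with_m              # 0
puts year                # 2024
```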

Just as string literals can be delimited with %Q, Ruby lets you begin a regular expression with %r followed by a delimiter of your choice. This is very useful when the pattern contains many slashes that you do not want to escape.

# Matches a single forward slash character; no escaping is needed
%r|/|

# Modifiers can be appended after the closing delimiter in the same way
%r[</(.*)>]i
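
To see that the two literal forms behave the same, the sketch below matches one path with an escaped /.../ literal and with a %r{...} literal. The path and the variable names are hypothetical.

```ruby
path = "/usr/local/bin/ruby"

escaped   = /\/usr\/local\/(\w+)/   # every slash must be escaped
delimited = %r{/usr/local/(\w+)}    # same pattern, no escaping needed

from_escaped   = path[escaped, 1]
from_delimited = path[delimited, 1]

# A modifier after the closing delimiter works as it does after a slash
closing_tag = %r[</(.*)>]i
tag_name = "</B>"[closing_tag, 1]

puts from_escaped     # bin
puts from_delimited   # bin
puts tag_name         # B
```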

Regular Expression Patterns

Except for the metacharacters (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can match a metacharacter literally by placing a backslash before it.

The table below lists the regular expression syntax available in Ruby.

Pattern	Description
^	Match the beginning of a line.
$	Match the end of a line.
.	Match any single character except a newline. When using the m option, it can also match a newline.
[...]	Match any single character within square brackets.
[^...]	Match any single character not within square brackets.
re*	Match the preceding subexpression zero times or more.
re+	Match the preceding subexpression once or more.
re?	Match the preceding subexpression zero times or once.
re{n}	Match the preceding subexpression exactly n times.
re{n,}	Match the preceding subexpression n times or more.
re{n,m}	Match the preceding subexpression at least n times and at most m times.
a|b	Match a or b.
(re)	Group the regular expression and remember the matched text.
(?imx)	Temporarily enable the i, m, or x options within the regular expression. If used inside a group, it only affects the rest of that group.
(?-imx)	Temporarily disable the i, m, or x options within the regular expression. If used inside a group, it only affects the rest of that group.
(?:re)	Group the regular expression but do not remember the matched text.
(?imx:re)	Temporarily enable the i, m, or x options within the parentheses.
(?-imx:re)	Temporarily disable the i, m, or x options within the parentheses.
(?#...)	Comment.
(?=re)	Positive lookahead: require that re matches at this position, without consuming any characters.
(?!re)	Negative lookahead: require that re does not match at this position, without consuming any characters.
(?>re)	Match an independent (atomic) pattern without backtracking into it.
\w	Match word characters.
\W	Match non-word characters.
\s	Match a whitespace character. Equivalent to [ \t\r\n\f\v].
\S	Match a non-whitespace character.
\d	Match a digit. Equivalent to [0-9].
\D	Match non-digit.
\A	Match the beginning of the string.
\Z	Match the end of the string. If the string ends with a newline, match just before that newline.
\z	Match the end of the string.
\G	Match at the position where the previous match ended.
\b	Outside square brackets, match a word boundary; inside square brackets, match a backspace (0x08).
\B	Match non-word boundary.
\n, \t, etc.	Match newline, tab, carriage return (\r), etc.
\1...\9	Match the text matched by the nth group.
\10	If a 10th group has been matched, match that group's text. Otherwise, refer to the character with that octal code.
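
The difference between the line anchors (^ and $) and the string anchors (\A, \Z, \z) is easiest to see on a string that contains newlines. The sketch below uses a made-up two-line string; the variable names are illustrative.

```ruby
text = "first\nsecond\n"

# ^ matches at the start of every line, \A only at the start of the string
line_starts  = text.scan(/^\w+/)
string_start = text.scan(/\A\w+/)

# \Z tolerates one trailing newline, \z does not
before_final_newline = (text =~ /second\Z/)
at_very_end          = (text =~ /second\z/)

puts line_starts.inspect            # ["first", "second"]
puts string_start.inspect           # ["first"]
puts before_final_newline           # 6
puts at_very_end.inspect            # nil
```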

Regular Expression Examples

Characters

Example	Description
/ruby/	Match "ruby"
/¥/	Match the Yen symbol. Both Ruby 1.8 and Ruby 1.9 support multibyte characters.

Character class

Example	Description
/[Rr]uby/	Match "Ruby" or "ruby"
/rub[ye]/	Match "ruby" or "rube"
/[aeiou]/	Match any lowercase vowel
/[0-9]/	Match any digit; same as /[0123456789]/
/[a-z]/	Match any lowercase ASCII letter
/[A-Z]/	Match any uppercase ASCII letter
/[a-zA-Z0-9]/	Match any ASCII letter or digit
/[^aeiou]/	Match any character that is not a lowercase vowel
/[^0-9]/	Match any non-digit character

Special character class

Example	Description
/./	Match any character except a newline
/./m	In multiline mode, can also match newline characters
/\d/	Match a digit, equivalent to /[0-9]/
/\D/	Match a non-digit, equivalent to /[^0-9]/
/\s/	Match a whitespace character, equivalent to /[ \t\r\n\f\v]/
/\S/	Match a non-whitespace character, equivalent to /[^ \t\r\n\f\v]/
/\w/	Match a word character, equivalent to /[A-Za-z0-9_]/
/\W/	Match a non-word character, equivalent to /[^A-Za-z0-9_]/

Repetition

Example	Description
/ruby?/	Match "rub" or "ruby"; the y is optional.
/ruby*/	Match "rub" followed by 0 or more y.
/ruby+/	Match "rub" followed by 1 or more y.
/\d{3}/	Match exactly 3 digits.
/\d{3,}/	Match 3 or more digits.
/\d{3,5}/	Match 3, 4, or 5 digits.
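
The quantifiers above can be compared side by side by filtering a small word list. This is a sketch with invented sample data; the \A and \z anchors are added here so that each pattern must cover the whole word.

```ruby
words = %w[rub ruby rubyy]

optional     = words.grep(/\Aruby?\z/)   # y zero or one time
zero_or_more = words.grep(/\Aruby*\z/)   # y any number of times
one_or_more  = words.grep(/\Aruby+\z/)   # y at least once

# {3,5} takes between 3 and 5 digits, as many as it can
digit_runs = "12 1234 1234567".scan(/\d{3,5}/)

puts optional.inspect       # ["rub", "ruby"]
puts zero_or_more.inspect   # ["rub", "ruby", "rubyy"]
puts one_or_more.inspect    # ["ruby", "rubyy"]
puts digit_runs.inspect     # ["1234", "12345"]
```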

Non-greedy repetition

Adding ? after a quantifier makes it match as few repetitions as possible.

Example	Description
/<.*>/	Greedy repetition: matches all of "<ruby>perl>"
/<.*?>/	Non-greedy repetition: matches only "<ruby>" within "<ruby>perl>"
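
The two rows above can be reproduced directly with String#[], which returns the first substring matching a pattern. A minimal sketch:

```ruby
html = "<ruby>perl>"

greedy = html[/<.*>/]    # .* grabs as much as possible
lazy   = html[/<.*?>/]   # .*? stops at the first ">"

puts greedy   # <ruby>perl>
puts lazy     # <ruby>
```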

Grouping by parentheses

Example	Description
/\D\d+/	No grouping: + repeats only \d
/(\D\d)+/	Grouping: + repeats the \D\d pair
/([Rr]uby(, )?)+/	Matches "Ruby", "Ruby, ruby, ruby", and so on

Backreference

A backreference matches the same text that an earlier group matched.

Example	Description
/([Rr])uby&\1ails/	Matches ruby&rails or Ruby&Rails
/(['"])(?:(?!\1).)*\1/	Match a single- or double-quoted string. \1 matches the text matched by the first group, \2 matches the text matched by the second group, and so on.
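
Both backreference patterns from the table can be tried on a few inputs. The test strings and variable names in this sketch are assumptions made for illustration.

```ruby
same_case = /([Rr])uby&\1ails/

upper = (same_case =~ "Ruby&Rails")   # \1 must repeat "R"
lower = (same_case =~ "ruby&rails")   # \1 must repeat "r"
mixed = (same_case =~ "Ruby&rails")   # "r" does not repeat "R"

quoted = /(['"])(?:(?!\1).)*\1/
first_quoted = %q(say "hi" and 'bye')[quoted]
mismatched   = %q("oops')[quoted]     # opening and closing quotes differ

puts upper                  # 0
puts lower                  # 0
puts mixed.inspect          # nil
puts first_quoted           # "hi"
puts mismatched.inspect     # nil
```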

Alternation

Example	Description
/ruby|rube/	Match "ruby" or "rube"
/rub(y|le)/	Match "ruby" or "ruble"
/ruby(!+|\?)/	Match "ruby" followed by one or more ! or by a single ?

Anchors

Anchors specify the position at which a match must occur.

Example	Description
/^Ruby/	Match "Ruby" at the start of a string or of a line
/Ruby$/	Match "Ruby" at the end of a string or of a line
/\ARuby/	Match "Ruby" at the start of a string
/Ruby\Z/	Match "Ruby" at the end of a string
/\bRuby\b/	Match "Ruby" at word boundaries
/\brub\B/	\B is a non-word boundary: matches "rub" in "rube" and "ruby", but does not match "rub" alone
/Ruby(?=!)/	Match "Ruby" only if it is followed by an exclamation mark
/Ruby(?!!)/	Match "Ruby" only if it is not followed by an exclamation mark
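
Lookahead and boundary anchors constrain where a match may occur without becoming part of the matched text, which shows up clearly when the match is replaced. The sketch below uses made-up sample strings.

```ruby
sample = "Ruby! Ruby?"

# Only the "Ruby" followed by "!" is replaced; the "!" itself is kept
followed_by_bang     = sample.gsub(/Ruby(?=!)/, "Perl")
# Only the "Ruby" not followed by "!" is replaced
not_followed_by_bang = sample.gsub(/Ruby(?!!)/, "Perl")

# \brub\B needs a word boundary before "rub" and a word character after it
inside_words = "rub ruby rube".gsub(/\brub\B/, "RUB")

puts followed_by_bang       # Perl! Ruby?
puts not_followed_by_bang   # Ruby! Perl?
puts inside_words           # rub RUBy RUBe
```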

Special syntax of parentheses

Example	Description
/R(?#comment)/	Match "R". The text inside (?#...) is a comment.
/R(?i)uby/	Match "R" followed by a case-insensitive "uby".
/R(?i:uby)/	The same as above.
/rub(?:y|le)/	Group only, without creating a \1 backreference
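
The following sketch checks two of the forms above: an option that applies to only part of a pattern, and a group that does not capture. The inputs are illustrative.

```ruby
inline = /R(?i:uby)/

upper_rest = (inline =~ "RUBY")   # "uby" part is case-insensitive
lower_r    = (inline =~ "rUBY")   # the leading "R" is still case-sensitive

# A capturing group records its text; a (?:...) group does not
capturing     = "ruble".match(/rub(y|le)/).captures
non_capturing = "ruble".match(/rub(?:y|le)/).captures

puts upper_rest              # 0
puts lower_r.inspect         # nil
puts capturing.inspect       # ["le"]
puts non_capturing.inspect   # []
```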

Search and replace

sub and gsub, and their in-place variants sub! and gsub!, are important string methods when using regular expressions.

All these methods perform search and replace operations using regular expression patterns. sub and sub! replace the first occurrence of the pattern. gsub and gsub! replace all occurrences of the pattern.

sub and gsub return a new string without modifying the original string. sub! and gsub! modify the string on which they are called.
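
A short sketch of these differences follows. It also shows that the bang variants return nil when nothing was replaced, which is why assigning their result back to a variable is risky. The sample strings and variable names are hypothetical.

```ruby
original = "a-b-c"

first_only = original.sub("-", "+")    # new string, first match replaced
all        = original.gsub("-", "+")   # new string, every match replaced

# The bang variants change the receiver in place
copy = original.dup
changed   = copy.gsub!("-", "+")       # returns the modified receiver
unchanged = copy.gsub!("-", "+")       # nothing left to replace: returns nil

# Captured groups can be referenced in the replacement string
swapped = "John Smith".sub(/(\w+) (\w+)/, '\2, \1')

puts first_only          # a+b-c
puts all                 # a+b+c
puts original            # a-b-c
puts changed             # a+b+c
puts unchanged.inspect   # nil
puts swapped             # Smith, John
```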

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

phone = "138-3453-1111 # This is a phone number"

# Delete the Ruby-style comment
# (sub returns a new string; sub! would return nil if nothing were replaced)
phone = phone.sub(/#.*$/, "")
puts "Phone Number : #{phone}"

# Remove all characters except digits
phone = phone.gsub(/\D/, "")
puts "Phone Number : #{phone}"

The output of the above example is:

Phone Number : 138-3453-1111 
Phone Number : 13834531111

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

text = "rails is rails, Ruby on Rails is a very good Ruby framework"

# Change every "rails" to "Rails"
text.gsub!("rails", "Rails")

# Capitalize every whole word "rails" (none remain after the previous step)
text.gsub!(/\brails\b/, "Rails")

puts text

The output of the above example is:

Rails is Rails, Ruby on Rails is a very good Ruby framework
