A regular expression is a special sequence of characters that matches or finds a set of strings using a pattern written in a specific syntax.
A regular expression combines predefined special characters with ordinary characters to form a "rule string" that expresses filtering logic for strings.
A regular expression literal is a pattern written between slashes, or between any pair of delimiters following %r, as shown below:
    /pattern/
    /pattern/im    # Options can be specified
    %r!/usr/local! # A regular expression with a custom delimiter
    #!/usr/bin/ruby

    line1 = "Cats are smarter than dogs"
    line2 = "Dogs also like meat"

    if line1 =~ /Cats(.*)/
      puts "Line1 contains Cats"
    end
    if line2 =~ /Cats(.*)/
      puts "Line2 contains Cats"
    end
The output of the above example is:
Line1 contains Cats
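As a small sketch of what the `=~` operator used above actually returns (the variable names are illustrative and not part of the original example): it yields the index where the match starts, or nil when there is no match, and the text captured by the first group is available in `$1` afterwards.

```ruby
line = "Cats are smarter than dogs"

# =~ returns the index where the match starts, or nil when nothing matches
index    = line =~ /Cats(.*)/
no_match = line =~ /Birds/

# After a successful match, $1 holds the text captured by the first group
rest = (line =~ /Cats(.*)/) ? $1 : nil

puts index.inspect     # 0
puts no_match.inspect  # nil
puts rest.inspect      # " are smarter than dogs"
```

Because 0 is truthy in Ruby, a match at the very start of the string still satisfies an `if` condition.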
A regular expression literal may be followed by optional modifiers that control various aspects of matching. Modifiers are written after the second slash character, as in /pattern/im above. The following table lists the possible modifiers:
Modifiers | Description |
---|---|
i | Ignore case when matching text. |
o | Perform #{} interpolation only once, the first time the regular expression literal is evaluated. |
x | Extended mode: ignore whitespace in the pattern and allow comments within it. |
m | Multiline mode: the dot (.) also matches newline characters. |
u,e,s,n | Interpret the regular expression as Unicode (UTF-8), EUC, SJIS, or ASCII, respectively. If none of these modifiers is specified, the regular expression is assumed to use the source encoding. |
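
The following is a minimal sketch of the i, x, and m modifiers in action. The sample strings and variable names are assumptions made for illustration only.

```ruby
# i: ignore case
case_insensitive = "RUBY" =~ /ruby/i

# x: whitespace and comments inside the pattern are ignored
date = /
  (\d{4})  # year
  -
  (\d{2})  # month
/x
year_month = "2024-05".match(date).captures

# m: the dot also matches a newline
without_m = "a\nb" =~ /a.b/
with_m    = "a\nb" =~ /a.b/m

puts case_insensitive.inspect  # 0
puts year_month.inspect        # ["2024", "05"]
puts without_m.inspect         # nil
puts with_m.inspect            # 0
```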
Just as string literals can be delimited with %Q, Ruby lets you begin a regular expression with %r followed by any delimiter. This is very useful when the pattern contains many slashes that you do not want to escape.
    # Matches a single forward slash; no escaping is needed
    %r|/|

    # Modifiers can follow the closing delimiter
    %r[</(.*)>]i
Every character matches itself except the metacharacters + ? . * ^ $ ( ) [ ] { } | \. To match a metacharacter literally, escape it by placing a backslash before it.
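To make the delimiter and escaping rules concrete, here is a hedged sketch. The path and sample strings are made up for illustration, and `Regexp.escape` is a standard library method that the text above does not mention; it escapes every metacharacter in a string for you.

```ruby
path = "/usr/local/bin"

# With slashes as delimiters, every / in the pattern must be escaped
escaped = path =~ /\/usr\/local/

# With %r and another delimiter, the slashes can be written as-is
unescaped = path =~ %r{/usr/local}

# A backslash makes a metacharacter match itself
literal_dot = "3.14" =~ /3\.14/   # the escaped dot matches only "."
any_char    = "3x14" =~ /3.14/    # the unescaped dot matches any character
no_match    = "3x14" =~ /3\.14/   # the escaped dot does not match "x"

# Regexp.escape builds a pattern that matches a string literally
safe = Regexp.new(Regexp.escape("a.b"))
escaped_hit  = "a.b" =~ safe
escaped_miss = "axb" =~ safe

puts [escaped, unescaped, literal_dot, any_char].inspect  # [0, 0, 0, 0]
puts [no_match, escaped_hit, escaped_miss].inspect        # [nil, 0, nil]
```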
The table below lists the regular expression syntax available in Ruby.
Pattern | Description |
---|---|
^ | Match the beginning of a line. |
$ | Match the end of a line. |
. | Match any single character except a newline. When using the m option, it can also match a newline. |
[...] | Match any single character within square brackets. |
[^...] | Match any single character not within square brackets. |
re* | Match the preceding subexpression zero times or more. |
re+ | Match the preceding subexpression once or more. |
re? | Match the preceding subexpression zero times or once. |
re{n} | Match the preceding subexpression n times. |
re{n,} | Match the preceding subexpression n times or more. |
re{n,m} | Match the preceding subexpression at least n times and at most m times. |
a|b | Match a or b. |
(re) | Group the regular expression and remember the matched text. |
(?imx) | Temporarily enable the i, m, or x options within the regular expression. If within parentheses, it only affects the part within the parentheses. |
(?-imx) | Temporarily disable the i, m, or x options within the regular expression. If within parentheses, it only affects the part within the parentheses. |
(?:re) | Group the regular expression but do not remember the matched text. |
(?imx:re) | Temporarily enable the i, m, or x options within parentheses. |
(?-imx:re) | Temporarily disable the i, m, or x options within parentheses. |
(?#...) | Comment. |
(?=re) | Positive lookahead: assert that re matches at this position without consuming any characters. |
(?!re) | Negative lookahead: assert that re does not match at this position without consuming any characters. |
(?>re) | Match an independent (atomic) pattern without backtracking into it. |
\w | Match word characters. |
\W | Match non-word characters. |
\s | Match a whitespace character. Equivalent to [ \t\n\r\f]. |
\S | Match a non-whitespace character. |
\d | Match a digit. Equivalent to [0-9]. |
\D | Match non-digit. |
\A | Match the beginning of the string. |
\Z | Match the end of the string. If the string ends with a newline, match just before that newline. |
\z | Match the end of the string. |
\G | Match the position where the previous match ended. |
\b | Outside a character class (square brackets), match a word boundary; inside a character class, match a backspace (0x08). |
\B | Match non-word boundary. |
\n, \t, etc. | Match newline, carriage return, tab, etc. |
\1...\9 | Match the text matched by the nth group subexpression. |
\10 | Match the text matched by the 10th group subexpression if that group exists; otherwise, interpret it as the octal representation of a character code. |
Example | Description |
---|---|
/ruby/ | Match "ruby" |
/¥/ | Match the yen sign. Multibyte characters are supported in Ruby 1.9 and Ruby 1.8. |
Example | Description |
---|---|
/[Rr]uby/ | Match "Ruby" or "ruby" |
/rub[ye]/ | Matches "ruby" or "rube" |
/[aeiou]/ | Match any lowercase vowel letter |
/[0-9]/ | Match any digit; same as /[0123456789]/ |
/[a-z]/ | Match any lowercase ASCII letter |
/[A-Z]/ | Match any uppercase ASCII letter |
/[a-zA-Z0-9]/ | Match any ASCII letter or digit |
/[^aeiou]/ | Match any character that is not a lowercase vowel letter |
/[^0-9]/ | Match any non-digit character |
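
A short sketch of the character classes above. It uses `String#scan`, which the text does not cover, only to collect every match into an array; the sample strings are invented for illustration.

```ruby
text = "Ruby 3, ruby 2, rube"

vowels     = "education".scan(/[aeiou]/)  # every lowercase vowel
digits     = text.scan(/[0-9]/)           # every digit
either     = text.scan(/[Rr]uby/)         # "Ruby" or "ruby"
non_vowels = "ruby".scan(/[^aeiou]/)      # everything except lowercase vowels

puts vowels.inspect      # ["e", "u", "a", "i", "o"]
puts digits.inspect      # ["3", "2"]
puts either.inspect      # ["Ruby", "ruby"]
puts non_vowels.inspect  # ["r", "b", "y"]
```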
Example | Description |
---|---|
/./ | Match any character except a newline |
/./m | In multiline mode, can also match newline characters |
/\d/ | Match a digit, equivalent to /[0-9]/ |
/\D/ | Match a non-digit, equivalent to /[^0-9]/ |
/\s/ | Match a whitespace character, equivalent to /[ \t\r\n\f]/ |
/\S/ | Match a non-whitespace character, equivalent to /[^ \t\r\n\f]/ |
/\w/ | Match a word character, equivalent to /[A-Za-z0-9_]/ |
/\W/ | Match a non-word character, equivalent to /[^A-Za-z0-9_]/ |
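
The shorthand classes can be tried out the same way. This is an illustrative sketch with a made-up sample string, again relying on `String#scan` to list all matches.

```ruby
sample = "id_42 costs $7.50\n"

digits     = sample.scan(/\d/)   # single digits
words      = sample.scan(/\w+/)  # runs of word characters
whitespace = sample.scan(/\s/)   # spaces and the trailing newline
non_word   = sample.scan(/\W/)   # everything that is not a word character

puts digits.inspect      # ["4", "2", "7", "5", "0"]
puts words.inspect       # ["id_42", "costs", "7", "50"]
puts whitespace.inspect  # [" ", " ", "\n"]
puts non_word.inspect    # [" ", " ", "$", ".", "\n"]
```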
Example | Description |
---|---|
/ruby?/ | Match "rub" or "ruby": the y is optional. |
/ruby*/ | Match "rub" plus 0 or more y's. |
/ruby+/ | Match "rub" plus 1 or more y's. |
/\d{3}/ | Match exactly 3 digits. |
/\d{3,}/ | Match 3 or more digits. |
/\d{3,5}/ | Match 3, 4, or 5 digits. |
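
The repetition operators above can be checked with a sketch like the following. `String#[]` with a regular expression (an assumption of this example, not something the text introduces) returns the first matched substring or nil.

```ruby
optional_y    = %w[rub ruby rubyy].map { |s| s[/ruby?/] }
zero_or_more  = "rubyyy"[/ruby*/]
one_or_more   = "rub"[/ruby+/]              # no y present, so no match
exactly_three = "12 345 6789"[/\d{3}/]
three_or_more = "12 345 6789".scan(/\d{3,}/)
three_to_five = "1234567"[/\d{3,5}/]

puts optional_y.inspect     # ["rub", "ruby", "ruby"]
puts zero_or_more.inspect   # "rubyyy"
puts one_or_more.inspect    # nil
puts exactly_three.inspect  # "345"
puts three_or_more.inspect  # ["345", "6789"]
puts three_to_five.inspect  # "12345"
```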
Non-greedy repetition matches the minimum number of repetitions:
Example | Description |
---|---|
/<.*>/ | Greedy repetition: matches "<ruby>perl>" |
/<.*?>/ | Non-greedy repetition: matches "<ruby>" in "<ruby>perl>" |
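
A minimal sketch of the difference, using the same input as the table and `String#[]` to extract the matched text:

```ruby
html = "<ruby>perl>"

greedy     = html[/<.*>/]   # .* grabs as much as possible
non_greedy = html[/<.*?>/]  # .*? stops at the first ">"

puts greedy      # <ruby>perl>
puts non_greedy  # <ruby>
```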
Example | Description |
---|---|
/\D\d+/ | No group: + repeats \d |
/(\D\d)+/ | Grouped: + repeats the \D\d pair |
/([Rr]uby(, )?)+/ | Matches "Ruby", "Ruby, ruby, ruby", and so on |
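
The effect of grouping on repetition can be sketched as follows. The input strings are invented for illustration; note that a repeated group remembers only its last iteration.

```ruby
no_group  = "a12b34"[/\D\d+/]                       # + applies to \d only
grouped   = "a1b2c3"[/(\D\d)+/]                     # + applies to the whole pair
list      = "Ruby, ruby, ruby!"[/([Rr]uby(, )?)+/]
last_pair = "a1b2c3".match(/(\D\d)+/)[1]            # the group keeps its last iteration

puts no_group   # a12
puts grouped    # a1b2c3
puts list       # Ruby, ruby, ruby
puts last_pair  # c3
```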
A backreference matches the text of a previously matched group again:
Example | Description |
---|---|
/([Rr])uby&\1ails/ | Matches ruby&rails or Ruby&Rails |
/(['"])(?:(?!\1).)*\1/ | Single- or double-quoted strings. \1 matches the characters matched by the first group, \2 matches the characters matched by the second group, and so on. |
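
Both backreference patterns from the table can be exercised with a sketch like this one. The sample strings are assumptions chosen to show a match and a non-match.

```ruby
pattern = /([Rr])uby&\1ails/

same_case  = "ruby&rails" =~ pattern  # \1 must repeat the captured "r"
mixed_case = "Ruby&rails" =~ pattern  # "R" was captured, so "rails" fails

quoted = /(['"])(?:(?!\1).)*\1/
found  = %q{say 'hi "there"' now}[quoted]  # closing quote must equal the opening one

puts same_case.inspect   # 0
puts mixed_case.inspect  # nil
puts found               # 'hi "there"'
```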
Example | Description |
---|---|
/ruby|rube/ | Matches "ruby" or "rube" |
/rub(y|le)/ | Matches "ruby" or "ruble" |
/ruby(!+|\?)/ | "ruby" followed by one or more ! or followed by a ? |
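
A quick sketch of the alternation patterns above, with illustrative inputs and `String#[]` returning the matched text or nil:

```ruby
plain       = "rube"[/ruby|rube/]
in_group    = "ruble"[/rub(y|le)/]
exclamation = "ruby!!!"[/ruby(!+|\?)/]
question    = "ruby?"[/ruby(!+|\?)/]
neither     = "ruby."[/ruby(!+|\?)/]   # "." is neither "!" nor "?"

puts plain            # rube
puts in_group         # ruble
puts exclamation      # ruby!!!
puts question         # ruby?
puts neither.inspect  # nil
```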
Anchors specify the position at which a match must occur:
Example | Description |
---|---|
/^Ruby/ | Matches "Ruby" at the start of a string or line |
/Ruby$/ | Matches "Ruby" at the end of a string or line |
/\ARuby/ | Matches "Ruby" at the start of a string |
/Ruby\Z/ | Matches "Ruby" at the end of a string |
/\bRuby\b/ | Matches "Ruby" at word boundaries |
/\brub\B/ | \B is a non-word boundary: matches "rub" in "rube" and "ruby", but does not match "rub" alone |
/Ruby(?=!)/ | Matches "Ruby" only if it is followed by an exclamation mark |
/Ruby(?!!)/ | Matches "Ruby" only if it is not followed by an exclamation mark |
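
The difference between line anchors and string anchors is easiest to see on a multi-line string. The following is a sketch under that assumption; the sample strings are illustrative.

```ruby
text = "Perl\nRuby\n"

line_start   = text =~ /^Ruby/    # ^ also matches right after a newline
string_start = text =~ /\ARuby/   # \A matches only at the very start
line_end     = text =~ /Perl$/    # $ matches before a newline
before_nl    = text =~ /Ruby\Z/   # \Z allows one trailing newline
very_end     = text =~ /Ruby\z/   # \z does not

word      = "Ruby rubyist"[/\bruby\b/i]   # whole word only
lookahead = "Ruby? Ruby!" =~ /Ruby(?=!)/  # only the "Ruby" followed by "!"
negative  = "Ruby! Ruby?" =~ /Ruby(?!!)/  # only the "Ruby" not followed by "!"

puts [line_start, string_start, line_end].inspect  # [5, nil, 0]
puts [before_nl, very_end].inspect                 # [5, nil]
puts word                                          # Ruby
puts [lookahead, negative].inspect                 # [6, 6]
```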
Example | Description |
---|---|
/R(?#comment)/ | Matches "R". Everything inside (?#...) is a comment. |
/R(?i)uby/ | Case-insensitive match for "uby". |
/R(?i:uby)/ | The same as above. |
/rub(?:y|le)/ | Group only, without creating a \1 backreference |
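
A hedged sketch of the special parenthesized forms above; the inputs are illustrative, and `MatchData#captures` is used only to show which groups were remembered.

```ruby
scoped     = "RUBY" =~ /R(?i:uby)/   # only "uby" is case-insensitive
first_char = "rUBY" =~ /R(?i:uby)/   # the leading "R" is still case-sensitive
commented  = "R" =~ /R(?#just a comment)/

capturing     = "ruble".match(/rub(y|le)/).captures    # group is remembered
non_capturing = "ruble".match(/rub(?:y|le)/).captures  # nothing is remembered

puts scoped.inspect         # 0
puts first_char.inspect     # nil
puts commented.inspect      # 0
puts capturing.inspect      # ["le"]
puts non_capturing.inspect  # []
```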
sub and gsub, together with their in-place variants sub! and gsub!, are important string methods when working with regular expressions.
All of these methods perform search-and-replace operations using a regular expression pattern. sub and sub! replace the first occurrence of the pattern. gsub and gsub! replace all occurrences of the pattern.
sub and gsub return a new string without modifying the original string. sub! and gsub! modify the string they are called on.
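The distinction can be sketched as follows, with made-up strings and variable names. One detail worth knowing, which the text above does not state, is that the bang variants return nil when no replacement was made.

```ruby
original = "a-b-c"

first_only = original.sub("-", "+")   # new string, first occurrence replaced
all        = original.gsub("-", "+")  # new string, every occurrence replaced

mutable = original.dup                # work on a copy so original stays intact
mutable.gsub!("-", "+")               # modifies mutable in place

nothing_left = mutable.gsub!("-", "+")  # nil: there was nothing to replace

puts original              # a-b-c
puts first_only            # a+b-c
puts all                   # a+b+c
puts mutable               # a+b+c
puts nothing_left.inspect  # nil
```

Because of that nil return value, assigning the result of sub! or gsub! back to a variable is only safe when a replacement is guaranteed to happen.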
    #!/usr/bin/ruby
    # -*- coding: UTF-8 -*-

    phone = "138-3453-1111 # This is a phone number"

    # Delete the Ruby-style comment
    phone.sub!(/#.*$/, "")
    puts "Phone Number : #{phone}"

    # Remove every character that is not a digit
    phone.gsub!(/\D/, "")
    puts "Phone Number : #{phone}"
The output of the above example is:
    Phone Number : 138-3453-1111
    Phone Number : 13834531111
    #!/usr/bin/ruby
    # -*- coding: UTF-8 -*-

    text = "rails is rails, Ruby on Rails is a very good Ruby framework"

    # Change all "rails" to "Rails"
    text.gsub!("rails", "Rails")

    # Capitalize the whole word "rails" wherever it remains
    text.gsub!(/\brails\b/, "Rails")

    puts text
The output of the above example is:
Rails is Rails, Ruby on Rails is a very good Ruby framework