
Scala Regular Expressions

Scala supports regular expressions through the Regex class in the scala.util.matching package. The following example uses a regular expression to search for the word Scala:

import scala.util.matching.Regex
object Test {
   def main(args: Array[String]): Unit = {
      val pattern = "Scala".r
      val str = "Scala is Scalable and cool"
      
      println(pattern findFirstIn str)
   }
}

Running the above code produces the following output:

$ scalac Test.scala 
$ scala Test
Some(Scala)

The example uses the r method, which Scala makes available on String, to construct a Regex object.

It then uses the findFirstIn method to find the first match, which is returned as an Option[String].
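The output Some(Scala) shows that findFirstIn returns an Option rather than a plain String, so the result is usually unpacked before it is used. The following is a minimal sketch of one way to do that; the object name FirstMatchDemo and the helper describe are hypothetical names chosen for this illustration.

```scala
object FirstMatchDemo {
  // Hypothetical helper: describes the first match of "Scala" in the text, if any.
  def describe(text: String): String = {
    val pattern = "Scala".r
    pattern.findFirstIn(text) match {
      case Some(found) => s"found: $found" // a match exists
      case None        => "no match"       // the pattern does not occur
    }
  }

  def main(args: Array[String]): Unit = {
    println(describe("Scala is Scalable and cool")) // found: Scala
    println(describe("Java is cool"))               // no match
  }
}
```

Pattern matching on Some and None makes the "no match" case explicit instead of relying on a null check.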

If you need all of the matches, use the findAllIn method.

You can use the mkString method to join the matches into a single string, and you can use the pipe (|) to specify alternative patterns:

import scala.util.matching.Regex
object Test {
   def main(args: Array[String]): Unit = {
      val pattern = new Regex("(S|s)cala")  // The first letter can be uppercase S or lowercase s
      val str = "Scala is scalable and cool"
      
      println((pattern findAllIn str).mkString(","))   // Use commas , to connect the returned results
   }
}

Running the above code produces the following output:

$ scalac Test.scala 
$ scala Test
Scala,scala
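The result of findAllIn is an iterator over the matched strings, so it can also be converted to a collection instead of being joined with mkString. The sketch below additionally uses findAllMatchIn, which yields Match objects that carry position information. The object name AllMatchesDemo and its helper methods are hypothetical names used only for this illustration.

```scala
object AllMatchesDemo {
  val pattern = "(S|s)cala".r

  // Collect every matched substring into a List.
  def allMatches(text: String): List[String] =
    pattern.findAllIn(text).toList

  // Collect the start offset of every match via findAllMatchIn.
  def matchStarts(text: String): List[Int] =
    pattern.findAllMatchIn(text).map(_.start).toList

  def main(args: Array[String]): Unit = {
    val str = "Scala is scalable and cool"
    println(allMatches(str))  // List(Scala, scala)
    println(matchStarts(str)) // List(0, 9)
  }
}
```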

If you need to replace the matched text with a specified string, use the replaceFirstIn method to replace the first match, or the replaceAllIn method to replace every match. For example:

object Test {
   def main(args: Array[String]): Unit = {
      val pattern = "(S|s)cala".r
      val str = "Scala is scalable and cool"
      
      println(pattern replaceFirstIn(str, "Java"))
   }
}

Running the above code produces the following output:

$ scalac Test.scala 
$ scala Test
Java is scalable and cool
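The example above only replaces the first match. For comparison, here is a minimal sketch of replaceAllIn applied to the same pattern and input; the object name ReplaceAllDemo and the helper replaceAll are hypothetical names for this illustration.

```scala
object ReplaceAllDemo {
  val pattern = "(S|s)cala".r

  // Replace every occurrence of "Scala" or "scala" with "Java".
  def replaceAll(text: String): String =
    pattern.replaceAllIn(text, "Java")

  def main(args: Array[String]): Unit = {
    val str = "Scala is scalable and cool"
    println(replaceAll(str)) // Java is Javable and cool
  }
}
```

Note that the second replacement happens inside the word "scalable", because the pattern does not require a word boundary.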

Regular expression syntax

Scala's regular expressions inherit the syntax rules of Java, and Java's syntax is in turn largely based on that of Perl.

The following table lists some commonly used regular expression rules:

Expression    Matching rule
^             Match the position at the beginning of the input string.
$             Match the position at the end of the input string.
.             Match any single character except a line terminator such as "\r" or "\n".
[...]         Character class. Matches any one of the listed characters. For example, "[abc]" matches the "a" in "plain".
[^...]        Negated character class. Matches any character not listed. For example, "[^abc]" matches "p", "l", "i", and "n" in "plain".
\\A           Match the start of the input string (not affected by the multi-line option).
\\z           Match the end of the input string (similar to $, but not affected by the multi-line option).
\\Z           Match the end of the input string, or the position just before a final line terminator (not affected by the multi-line option).
re*           Repeat zero or more times.
re+           Repeat one or more times.
re?           Repeat zero times or once.
re{n}         Repeat exactly n times.
re{n,}        Repeat n or more times.
re{n,m}       Repeat n to m times.
a|b           Match a or b.
(re)          Match re and capture the matched text in an automatically numbered group.
(?:re)        Match re without capturing the matched text or assigning a group number.
(?>re)        Match re as an independent (atomic) non-capturing group: once matched, it is not backtracked into.
\\w           Match a word character: a letter, digit, or underscore. By default equivalent to [a-zA-Z_0-9].
\\W           Match any character that is not a letter, digit, or underscore.
\\s           Match any whitespace character, similar to [ \t\n\r\f].
\\S           Match any non-whitespace character.
\\d           Match a digit, similar to [0-9].
\\D           Match any non-digit character.
\\G           Match the position where the previous match ended (the start of the current search).
\\n           Newline.
\\b           Match a word boundary.
\\B           Match a position that is not a word boundary.
\\t           Tab.
\\Q           Start quote: \\Q(a+b)*3\\E matches the literal text "(a+b)*3".
\\E           End quote: \\Q(a+b)*3\\E matches the literal text "(a+b)*3".
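To illustrate how capturing groups from the table can be used in Scala, here is a minimal sketch in which a Regex acts as an extractor in a pattern match. The date pattern, the object name GroupDemo, and the helper parse are assumptions made for this illustration and do not come from the text above.

```scala
object GroupDemo {
  // Three capturing groups: year, month, day (hypothetical example pattern).
  val date = "(\\d{4})-(\\d{2})-(\\d{2})".r

  // A Regex used as an extractor binds one variable per capturing group.
  // The whole input must match the pattern for the case to apply.
  def parse(s: String): Option[(String, String, String)] = s match {
    case date(year, month, day) => Some((year, month, day))
    case _                      => None
  }

  def main(args: Array[String]): Unit = {
    println(parse("2024-01-15")) // Some((2024,01,15))
    println(parse("not a date")) // None
  }
}
```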

Regular expression examples

Example            Description
.                  Match any single character except a line terminator such as "\r" or "\n".
[Rr]uby            Match "Ruby" or "ruby".
rub[ye]            Match "ruby" or "rube".
[aeiou]            Match any one lowercase vowel: a, e, i, o, or u.
[0-9]              Match any digit, same as [0123456789].
[a-z]              Match any ASCII lowercase letter.
[A-Z]              Match any ASCII uppercase letter.
[a-zA-Z0-9]        Match any ASCII letter or digit.
[^aeiou]           Match any character except a, e, i, o, and u.
[^0-9]             Match any character except digits.
\\d                Match a digit, similar to [0-9].
\\D                Match a non-digit, similar to [^0-9].
\\s                Match a whitespace character, similar to [ \t\r\n\f].
\\S                Match a non-whitespace character, similar to [^ \t\r\n\f].
\\w                Match a letter, digit, or underscore, similar to [A-Za-z0-9_].
\\W                Match a character that is not a letter, digit, or underscore, similar to [^A-Za-z0-9_].
ruby?              Match "rub" or "ruby": the y is optional.
ruby*              Match "rub" followed by zero or more y characters.
ruby+              Match "rub" followed by one or more y characters.
\\d{3}             Match exactly 3 digits.
\\d{3,}            Match 3 or more digits.
\\d{3,5}           Match 3, 4, or 5 digits.
\\D\\d+            No grouping: + repeats only \\d.
(\\D\\d)+          Grouping: + repeats the \\D\\d pair.
([Rr]uby(, )?)+    Match "Ruby", "Ruby, ruby, ruby", etc.
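The difference between the ungrouped and grouped forms near the end of the table is easiest to see by running both on the same input. The following is a minimal sketch; the sample string and the names GroupingDemo, ungrouped, grouped, and find are hypothetical and chosen only for this illustration.

```scala
object GroupingDemo {
  val ungrouped = "\\D\\d+".r   // one non-digit, then one or more digits
  val grouped   = "(\\D\\d)+".r // one or more (non-digit, digit) pairs

  // Return all matches of the given regex in the text.
  def find(regex: scala.util.matching.Regex, text: String): List[String] =
    regex.findAllIn(text).toList

  def main(args: Array[String]): Unit = {
    val text = "a1b2c3 x99"
    println(find(ungrouped, text)) // List(a1, b2, c3, x99)
    println(find(grouped, text))   // List(a1b2c3, x9)
  }
}
```

Without the group, + applies only to the digit, so "x99" is one match. With the group, + repeats the whole pair, so "a1b2c3" is one match and only "x9" is taken from "x99".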

Note that each escape sequence in the tables above is written with two backslashes. This is because the backslash is an escape character in Java and Scala string literals, so you must write \\ in a string literal to produce the single backslash that the regular expression engine expects. See the following example:

import scala.util.matching.Regex
object Test {
   def main(args: Array[String]): Unit = {
      // \\d in the string literal becomes \d in the regular expression
      val pattern = new Regex("abl[ae]\\d+")
      val str = "ablaw is able1 and cool"
      
      println((pattern findAllIn str).mkString(","))
   }
}

Running the above code produces the following output:

$ scalac Test.scala 
$ scala Test
able1
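As an alternative to doubling every backslash, Scala's triple-quoted string literals do not process backslash escapes, so a pattern can be written with single backslashes. The sketch below compares the two spellings; the object name RawPatternDemo and its members are hypothetical names for this illustration.

```scala
object RawPatternDemo {
  // Ordinary string literal: the backslash must be doubled.
  val escaped = "abl[ae]\\d+".r

  // Triple-quoted literal: backslashes are taken literally.
  val raw = """abl[ae]\d+""".r

  // Return all matches of the triple-quoted pattern in the text.
  def findAll(text: String): List[String] =
    raw.findAllIn(text).toList

  def main(args: Array[String]): Unit = {
    println(escaped.regex == raw.regex)         // true: both hold abl[ae]\d+
    println(findAll("ablaw is able1 and cool")) // List(able1)
  }
}
```

Both values compile to the same regular expression, so the choice is purely a matter of readability.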