Ruby Strings (String)

The String object in Ruby is used to store or manipulate a sequence of one or more bytes.

Ruby strings are divided into single-quoted strings ('') and double-quoted strings (""), the difference being that double-quoted strings can support more escape characters.

Single-quoted strings

The simplest string is a single-quoted string, which is a string placed within single quotes:

'This is a string in a Ruby program'

To use a single quote character inside a single-quoted string, precede it with a backslash (\) so that the Ruby interpreter does not treat it as the end of the string:

'Won\'t you read O\'Reilly\'s book?'

The backslash can also escape another backslash, so the second backslash itself will not be interpreted as an escape character.
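As a small illustration of the two escapes that single-quoted strings recognize, here is a sketch; the variable names are chosen only for this example:

```ruby
# Single-quoted strings interpret only two escapes: \' and \\.
quote = 'Won\'t you read O\'Reilly\'s book?'
path  = 'C:\\temp'      # \\ yields a single backslash
raw   = 'line\nbreak'   # \n is not an escape here; it stays as two characters

puts quote
puts path
puts raw
puts raw.length         # counts the backslash and the n separately
```

Any other backslash sequence in a single-quoted string is kept exactly as typed.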

The following are the features of strings in Ruby.

Double-quoted strings

In double-quoted strings, we can use #{} (a hash sign followed by curly braces) to evaluate an expression and embed its value:

Embedding variables in a string:

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

name1 = "Joe"
name2 = "Mary"
puts "Hello #{name1}, where is #{name2}?"

The output of the above example is as follows:

Hello Joe, where is Mary?

Mathematical operations in a string:

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

x, y, z = 12, 36, 72
puts "The value of x is #{x}"
puts "The value of x + y is #{x + y}"
puts "The average of x + y + z is #{(x + y + z)/3}"

The output of the above example is as follows:

The value of x is 12
The value of x + y is 48
The average of x + y + z is 40

Ruby also supports string literals introduced by %q and %Q. %q follows the single-quote rules, while %Q follows the double-quote rules. Each is followed by an opening delimiter such as ( ! [ { and ends at the matching closing delimiter such as ) ! ] }.

The character that follows q or Q is the delimiter. It can be any non-alphanumeric single-byte character, for example [, {, (, <, or !. The string continues until the matching closing delimiter is found.

Example

#!/usr/bin/ruby
# -*- coding: UTF-8 -*-

desc1 = %Q{Ruby strings can use '' and "".}
desc2 = %q|Ruby strings can use '' and "".|

puts desc1
puts desc2

The output of the above example is as follows:

Ruby strings can use '' and "".
Ruby strings can use '' and "".
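The two literals above print the same text because neither contains an interpolation. The difference shows once #{} appears in the string. A minimal sketch, where the variable lang is introduced only for illustration:

```ruby
lang = "Ruby"

interpolated = %Q[#{lang} strings can use '' and "".]   # double-quote rules: #{lang} is evaluated
literal      = %q[#{lang} strings can use '' and "".]   # single-quote rules: #{lang} is kept as text

puts interpolated
puts literal
```

Both forms let the text contain ' and " without escaping, which is the main reason to reach for them.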

Escape characters

The following table lists the escape sequences and non-printable characters that can be written with a backslash.

Note: Escape sequences are interpreted inside double-quoted strings. Inside single-quoted strings they are not interpreted (apart from \' and \\) and are output as is.

Backslash symbol	Hexadecimal character	Description
\a	0x07	Alarm character
\b	0x08	Backspace key
\cx		Control-x
\C-x		Control-x
\e	0x1b	Escape character
\f	0x0c	Form feed
\M-\C-x		Meta-Control-x
\n	0x0a	Newline
\nnn		Octal representation, where each n is in the range 0-7
\r	0x0d	Carriage return
\s	0x20	Space character
\t	0x09	Tab
\v	0x0b	Vertical tab
\x		Character x
\xnn		Hexadecimal representation, where each n is in the range 0-9, a-f, or A-F
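To see a few rows of this table in action, the following sketch compares the same text in double and single quotes and writes one character in octal and hexadecimal form. The variable names are illustrative only:

```ruby
double = "Tab:\there\nNew line"   # \t and \n are interpreted
single = 'Tab:\there\nNew line'   # the backslash sequences are kept literally

puts double
puts single

# The letter A (character code 65) written as octal and hexadecimal escapes.
octal_a = "\101"
hex_a   = "\x41"
puts octal_a == "A"
puts hex_a == "A"
puts "\s" == " "                  # \s is the space character
```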

Character encoding

Ruby's default character set was originally ASCII, in which each character is represented by a single byte. In UTF-8 and other modern character sets, a character may be represented by one to four bytes.

In Ruby 1.8 you can change the character set at the beginning of the program, as shown below. ($KCODE has no effect in Ruby 1.9 and later, and source files default to UTF-8 since Ruby 2.0.)

$KCODE = 'u'

Below are the possible values of $KCODE.

Encoding	Description
a	ASCII (the same as none). This is the default.
e	EUC.
n	None (the same as ASCII).
u	UTF-8.
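On Ruby 1.9 and later, each string carries its own encoding, so the difference between characters and bytes can be inspected directly. The sketch below assumes such an interpreter and is not part of the $KCODE mechanism described above:

```ruby
s = "h\u00e9llo"     # \u00e9 is the letter e with an acute accent, two bytes in UTF-8

puts s.encoding      # the encoding attached to this string
puts s.length        # number of characters
puts s.bytesize      # number of bytes
```

The character count and the byte count differ because one of the five characters needs two bytes.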

Built-in string methods

We need an instance of a String object in order to call String methods. The following is one way to create a String object:

String.new(str = "")

This returns a new String object containing a copy of str. Using this object, we can now call any available instance method. For example:

Example

#!/usr/bin/ruby

myStr = String.new("THIS IS TEST")
foo = myStr.downcase

puts "#{foo}"

This will produce the following result:

this is test

The following are public string methods (assuming str is a String object):

Serial Number	Method & Description
1	str % arg Formats the string using a format specification. If the format contains more than one substitution, arg must be an array. For details of the format specification, see sprintf under the Kernel module.
2	str * integer Returns a new string containing integer copies of str. In other words, str is repeated integer times.
3	str + other_str Concatenates other_str to str.
4	str << obj Appends an object to the string. If the object is a Fixnum in the range 0..255, it is converted to a character before being appended. Compare with concat.
5	str <=> other_str Compares str with other_str, returning -1 (less than), 0 (equal), or 1 (greater than). The comparison is case-sensitive.
6	str == obj Tests str and obj for equality. If obj is not a String, returns false; otherwise returns true if str <=> obj returns 0.
7	str =~ obj Matches str against the regular expression pattern obj. Returns the position where the match starts, or nil if there is no match.
8	str[position] str[start, length] str[start..end] str[start...end] Extracts substrings by index. Note that in Ruby 1.8, str[position] returns the character code rather than a character; Ruby 1.9 and later return a one-character string.
9	str.capitalize Returns a copy of str with the first character converted to uppercase and the rest to lowercase.
10	str.capitalize! Same as capitalize, but modifies str in place; returns nil if no change is made.
11	str.casecmp Case-insensitive string comparison.
12	str.center Center the string.
13	str.chomp Removes the record separator ($/), usually \n, from the end of the string. Does nothing if there is no record separator.
14	str.chomp! Similar to chomp, but str will be modified and returned.
15	str.chop Removes the last character from str.
16	str.chop! Same as chop, but str will be changed and returned.
17	str.concat(other_str) Concatenates other_str to str.
18	str.count(str, ...) Counts the characters in one or more character sets. If there are multiple character sets, counts the intersection of these sets.
19	str.crypt(other_str) Applies a one-way cryptographic hash to str. The argument is a two-character salt string, each character in the range a-z, A-Z, 0-9, . or /.
20	str.delete(other_str, ...) Returns a copy of str with all characters in the intersection of the parameters removed.
21	str.delete!(other_str, ...) Same as delete, but str will be changed and returned.
22	str.downcase Returns a copy of str with all uppercase letters replaced by lowercase letters.
23	str.downcase! Same as downcase, but str will be changed and returned.
24	str.dump Returns a version of str with all non-printable characters replaced by \nnn symbols and all special characters escaped.
25	str.each(separator=$/) { |substr| block } Splits str using the argument as the record separator ($/ by default), passing each substring to the supplied block. (This method exists only in Ruby 1.8; later versions provide each_line.)
26	str.each_byte { |fixnum| block } Passes each byte of str to the block as an integer.
27	str.each_line(separator=$/) { |substr| block } Splits str using the argument as the record separator ($/ by default), passing each substring to the supplied block.
28	str.empty? Returns true if str is empty (i.e., its length is 0).
29	str.eql?(other) Two strings are equal if they have the same length and content.
30	str.gsub(pattern, replacement) [or] str.gsub(pattern) { |match| block } Returns a copy of str with all occurrences of pattern replaced by replacement or by the value of the block. The pattern is usually a Regexp; if it is a String, no regular expression metacharacters are interpreted (that is, /\d/ matches a digit, but '\d' matches a backslash followed by a 'd').
31	str[fixnum] [or] str[fixnum,fixnum] [or] str[range] [or] str[regexp] [or] str[regexp, fixnum] [or] str[other_str] References str using the following arguments: a single Fixnum returns the character code at that position; two Fixnums return a substring starting at the offset (the first Fixnum) with the given length (the second Fixnum); a range returns the substring within that range; a regexp returns the portion of the string that matches; a regexp with a Fixnum returns the match data at the Fixnum position; an other_str returns the substring that matches other_str. A negative Fixnum counts from the end of the string, starting at -1.
32	str[fixnum] = fixnum [or] str[fixnum] = new_str [or] str[fixnum, fixnum] = new_str [or] str[range] = aString [or] str[regexp] = new_str [or] str[regexp, fixnum] = new_str [or] str[other_str] = new_str Replaces all or part of the string.
33	str.gsub!(pattern, replacement) [or] str.gsub!(pattern) { |match| block } Performs the substitutions of String#gsub in place and returns str, or returns nil if no substitution is performed.
34	str.hash Returns a hash based on the length and content of the string.
35	str.hex Takes the leading characters of str as a string of hexadecimal digits (an optional sign and an optional 0x), and returns the corresponding number. If an error occurs, it returns zero.
36	str.include? other_str [or] str.include? fixnum If str contains the given string or character, it returns true.
37	str.index(substring [, offset]) [or] str.index(fixnum [, offset]) [or] str.index(regexp [, offset]) Returns the index of the first occurrence of the given substring, character (fixnum), or pattern (regexp) in str. If not found, it returns nil. If the second parameter is provided, it specifies the starting position for the search in the string.
38	str.insert(index, other_str) Inserts other_str before the character at the given index, modifying str. Negative indices count from the end of the string and insert after the given character. The intent is to insert a string so that it starts at the given index.
39	str.inspect Returns the printable version of str with escaped special characters.
40	str.intern [or] str.to_sym Returns the symbol corresponding to str, creating the symbol if it did not previously exist.
41	str.length Returns the length of str. Compare it with size.
42	str.ljust(integer, padstr=' ') If integer is greater than the length of str, return a new string of length integer, which is left-aligned with str and padded with padstr. Otherwise, return str.
43	str.lstrip Returns a copy of str with the leading spaces removed.
44	str.lstrip! Remove the leading spaces from str, and return nil if there is no change.
45	str.match(pattern) If pattern is not a Regexp, converts it to one, then invokes its match method on str.
46	str.oct Interprets the leading characters of str as a string of octal digits (with an optional sign) and returns the corresponding number. Returns 0 if the conversion fails.
47	str.replace(other_str) Replace the content in str with the corresponding value in other_str.
48	str.reverse Returns a new string, which is the reverse of str.
49	str.reverse! Reverse str, str will be changed and returned.
50	str.rindex(substring [, fixnum]) [or] str.rindex(fixnum [, fixnum]) [or] str.rindex(regexp [, fixnum]) Return the index of the last occurrence of the given substring, character (fixnum), or pattern (regexp) in str. If not found, return nil. If the second parameter is provided, it specifies the position in the string where the search ends. Characters beyond this point will not be considered.
51	str.rjust(integer, padstr=' ') If the integer is greater than the length of str, return a new string of length integer, right-aligned, and padded with padstr. Otherwise, return str.
52	str.rstrip Return a copy of str with trailing spaces removed.
53	str.rstrip! Remove trailing spaces from str and return nil if there is no change.
54	str.scan(pattern) [or] str.scan(pattern) { |match, ...| block } Both forms iterate through str, matching the pattern (which may be a Regexp or a String). For each match, a result is generated and either added to the result array or passed to the block. If the pattern contains no groups, each individual result consists of the matched string, $&. If the pattern contains groups, each individual result is itself an array containing one entry per group.
55	str.slice(fixnum) [or] str.slice(fixnum, fixnum) [or] str.slice(range) [or] str.slice(regexp) [or] str.slice(regexp, fixnum) [or] str.slice(other_str) See str[fixnum], etc. str.slice!(fixnum) [or] str.slice!(fixnum, fixnum) [or] str.slice!(range) [or] str.slice!(regexp) [or] str.slice!(other_str) Remove the specified part from str and return the removed part. If the value is out of range, an IndexError will be generated if the parameter is in the form of Fixnum. A RangeError will be generated if the parameter is in the form of range. If the parameter is in the form of Regexp and String, the action will be ignored.
56	str.split(pattern=$;, [limit]) Divides str into substrings based on a delimiter and returns an array of these substrings. If pattern is a String, its contents are used as the delimiter when splitting str. If pattern is a single space, str is split on whitespace, with leading whitespace and runs of contiguous whitespace characters ignored. If pattern is a Regexp, str is divided where the pattern matches. Whenever the pattern matches a zero-length string, str is split into individual characters. If pattern is omitted, the value of $; is used. If $; is nil (the default), str is split on whitespace, as if a single space ' ' had been specified. If the limit parameter is omitted, trailing null fields are suppressed. If limit is a positive number, at most that number of fields is returned (if limit is 1, the entire string is returned as the only entry in an array). If limit is negative, there is no limit to the number of fields returned, and trailing null fields are not suppressed.
57	str.squeeze([other_str]) Builds a set of characters from the other_str parameter(s) using the procedure described for String#count. Returns a new string in which runs of the same character that occur in this set are replaced by a single character. If no arguments are given, all runs of identical characters are replaced by a single character.
58	str.squeeze!([other_str]) Same as squeeze, but modifies str in place and returns it, or returns nil if no change is made.
59	str.strip Returns a copy of str with leading and trailing whitespace removed.
60	str.strip! Removes leading and trailing whitespace from str, and returns nil if there is no change.
61	str.sub(pattern, replacement) [or] str.sub(pattern) { |match| block } Returns a copy of str with the first occurrence of pattern replaced by replacement or by the value of the block. The pattern is usually a Regexp; if it is a String, no regular expression metacharacters are interpreted.
62	str.sub!(pattern, replacement) [or] str.sub!(pattern) { |match| block } Performs the substitution of String#sub in place and returns str, or returns nil if no substitution is performed.
63	str.succ [or] str.next Returns the successor to str.
64	str.succ! [or] str.next! Same as String#succ, but modifies str in place and returns it.
65	str.sum(n=16) Returns an n-bit checksum of the characters in str, where n is an optional Fixnum parameter that defaults to 16. The result is simply the sum of the binary value of each character in str modulo 2^n - 1. This is not a particularly good checksum.
66	str.swapcase Returns a copy of 'str' with all uppercase letters converted to lowercase and all lowercase letters converted to uppercase.
67	str.swapcase! Similar to String#swapcase, but 'str' will be modified and returned; if no change is made, nil is returned.
68	str.to_f Returns the result of interpreting the leading characters of 'str' as a floating-point number. Any extra characters at the end of the valid number are ignored. If there is no valid number at the beginning of 'str', it returns 0.0. This method does not raise an exception.
69	str.to_i(base=10) Returns the result of interpreting the leading characters of str as an integer in the given base (2, 8, 10, or 16). Any extra characters after the end of a valid number are ignored. If there is no valid number at the start of str, returns 0. This method never raises an exception.
70	str.to_s [or] str.to_str Returns the receiver.
71	str.tr(from_str, to_str) Returns a copy of str with the characters in from_str replaced by the corresponding characters in to_str. If to_str is shorter than from_str, it is padded with its last character. Both strings may use the c1-c2 notation to denote ranges of characters. If from_str starts with ^, it denotes all characters except those listed.
72	str.tr!(from_str, to_str) Similar to String#tr, but 'str' will be modified and returned; if no change is made, nil is returned.
73	str.tr_s(from_str, to_str) Processes str according to the rules described under String#tr, then removes runs of duplicate characters in the regions affected by the translation.
74	str.tr_s!(from_str, to_str) Same as String#tr_s, but modifies str in place and returns it, or returns nil if no change is made.
75	str.unpack(format) Decodes str (which may contain binary data) according to the format string and returns an array of the extracted values. The format string consists of a sequence of single-character directives. Each directive may be followed by a number indicating how many times to repeat it. An asterisk (*) uses up all remaining elements. The directives sSiIlL may each be followed by an underscore (_) to use the underlying platform's native size for the specified type; otherwise a consistent, platform-independent size is used. Spaces in the format string are ignored.
76	str.upcase Returns a copy of str with all lowercase letters replaced by uppercase letters. The operation is locale insensitive; only characters a to z are affected.
77	str.upcase! Changes the content of str to uppercase, and returns nil if no change is made.
78	str.upto(other_str) { |s| block } Iterates through successive values from str to other_str (inclusive), passing each value in turn to the block. The String#succ method is used to generate each value.
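A few of the methods from the table can be tried together. The following is a minimal sketch with made-up sample values; it uses dup so that the in-place (bang) calls work on a fresh, modifiable copy:

```ruby
str = "hello world"

formatted = "%05.1f|%-6s|" % [3.14159, "ab"]  # more than one substitution, so arg is an array
repeated  = "ab" * 3
first_sub = str.sub("o", "0")                 # replaces only the first match
all_subs  = str.gsub(/o/, "0")                # replaces every match
shifted   = str.tr("a-y", "b-z")              # c1-c2 ranges map letter by letter
squeezed  = "aaa  bbb".squeeze                # collapses runs of repeated characters
words     = str.split(" ")
position  = str =~ /wor/                      # index where the match starts, or nil

# Bang methods modify the receiver and return nil when nothing changed.
changed   = "abc".dup.upcase!
unchanged = "ABC".dup.upcase!

puts formatted, repeated, first_sub, all_subs, shifted, squeezed
p words, position, changed, unchanged
```

Because bang methods such as upcase! may return nil, it is usually safer not to chain further calls onto their result.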

String unpack command

The following table lists the unpack commands of the String#unpack method.

Command	Return	Description
A	String	Remove trailing null and spaces.
a	String	String.
B	String	Extract bits from each character (starting with the most significant bit).
b	String	Extract bits from each character (starting with the least significant bit).
C	Fixnum	Extract a character as an unsigned integer.
c	Fixnum	Extract a character as an integer.
D, d	Float	Treat characters of length sizeof(double) as native double.
E	Float	Treat characters of length sizeof(double) as little-endian byte order double.
e	Float	Treat characters of length sizeof(float) as little-endian byte order float.
F, f	Float	Treat characters of length sizeof(float) as native float.
G	Float	Treat characters of length sizeof(double) as double in network byte order.
g	Float	Treat characters of length sizeof(float) as float in network byte order.
H	String	Extract hex nibbles from each character (most significant first).
h	String	Extract hex nibbles from each character (least significant first).
I	Integer	Treat consecutive characters of sizeof(int) length (modified by _) as a native integer.
i	Integer	Treat consecutive characters of sizeof(int) length (modified by _) as a signed native integer.
L	Integer	Treat four (modified by _) consecutive characters as an unsigned native long integer.
l	Integer	Treat four (modified by _) consecutive characters as a signed native long integer.
M	String	Quoted-printable.
m	String	Base64 encoding.
N	Integer	Treat four characters as network byte order unsigned long.
n	Fixnum	Treat two characters as network byte order unsigned short.
P	String	Treat sizeof(char *) length characters as a pointer and return len characters from the referenced location.
p	String	Treat sizeof(char *) length characters as a pointer to a null-terminated string.
Q	Integer	Treat eight characters as an unsigned quad word (64 bit).
q	Integer	Treat eight characters as a signed quad word (64 bit).
S	Fixnum	Treat two (if _ is used, it is different) consecutive characters as native byte order unsigned short.
s	Fixnum	Treat two (if _ is used, it is different) consecutive characters as native byte order signed short.
U	Integer	UTF-8 character, as an unsigned integer.
u	String	UU encoding.
V	Fixnum	Treat four characters as an unsigned long in little-endian byte order.
v	Fixnum	Treat two characters as an unsigned short in little-endian byte order.
w	Integer	BER compressed integer.
X		Skip one character backward.
x		Skip one character forward.
Z	String	With trailing nulls removed; used together with *, reads up to the first null.
@		Skip the offset given by the length parameter.

Example

Try the following examples to unpack various data. (The result shown for 'sS' assumes a little-endian platform.)

"abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
"abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
"abc \0abc \0".unpack('Z*Z5')       #=> ["abc ", "abc "]
"aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
"aaa".unpack('h2H2c')               #=> ["16", "61", 97]
"\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
"now=20is".unpack('M*')             #=> ["now is"]
"whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]
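The directives can also be checked with a round trip through Array#pack, which accepts the same format characters. The sketch below uses only the network byte order directives N and n plus a2, so the result does not depend on the platform; the sample values are arbitrary:

```ruby
values = [1, 258, "hi"]

# N: 32-bit unsigned, network byte order; n: 16-bit unsigned, network byte order;
# a2: two bytes of string data.
packed   = values.pack("Nna2")
unpacked = packed.unpack("Nna2")

p packed.bytes   # the eight raw bytes
p unpacked       # the original values again
```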
