R Factors

Factors store categorical data, that is, variables that take one of a limited set of values. Examples include gender, with the categories male and female, and age group, with the categories minor and adult.

In R, factors are created with the factor() function, which takes a vector as its input.

Syntax of the factor() function:

factor(x = character(), levels, labels = levels,
       exclude = NA, ordered = is.ordered(x), nmax = NA)

Parameter Description:

  • x: A vector of data, usually one with a small number of distinct values.

  • levels: The level values. If not specified, the sorted unique values of x are used.

  • labels: Labels for the levels. If not specified, the level values themselves (as character strings) are used.

  • exclude: Values to exclude when forming the levels. Elements of x equal to an excluded value become NA.

  • ordered: Logical value specifying whether the levels are treated as ordered.

  • nmax: An upper bound on the number of levels.
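
As a rough illustration of how these parameters behave, the sketch below uses a made-up vector (sizes is a hypothetical name, not part of the original examples) to show the default levels and the effect of exclude.

```r
# Hypothetical sample data, used only for illustration.
sizes <- c("small", "large", "medium", "small", "unknown")

# With no levels argument, the levels are the sorted unique values of x.
f_default <- factor(sizes)
print(levels(f_default))

# exclude drops a value from the levels; matching elements become NA.
f_excluded <- factor(sizes, exclude = "unknown")
print(levels(f_excluded))
print(is.na(f_excluded))
```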

The following example converts a character vector to a factor:

x <- c("Male", "Female", "Male", "Male", "Female")
sex <- factor(x)
print(sex)
print(is.factor(sex))

The output of executing the above code is:

[1] Male   Female Male   Male   Female
Levels: Female Male
[1] TRUE
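
A factor stores each value as an integer code that indexes into its levels. The following sketch reuses the vector from the example above and inspects both parts. When levels is not given, the levels are sorted, so Female comes before Male.

```r
x <- c("Male", "Female", "Male", "Male", "Female")
sex <- factor(x)

print(levels(sex))      # the distinct categories
print(nlevels(sex))     # how many categories there are
print(as.integer(sex))  # integer codes, one per element
```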

The following example sets the factor levels to c('Male','Female'):

x <- c("Male", "Female", "Male", "Male", "Female")
sex <- factor(x, levels = c("Male", "Female"))
print(sex)
print(is.factor(sex))

The output of executing the above code is:

[1] Male   Female Male   Male   Female
Levels: Male Female
[1] TRUE
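
One consequence of fixing the levels explicitly is that any value not listed in levels becomes NA. A minimal sketch, assuming a hypothetical extra value "Other" in the data:

```r
# "Other" is a made-up value that is deliberately missing from levels.
x <- c("Male", "Female", "Male", "Other")
sex <- factor(x, levels = c("Male", "Female"))

print(sex)
print(is.na(sex))  # TRUE marks elements that matched no level
```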

Factor Level Labels

Next, we use the labels parameter to assign a label to each factor level. The labels must be given in the same order as the values in the levels parameter, for example:

sex <- factor(c('f', 'm', 'f', 'f', 'm'), levels = c('f', 'm'),
              labels = c('female', 'male'), ordered = TRUE)
print(sex)

The output of executing the above code is:

[1] female male female female male  
Levels: female < male
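
Setting ordered = TRUE makes comparison operators meaningful for a factor. The sketch below uses a hypothetical rating vector (not taken from the original example) to show this.

```r
# Hypothetical ratings; levels are listed from lowest to highest.
rating <- factor(c("l", "h", "m", "l"),
                 levels = c("l", "m", "h"),
                 labels = c("low", "medium", "high"),
                 ordered = TRUE)

print(rating < "high")  # element-wise comparison against a level
print(max(rating))      # highest level present in the data
```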

Generate Factor Levels

The gl() function generates a factor by specifying the pattern of its levels. Its syntax is as follows:

gl(n, k, length = n*k, labels = seq_len(n), ordered = FALSE)

Parameter Description:

  • n: The number of levels.

  • k: The number of times each level is repeated.

  • length: The length of the result. Defaults to n*k.

  • labels: Labels for the levels. Defaults to the integers 1 through n.

  • ordered: Logical value specifying whether the levels are ordered.

The following example generates a factor with 3 levels, each repeated 4 times:

v <- gl(3, 4, labels = c("Google", "w3codebox", "Taobao"))
print(v)

The output of executing the above code is:

 [1] Google Google Google Google w3codebox w3codebox w3codebox w3codebox Taobao Taobao
[11] Taobao Taobao
Levels: Google w3codebox Taobao
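
The length and ordered arguments are not exercised above. As a small sketch (the labels "Control" and "Treat" are hypothetical), length can extend the pattern by recycling it, and ordered = TRUE returns an ordered factor:

```r
# k = 1 with length = 6 alternates the two levels three times.
alternating <- gl(2, 1, length = 6, labels = c("Control", "Treat"))
print(alternating)

# Default labels are the integers 1 through n; ordered = TRUE ranks them.
ranked <- gl(3, 2, ordered = TRUE)
print(ranked)
```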