R Data Frames

A data frame can be thought of as what we usually call a table.

A data frame is a data structure in R. It is a special kind of list in which every element is a column, giving it a two-dimensional shape.

Each column of a data frame has a unique name, and all columns have the same length. Values within a column must share one data type, while different columns may have different types.
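The properties above can be checked directly. The following is a minimal sketch, assuming a small illustrative data frame named df (the variable and column names are made up for this example):

```r
# Hypothetical example data; the names are for illustration only
df <- data.frame(
    Name = c("Zhang San", "Li Si"),
    Salary = c(1000, 2000)
)

print(is.list(df))        # a data frame is a list of columns
print(sapply(df, class))  # each column has its own type
print(dim(df))            # number of rows and columns

# Columns whose lengths cannot be recycled to a common length are rejected
# (here 2 and 3), so data.frame() raises an error that we capture as text
err <- tryCatch(
    data.frame(a = 1:2, b = 1:3),
    error = function(e) conditionMessage(e)
)
print(err)
```

Capturing the error with tryCatch() keeps the script running, which is convenient when experimenting with invalid input.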

In R, data frames are created with the data.frame() function. Its syntax is as follows:

data.frame(..., row.names = NULL, check.rows = FALSE,
           check.names = TRUE, fix.empty.names = TRUE,
           stringsAsFactors = FALSE)

  • ...: Column vectors of any type (character, numeric, logical), usually given in the form tag = value, or just value.

  • row.names: Row names. The default is NULL; it can be a single number or string naming the column to use as row names, or a character or integer vector giving the row names.

  • check.rows: Whether to check that the rows are consistent in length and names.

  • check.names: Whether to check that the variable names of the data frame are syntactically valid and not duplicated; invalid names are adjusted.

  • fix.empty.names: Whether unnamed arguments are automatically given a name.

  • stringsAsFactors: Logical value; whether character vectors are converted to factors. The default is FALSE since R 4.0.0 (in earlier versions it was TRUE and could be changed with options(stringsAsFactors = FALSE)).
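To make the effect of these arguments concrete, here is a small sketch. The variable names (fixed, kept, labelled, f) and the sample values are assumptions chosen for illustration:

```r
# check.names = TRUE (the default) repairs invalid names:
# a name containing a space is turned into a syntactically valid one
fixed <- data.frame(`Monthly Salary` = c(1000, 2000))
print(names(fixed))

# check.names = FALSE keeps the name exactly as written
kept <- data.frame(`Monthly Salary` = c(1000, 2000), check.names = FALSE)
print(names(kept))

# row.names can supply a label for each row
labelled <- data.frame(Salary = c(1000, 2000), row.names = c("a", "b"))
print(rownames(labelled))

# stringsAsFactors = TRUE converts character columns to factors
f <- data.frame(Name = c("Zhang San", "Li Si"), stringsAsFactors = TRUE)
print(class(f$Name))
```

Names that contain spaces must be wrapped in backticks when written as argument tags, which is one reason short names without spaces are usually easier to work with.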

The following creates a simple data frame containing a name, an ID, and a monthly salary:

table = data.frame(
    Name = c("Zhang San", "Li Si"),
    ID = c("001", "002"),
    Salary = c(1000, 2000)
)
print(table) # View the table data

The output of the above code is as follows:

       Name  ID Salary
1 Zhang San 001   1000
2     Li Si 002   2000

The structure of a data frame can be displayed with the str() function:

table = data.frame(
    Name = c("Zhang San", "Li Si"),
    ID = c("001", "002"),
    Salary = c(1000, 2000)
)
# Get the data structure
str(table)

The output of the above code is as follows:

'data.frame':   2 obs. of  3 variables:
 $ Name  : chr  "Zhang San" "Li Si"
 $ ID    : chr  "001" "002"
 $ Salary: num  1000 2000

The summary() function displays summary information for a data frame:

table = data.frame(
    Name = c("Zhang San", "Li Si"),
    ID = c("001", "002"),
    Salary = c(1000, 2000)
)
# Show the summary
print(summary(table))

The output of the above code is as follows:

     Name                ID                Salary
 Length:2           Length:2           Min.   :1000
 Class :character   Class :character   1st Qu.:1250
 Mode  :character   Mode  :character   Median :1500
                                       Mean   :1500
                                       3rd Qu.:1750
                                       Max.   :2000

We can also extract specific columns:

table = data.frame(
    Name = c("Zhang San", "Li Si"),
    ID = c("001", "002"),
    Salary = c(1000, 2000)
)
# Extract the Name and Salary columns
result <- data.frame(table$Name, table$Salary)
print(result)

The output of the above code is as follows:

  table.Name table.Salary
1  Zhang San         1000
2      Li Si         2000
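Building a new data frame from table$Name and table$Salary changes the column names, as the output above shows. As an alternative sketch, bracket indexing by column name keeps the original names. The column names ID and Salary here are assumed to match the example data:

```r
table <- data.frame(
    Name = c("Zhang San", "Li Si"),
    ID = c("001", "002"),
    Salary = c(1000, 2000)
)

# Select columns by name; the result keeps the original column names
result <- table[, c("Name", "Salary")]
print(names(result))

# A single column taken with [[ ]] (or $) comes back as a plain vector
salaries <- table[["Salary"]]
print(salaries)
```

Selecting by name rather than by position also keeps the code working if the column order changes later.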

The following example extracts the first two rows:

table = data.frame(
    Name = c("Zhangsan", "Lishiqi", "Wangwu"),
    ID = c("001", "002", "003"),
    Salary = c(1000, 2000, 3000)
)
print(table)
# Extract the first two rows
print("---Output the first two rows----")
result <- table[1:2,]
print(result)

The output of the above code is as follows:

      Name  ID Salary
1 Zhangsan 001   1000
2  Lishiqi 002   2000
3   Wangwu 003   3000
[1] "---Output the first two rows----"
      Name  ID Salary
1 Zhangsan 001   1000
2  Lishiqi 002   2000

We can read the data at given rows and columns much like coordinates. The following example reads columns 1 and 2 of rows 2 and 3:

table = data.frame(
    Name = c("Zhangsan", "Lishiqi", "Wangwu"),
    ID = c("001", "002", "003"),
    Salary = c(1000, 2000, 3000)
)
# Read columns 1 and 2 of rows 2 and 3
result <- table[c(2,3), c(1,2)]
print(result)

The output of the above code is as follows:

     Name  ID
2 Lishiqi 002
3  Wangwu 003
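The same [row, column] notation accepts column names and logical conditions as well as positions. The sketch below uses the example data with assumed column names ID and Salary; the variables cell, one_col, and high are illustrative:

```r
table <- data.frame(
    Name = c("Zhangsan", "Lishiqi", "Wangwu"),
    ID = c("001", "002", "003"),
    Salary = c(1000, 2000, 3000)
)

# A single cell: row 2 of the Salary column
cell <- table[2, "Salary"]
print(cell)

# Selecting one column drops to a vector unless drop = FALSE is given
one_col <- table[, "Name", drop = FALSE]
print(is.data.frame(one_col))

# Rows can also be chosen with a logical condition
high <- table[table$Salary > 1500, ]
print(high)
```

Leaving the row or column position empty, as in table[, "Name"], means "all rows" or "all columns" respectively.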

Expand Data Frame

We can expand an existing data frame. The following example adds a Department column:

table = data.frame(
    Name = c("Zhangsan", "Lishiqi", "Wangwu"),
    ID = c("001", "002", "003"),
    Salary = c(1000, 2000, 3000)
)
# Add the Department column
table$Department <- c("Operation", "Technology", "Editor")
print(table)

The output of the above code is as follows:

      Name  ID Salary Department
1 Zhangsan 001   1000  Operation
2  Lishiqi 002   2000 Technology
3   Wangwu 003   3000     Editor
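A new column can also be computed from existing ones. This is a small sketch; the AnnualSalary and Bonus column names are hypothetical and not part of the original example:

```r
table <- data.frame(
    Name = c("Zhangsan", "Lishiqi", "Wangwu"),
    ID = c("001", "002", "003"),
    Salary = c(1000, 2000, 3000)
)

# Derive a new column from an existing one (hypothetical column name)
table$AnnualSalary <- table$Salary * 12
print(table)

# The new column must fit the number of rows; assigning 2 values
# to a 3-row data frame raises an error, captured here as text
err <- tryCatch(
    {
        table$Bonus <- c(100, 200)
        "no error"
    },
    error = function(e) conditionMessage(e)
)
print(err)
```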

We can use the cbind() function to combine multiple vectors column by column. Note that when all arguments are plain vectors, the result is a matrix rather than a data frame, and the numeric values are converted to character strings:

# Create the vectors
sites <- c("Google", "w3codebox", "Taobao")
likes <- c(222, 111, 123)
url <- c("www.google.com", "www.oldtoolbag.com", "www.taobao.com")
# Combine the vectors column by column
addresses <- cbind(sites, likes, url)
# View the result
print(addresses)

The output of the above code is as follows:

     sites       likes url
[1,] "Google"    "222" "www.google.com"
[2,] "w3codebox" "111" "www.oldtoolbag.com"
[3,] "Taobao"    "123" "www.taobao.com"
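The quoted values and the [1,] row labels in the output above indicate a character matrix. As a sketch of one way to get an actual data frame from the same vectors, they can be passed to data.frame() instead, which keeps each column's own type:

```r
sites <- c("Google", "w3codebox", "Taobao")
likes <- c(222, 111, 123)
url <- c("www.google.com", "www.oldtoolbag.com", "www.taobao.com")

# cbind() on plain vectors gives a character matrix
m <- cbind(sites, likes, url)
print(is.matrix(m))
print(is.character(m))

# data.frame() keeps likes numeric and returns a data frame
addresses <- data.frame(sites, likes, url)
print(is.data.frame(addresses))
print(class(addresses$likes))
print(addresses)
```

When at least one argument to cbind() is already a data frame, the result is a data frame as well.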

To append the rows of one data frame to another, use the rbind() function. Both data frames must have the same column names:

table = data.frame(
    Name = c("Zhangsan", "Lishiqi", "Wangwu"),
    ID = c("001", "002", "003"),
    Salary = c(1000, 2000, 3000)
)
newtable = data.frame(
    Name = c("Xiaoming", "Little White"),
    ID = c("101", "102"),
    Salary = c(5000, 7000)
)
# Combine the rows of the two data frames
result <- rbind(table, newtable)
print(result)

The output of the above code is as follows:

          Name  ID Salary
1     Zhangsan 001   1000
2      Lishiqi 002   2000
3       Wangwu 003   3000
4     Xiaoming 101   5000
5 Little White 102   7000
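rbind() lines up the columns of data frames by name. The following sketch, using small made-up data frames a, b, and other, shows that column order does not matter while differing column names cause an error:

```r
a <- data.frame(Name = "Zhangsan", Salary = 1000)
b <- data.frame(Salary = 5000, Name = "Xiaoming")  # same columns, different order

# Columns are matched by name, so the order in b does not matter
combined <- rbind(a, b)
print(combined)

# A data frame with a different column name (hypothetical: Pay) cannot be appended
other <- data.frame(Name = "Wangwu", Pay = 3000)
err <- tryCatch(
    rbind(a, other),
    error = function(e) conditionMessage(e)
)
print(err)
```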