Concatenation in Pandas

Pandas concatenation examples

Pandas provides several functions that make it easy to combine Series and DataFrame objects. Older releases also supported Panel objects, which have since been removed from the library.

    pd.concat(objs, axis=0, join='outer', join_axes=None,
              ignore_index=False)

objs − A sequence or mapping of Series or DataFrame objects (and Panel objects in older pandas versions).

axis − {0, 1, ...}, default 0. The axis to concatenate along.

join − {'inner', 'outer'}, default 'outer'. How to handle the indexes on the other axes: 'outer' takes their union, 'inner' takes their intersection.

ignore_index − Boolean, default False. If True, the index values on the concatenation axis are not used; the resulting axis is labeled 0, ..., n-1.

join_axes − A list of Index objects. Specific indexes to use for the other (n-1) axes instead of performing the inner/outer set logic. This parameter exists only in older pandas versions; it was removed in pandas 1.0.
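The effect of the join and axis parameters is easier to see on frames that only partly overlap. The following is a minimal sketch, not taken from the original examples: the frames `left` and `right` and their column names are hypothetical, chosen so that they share one column and one index label.

```python
import pandas as pd

# Hypothetical frames that share only column "b" and index label 1.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])
right = pd.DataFrame({"b": [5, 6], "c": [7, 8]}, index=[1, 2])

# join='outer' (the default): union of the columns; missing cells become NaN.
outer = pd.concat([left, right], axis=0, join="outer")
print(list(outer.columns))

# join='inner': only the columns present in every input are kept.
inner = pd.concat([left, right], axis=0, join="inner")
print(list(inner.columns))

# axis=1 places the frames side by side; join then applies to the row index.
side = pd.concat([left, right], axis=1, join="inner")
print(list(side.index))
```

With axis=0 the join logic is applied to the columns, and with axis=1 it is applied to the row index, which is what "the other axes" refers to in the parameter description.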

Concatenating objects

The concat function does all of the work of concatenating along an axis. Let's create some objects and concatenate them.

Example

    import pandas as pd

    # Two DataFrames with the same columns and the same index labels
    one = pd.DataFrame({
        'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78]},
        index=[1, 2, 3, 4, 5])
    two = pd.DataFrame({
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88]},
        index=[1, 2, 3, 4, 5])

    # Stack the rows of two below the rows of one
    print(pd.concat([one, two]))

The running results are as follows:

       Marks_scored    Name subject_id
    1            98    Alex       sub1
    2            90     Amy       sub2
    3            87   Allen       sub4
    4            69   Alice       sub6
    5            78  Ayoung       sub5
    1            89   Billy       sub2
    2            80   Brian       sub4
    3            79    Bran       sub3
    4            97   Bryce       sub6
    5            88   Betty       sub5

To associate a specific key with each of the concatenated pieces, pass the keys argument:

Example

    import pandas as pd

    one = pd.DataFrame({
        'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78]},
        index=[1, 2, 3, 4, 5])
    two = pd.DataFrame({
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88]},
        index=[1, 2, 3, 4, 5])

    # Label the rows from one with 'x' and the rows from two with 'y'
    print(pd.concat([one, two], keys=['x', 'y']))

The running results are as follows:

         Marks_scored    Name subject_id
    x 1            98    Alex       sub1
      2            90     Amy       sub2
      3            87   Allen       sub4
      4            69   Alice       sub6
      5            78  Ayoung       sub5
    y 1            89   Billy       sub2
      2            80   Brian       sub4
      3            79    Bran       sub3
      4            97   Bryce       sub6
      5            88   Betty       sub5

The index of the result is duplicated; each index value is repeated.

If the resulting object should have its own index, set ignore_index to True.
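How keys and ignore_index change the result index can be checked directly. The sketch below uses shortened, two-row versions of `one` and `two` (an assumption made to keep it brief) and inspects the index of each result rather than printing the whole frame.

```python
import pandas as pd

# Shortened versions of the frames used in this section.
one = pd.DataFrame({"Name": ["Alex", "Amy"], "Marks_scored": [98, 90]}, index=[1, 2])
two = pd.DataFrame({"Name": ["Billy", "Brian"], "Marks_scored": [89, 80]}, index=[1, 2])

# Plain concatenation keeps the original labels, so they repeat.
plain = pd.concat([one, two])
print(plain.index.is_unique)

# keys adds an outer index level; each piece can be selected by its key.
keyed = pd.concat([one, two], keys=["x", "y"])
print(keyed.index.nlevels)
print(keyed.loc["y"])

# ignore_index discards the original labels and numbers the rows 0..n-1.
renumbered = pd.concat([one, two], ignore_index=True)
print(list(renumbered.index))
```

Selecting `keyed.loc["y"]` returns the rows that came from `two`, which is the main practical use of keys: the origin of each row stays recoverable after concatenation.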

Example

    import pandas as pd

    one = pd.DataFrame({
        'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78]},
        index=[1, 2, 3, 4, 5])
    two = pd.DataFrame({
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88]},
        index=[1, 2, 3, 4, 5])

    # ignore_index=True renumbers the rows and discards the keys
    print(pd.concat([one, two], keys=['x', 'y'], ignore_index=True))

The running results are as follows:

       Marks_scored    Name subject_id
    0            98    Alex       sub1
    1            90     Amy       sub2
    2            87   Allen       sub4
    3            69   Alice       sub6
    4            78  Ayoung       sub5
    5            89   Billy       sub2
    6            80   Brian       sub4
    7            79    Bran       sub3
    8            97   Bryce       sub6
    9            88   Betty       sub5

Note that the index has changed completely, and the keys have been overridden as well.

If two objects need to be combined along axis=1, the new columns are appended.

Example

    import pandas as pd

    one = pd.DataFrame({
        'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78]},
        index=[1, 2, 3, 4, 5])
    two = pd.DataFrame({
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88]},
        index=[1, 2, 3, 4, 5])

    # axis=1 places the columns of two to the right of the columns of one
    print(pd.concat([one, two], axis=1))

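The running results are as follows:

       Marks_scored    Name subject_id  Marks_scored   Name subject_id
    1            98    Alex       sub1            89  Billy       sub2
    2            90     Amy       sub2            80  Brian       sub4
    3            87   Allen       sub4            79   Bran       sub3
    4            69   Alice       sub6            97  Bryce       sub6
    5            78  Ayoung       sub5            88  Betty       sub5

Using append to concatenate

A useful shortcut to concat is the append instance method on Series and DataFrame. These methods actually predate concat. They concatenate along axis=0, that is, along the index. Note that append was deprecated in pandas 1.4 and removed in pandas 2.0, so the append examples in this section require an older version.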

Example

    import pandas as pd

    one = pd.DataFrame({
        'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78]},
        index=[1, 2, 3, 4, 5])
    two = pd.DataFrame({
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88]},
        index=[1, 2, 3, 4, 5])

    # DataFrame.append requires a pandas version older than 2.0
    print(one.append(two))

The running results are as follows:

       Marks_scored    Name subject_id
    1            98    Alex       sub1
    2            90     Amy       sub2
    3            87   Allen       sub4
    4            69   Alice       sub6
    5            78  Ayoung       sub5
    1            89   Billy       sub2
    2            80   Brian       sub4
    3            79    Bran       sub3
    4            97   Bryce       sub6
    5            88   Betty       sub5

The append method can also take multiple objects:

Example

    import pandas as pd

    one = pd.DataFrame({
        'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
        'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
        'Marks_scored': [98, 90, 87, 69, 78]},
        index=[1, 2, 3, 4, 5])
    two = pd.DataFrame({
        'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
        'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
        'Marks_scored': [89, 80, 79, 97, 88]},
        index=[1, 2, 3, 4, 5])

    # Append several frames at once (requires pandas older than 2.0)
    print(one.append([two, one, two]))

The running results are as follows:

       Marks_scored    Name subject_id
    1            98    Alex       sub1
    2            90     Amy       sub2
    3            87   Allen       sub4
    4            69   Alice       sub6
    5            78  Ayoung       sub5
    1            89   Billy       sub2
    2            80   Brian       sub4
    3            79    Bran       sub3
    4            97   Bryce       sub6
    5            88   Betty       sub5
    1            98    Alex       sub1
    2            90     Amy       sub2
    3            87   Allen       sub4
    4            69   Alice       sub6
    5            78  Ayoung       sub5
    1            89   Billy       sub2
    2            80   Brian       sub4
    3            79    Bran       sub3
    4            97   Bryce       sub6
    5            88   Betty       sub5
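Because append is not available in pandas 2.0 and later, the same results can be produced with pd.concat. The following is a minimal sketch with shortened two-row frames (an assumption for brevity); the variable names `pair` and `many` are illustrative.

```python
import pandas as pd

# Shortened versions of the frames used in this section.
one = pd.DataFrame({"Name": ["Alex", "Amy"], "Marks_scored": [98, 90]}, index=[1, 2])
two = pd.DataFrame({"Name": ["Billy", "Brian"], "Marks_scored": [89, 80]}, index=[1, 2])

# Counterpart of one.append(two)
pair = pd.concat([one, two])
print(pair)

# Counterpart of one.append([two, one, two])
many = pd.concat([one, two, one, two])
print(many.shape)
```

Collecting all the frames in one list and calling pd.concat once also avoids the repeated copying that a loop of successive appends would cause.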

Time series

Pandas provides a powerful tool for handling time series data, especially in the financial field. When dealing with time series data, we often encounter the following situations:

Generating sequences of times

Converting time series to different frequencies

Pandas provides a relatively compact, self-contained set of tools for these tasks.
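As a rough sketch of both tasks, the snippet below generates a daily sequence and then converts it to a two-day frequency in two ways. The dates, the values, and the choice of asfreq and resample are assumptions for illustration; the original text does not name specific functions for frequency conversion.

```python
import pandas as pd

# Generate a daily sequence of six timestamps and attach some values.
days = pd.date_range("2017-01-01", periods=6, freq="D")
series = pd.Series(range(6), index=days)

# Convert to a coarser frequency by sampling: keep every second day.
every_other = series.asfreq("2D")
print(every_other.tolist())

# Convert by aggregating instead: sum the values in each two-day bin.
two_day_sum = series.resample("2D").sum()
print(two_day_sum.tolist())
```

asfreq only selects the values that fall on the new frequency, while resample groups all values into bins and lets you choose how to aggregate them.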

Get the current time

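datetime.now() returns the current date and time. Older pandas versions exposed it as pd.datetime.now(); in current versions, use pd.Timestamp.now() or the standard library's datetime.datetime.now().

Example

    import pandas as pd
    # pd.datetime (an alias of datetime.datetime) exists only in older pandas;
    # pd.Timestamp.now() is the counterpart in current versions.
    print(pd.Timestamp.now())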

The running results are as follows:

2017-05-11 06:10:13.393147

Create a timestamp

Timestamped data is the most basic type of time series data: it associates values with points in time. For pandas objects, this means using points in time. Let's take an example:

Example

import pandas as pd
print(pd.Timestamp('2017-03-01'))

The running results are as follows:

2017-03-01 00:00:00

You can also convert integer or floating-point epoch times. The default unit for these is nanoseconds (since that is how Timestamps are stored). However, epoch times are often stored in another unit, which can be specified. Here is an example:

Example

import pandas as pd
print(pd.Timestamp(1587687255,unit='s'))

The running results are as follows:

2020-04-24 00:14:15
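To make the role of the unit concrete, the sketch below builds the same instant from the epoch value used above, expressed in seconds, milliseconds, and nanoseconds. The variable names are illustrative.

```python
import pandas as pd

seconds = 1587687255  # the epoch value used in the example above

from_s = pd.Timestamp(seconds, unit="s")
from_ms = pd.Timestamp(seconds * 1000, unit="ms")
# No unit given: the integer is interpreted as nanoseconds.
from_ns = pd.Timestamp(seconds * 10**9)

print(from_s)
print(from_s == from_ms, from_s == from_ns)
```

Passing a seconds-based value without unit='s' would still succeed, but it would be read as nanoseconds and give a time a few seconds after 1970-01-01, so it is worth stating the unit explicitly.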

Create a range of times

Example

import pandas as pd
print(pd.date_range("11:00", "13:30", freq="30min").time)

The running results are as follows:

[datetime.time(11, 0) datetime.time(11, 30) datetime.time(12, 0)
 datetime.time(12, 30) datetime.time(13, 0) datetime.time(13, 30)]

Change Time Frequency

Example

import pandas as pd
print(pd.date_range("11:00", "13:30", freq="H").time)

The running results are as follows:

[datetime.time(11, 0) datetime.time(12, 0) datetime.time(13, 0)]

Convert to Timestamp

To convert a Series or a list-like object of date-like objects (such as strings, tuples, or mixed types), you can use the to_datetime function. When passed a Series, it returns a Series (with the same index), while a list-like object is converted to a DatetimeIndex. See the following example:

Example

import pandas as pd
# On pandas 2.0 and later, add format='mixed' because the strings use different formats
print(pd.to_datetime(pd.Series(['Jul 31, 2009', '2010-01-10', None])))

The running results are as follows:

0   2009-07-31
1   2010-01-10
2          NaT
dtype: datetime64[ns]

NaT means Not a Time (the datetime equivalent of NaN).

Let's take another example.

Example

import pandas as pd
print(pd.to_datetime(['2005/11/23', '2010.12.31', None]))

The running results are as follows:

DatetimeIndex(['2005-11-23', '2010-12-31', 'NaT'], dtype='datetime64[ns]', freq=None)
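When the input may contain values that are not dates at all, it can help to state the expected format and decide what happens on failure. The sketch below is one way to do that, using the format and errors parameters of to_datetime; the list `raw` is made-up sample data.

```python
import pandas as pd

# Made-up input: two valid dates, one invalid string, one missing value.
raw = ["2005/11/23", "2010/12/31", "not a date", None]

# An explicit format avoids guessing; errors='coerce' turns failures into NaT.
parsed = pd.to_datetime(raw, format="%Y/%m/%d", errors="coerce")
print(parsed)

# NaT behaves like a missing value and can be detected with isna().
print(parsed.isna().tolist())
```

Without errors='coerce', the invalid string would raise an exception instead of becoming NaT, which may be preferable when bad input should stop the program.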
