English | 简体中文 | 繁體中文 | Русский язык | Français | Español | Português | Deutsch | 日本語 | 한국어 | Italiano | بالعربية

Grouping and Group Summary as Well as Total SQL Statements (Slightly Organized)

Today I saw this article and felt it was quite a lot, but it was quite messy and difficult to organize. The editor of the Naihua tutorial simply organized it, hoping everyone can manage to read it

SQL statement for group summary and total count after grouping
 
1“)Get the group summary and total count at one time

SELECT 分组字段 FROM 表

compute sum(COUNT(*))

2“) Group summary”1:

SELECT COUNT(*)
FROM (SELECT 分组字段 FROM 表

,

 
3“) Group summary”2:

SELECT COUNT(*)
FROM (SELECT distinct 分组字段 FROM 表)alias

4
 
Examples1: Group summary

SELECT JSSKQK_JGH FROM SJ_JSSKQK WHERE JSSKQK_JGH IN (SELECT JSJBXX_JGH FROM SJ_JSJBXX WHERE JSJBXX_JSLXM1=1) GROUP BY JSSKQK_JGH HAVING ((SUM(JSSKQK_SSKCXS1) /40)>5) 

The above statement can meet the requirement of grouping. Suppose after execution, there is3How can I calculate this COUNT value?63;

select count(*) from
(
 SELECT JSSKQK_JGH 
 FROM SJ_JSSKQK 
 WHERE JSSKQK_JGH IN (SELECT JSJBXX_JGH FROM SJ_JSJBXX WHERE JSJBXX_JSLXM1=1) 
 GROUP BY JSSKQK_JGH HAVING ((SUM(JSSKQK_SSKCXS1) /40)>5) 
) t

Examples2: [PL/SQL] How to get the record with the largest date in the group after grouping

TABLE: A
A B C D
1        2001/01/01                        1        1
1        2001/12/12                        2        2
3        2002/01/01                        3        3
3        2003/12/12                        4        4

Grouped by column A, how can I get the data with the latest time in each group?

1        2001/12/12                        2        2
3        2003/12/12                        4        4

My clumsy method:

SELECT *
FROM A
WHERE (A,B) IN(
SELECT A,MAX(B)
FROM A
GROUP BY A
)

Is there a better way?

1,select * from a out
where b = (select max(b) from a in
                         where in.a = out.a)

2,Select * from
(select a, row_number() over (partition by a
order by b desc) rn
from a)
where rn=1

3,Select a, b,c,d from
(select a, b,c,d,row_number() over (partition by a
order by b desc) rn
from a)
where rn=1

4,select A,B,C,D from test

  where rowid in
  (
     select rd from
     (
     select rowid rd ,rank() over(partition A order by B desc)rk from test
     ) where rk=1
    
    
  )
  )

Examples3:The method of grouping SQL statements to obtain the first data of the record
Using the Northwind database

Firstly, query the Employees table

Query result:

There is only5A city

Use ROW_NUMBER() OVER(PARTITION BY COL1 ORDER BY COL2) first group by Note: according to COL1Grouping, and within the group, sort according to COL2Sorting, and the value calculated by this function represents the sequence number after sorting within each group (continuous and unique within the group).

The SQL statement is:

select EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,City,ROW_NUMBER() over(partition by City order by EmployeeID) as new_index
from Employees

Execution result diagram:

As can be seen, it is grouped by City and sorted by EmployeeID.

Select the first record in the group

Execute statement:

select * from
(select EmployeeID,LastName,FirstName,Title,TitleOfCourtesy,City,ROW_NUMBER() over(partition by City order by EmployeeID) as new_index
from Employees) a where a.new_index=1

Execution result diagram:

Examples4How to get the first record of each group after getting the grouping result
Eric   red   20
eric   blue  30
andy red   10
andy  blue  5

For example, only get the records in bold.

1,declare @fTable table (fName varchar(10), fColor varchar(10), fOrder int)
 
insert into @fTable values('Eric', 'red', 20)
insert into @fTable values('eric', 'blue', 30)
insert into @fTable values('andy', 'red', 10)
insert into @fTable values('andy', 'blue', 5)
 
-- Only get red
select * from @fTable where fColor = 'red'
-- Get one record for each fColor (in ascending order of fOrder)
select * from @fTable A where fName = (select top 1 fName from @fTable where fColor = A.fColor order by fOrder )
-- Get one record for each fColor (in reverse order of fOrder)
select * from @fTable A where fName = (select top 1 fName from @fTable where fColor = A.fColor order by fOrder desc)
 
2,SQL2005The above version
select * from (select *,row=row_number()over(partition by Color order by Color) from table1)t where row=1 and color='xx'--Add condition

SQL2000 using top 1

Examples5One SQL statement to handle grouping and limiting the number of records in each group
 
If I want to get such a result set: grouping and limiting the number of records in each group, can it be done with one SQL statement?

For example, I want to find the top students in each subject of the final exam3How to write it with only one SQL statement?

Table [TScore] structure

code Student ID char
subject Subject int
score Score int

It can be written like this:

    SELECT [code]
        ,[subject]
        ,[score]
    FROM (
        SELECT *
        ,RANK() OVER(PARTITION BY subject ORDER BY score DESC) AS Row
        FROM TScore
    ) AS a
    WHERE Row <= 3 ;

Examples6: SQL to get the first record of each group

SQL query the following pseudo data to get the records in bold
ID,Name,ItemID,Price,CreatedOn
1 a 1 10.00 xxx1
2 a 1 12.00 xxx2
3 b 1 9.00 xxx1
4 b 1 11.50 xxx2
5 c 1 20.00 xxx1
6 a 2 21.00 xxx1
7 a 2 23.00 xxx2
8 b 2 35.00 xxx1
9 c 2 31.00 xxx1
10 c 2 30.50 xxx2
 
Get the first record of each group, when ItemID has multiple records, select the highest Price
 
--sql2000
select *
from tbname k
where not exists(select * from tbname where
 name=k.name and ITemID=K.ITemID and k.price<price
)
--sql2005
select ID,Name,ItemID,Price,CreatedOnfrom (select *,rn=ROW_NUMBER()over(PARTITION by name,ITemID order by price desc) from tb ) kwhere k.rn=1

Examples7: SQL statement to take the first record after grouping
Share

The table structure is as follows:  
  Field A, B, C  
  the value of a1, b1, c1  
            a2, b2, c2  
            a2, b3, c3  
            a3, b4, c4  
            a3, b5, c5  

To get the desired result set, group by field A and take the first record of each group, as follows:  
            A, B, C  
  the value of a1, b1, c1       --a1the first record of the group.  
            a2, b2, c2       --a2the first record of the group.  
            a3, b4, c4       --a3the first record of the group.

select   *   from table tem where c=(select top   1   c from table where a=tem.a)

The existing data table call is as follows:  
   
  zj th bj  
  -------------   --------   -------------  
  03106666666 00001 03101111111  
  13711111111 00001 031122222222  
  03108898888 950000  
  031177778777 950000  
  031155955555 00001 031187888876  
   
  Note: if th is like950000, then bj is empty, th is like 00001then bj is not empty.  
   
  1and bj grouping  
  select substr(bj,1,4) as area code, count(*) as total calls from call  
  group by substr(bj,1,4);  
  执行结果  
   
  区号                         呼叫总量  
  ------------     --------------  
  0310                           1  
  0311                           2  
                                    2              
   
  2、zj分组,条件是th为950000的记录  
  select   substr(zj,1,4) as area code, count(*) as total calls from call  
  where th=950000  
  group by substr(zj,1,4);  
  执行结果:  
   
  区号                         呼叫总量  
  ------------     --------------  
  0310                           1  
  0311                           1  
   
  能否有一个语句就能实现如下结果:  
   
  区号                         呼叫总量  
  ------------     --------------  
  0310                           2  
  0311                           3  
   
  Note: To get the result is1the corresponding rows plus2the corresponding rows.

union together and sum  
  select area code, sum(total calls) from    
  (select substr(bj,1,4) as area code, count(*) as total calls from call  
  group by substr(bj,1,4))  
  union all  
  (select substr(zj,1,4) as area code, count(*) as total calls from call  
  where th=950000  
  group by substr(zj,1,4))  
  group by area code;

This should run in Oracle

select    
          decode(th,'950000',substr(zj,1,4), substr(bj,1,4)) as area code  
          count(*) as total calls    
from    
          call  
group by  
          decode(th'950000',substr(zj,1,4), substr(bj,1,4))

decode(condition, value1, translated value1, value2, translated value2,...value n, translated value n, default value)

The meaning of this function is as follows:

IF condition=value1 THEN
    RETURN(translated value)1)
ELSIF condition=value2 THEN
    RETURN(translated value)2)
    ......
ELSIF condition=value n THEN
    RETURN(translated value n)

ELSE
    RETURN(default value)
END IF

 

Examples8in SQL Server2005/2008to group the records and get the first N records of each group
Assuming there is a table, the SQL statement is as follows:
  
CREATE TABLE [dbo].[scan](
    [km] [int] NULL,
    [kh] [int] NULL,
    [cj] [int] NULL
) ON [PRIMARY]

    where km is subject number, kh is candidate number, cj is score, and now we group by km and kh, and get the first2records (sorted by cj from high to low). The basic idea is to add a serial number column for each group and then use where to select the serial number less than or equal to2The SQL statement is as follows:
select * from
(
    select a.km, a.kh, cj, row_number() over(partition by a.km order by a.km, a.cj desc) n
    from
        from (select km,kh,SUM(cj) cj from scan group by km,kh) a
) b where n<=2  order by km, cj desc

The final result set is as shown in the figure below.

Examples9How to implement the SQL statement to group and take the top N records?
How to write the SQL statement to group by field B in table A, sort by field C and query the top three records in each group, and require the query results to include all fields? Although the following SQL statement can be implemented, it takes too long due to a large amount of data. Is there a method to group and take records without table join? Thank you!
select *
from tableA as t1
where principal in(
select top 3 principal
from tableA as t2
where t1.B=t2.B
order by t2.C)

Note (hidden note)
Answer1
Author: Zou Jian

select id=identity(int,1,1),b, principal into # from tableA order by B,C

select a.*
from tableA a, # b,(select id1=min(id),id2=min(id)+2 from # group by b)c
where a.principal=b.principal
and b.id between c.id1 and c.id2

drop table #

Answer2
Author: aierong

Find the first in each group2How many methods do you have? (MS SQL2000)

create table abc(
i nvarchar(10,
ii int,
iii int,
iiii int,
price money)
Go
insert into abc
select 'b',1,2,1,11
union all
select 'b',211,2,1,211
union all
select 'a',21,2,1,311
union all
select 'd',41,42,1,411
union all
select 'd',41,42,1,511
union all
select 'd',41,42,1,611
union all
select 'e',1,2,1,11
union all
select 'e',71,2,1,31
union all
select 'e',61,2,1,911
union all
select 'e',771,2,1,1
GO

The result required is:
Group by i, find the first record with the maximum price in each group2record

i ii iii iiii price
---------- ----------- ----------- ----------- ---------------------
a 21 2 1 311.0000
b 1 2 1 11.0000
b 211 2 1 211.0000
d 41 42 1 511.0000
d 41 42 1 611.0000
e 71 2 1 31.0000
e 61 2 1 911.0000

1.
select *
from abc a
where (
select counthttp://dev1.haocang.com:8080/kb/images/icons/emoticons/star_yellow.gif from abc b
where a.i=b.i and b.price>a.price<2
order by i,price

Connection query, judge the number

2.
select i,ii,iii,iiii,price
from (
select (select isnull(sum(1),0)+1 from abc b where a.i=b.i and a.price<b.price) ids,*
from abc a) tem
where ids<3
order by i,price

Generate an internal table, sort each group in the internal table tem, and put the sorting number into the new column ids

3.
declare @looptime int
declare @count int
declare @i nvarchar(10)
/Define the table variable @abc with the same column types as table ABC/
declare @abc table(
i nvarchar(10,
ii int,
iii int,
iiii int,
price money)
declare @tem table(
ids int identity,
class nvarchar(10))
/Query all groups in table ABC and temporarily store them in the table variable @tem/
insert into @tem(class)
select i
from abc
group by i
/The number of rows in the table variable @tem is calculated/
select @count=@@rowcount
/The loop variable @looptime is assigned an initial value =1/
select @looptime=1
while(@looptime<=@count)
begin
/Assign the name of each group to the variable @i/
select @i=class
from @tem
where ids=@looptime
/The first group2Name is inserted into the table variable @abc/
insert into @abc
select top 2 *
from abc
where i=@i
order by price desc
/The loop variable @looptime is incremented1/
select @looptime=@looptime+1
end
/Display the results/
select *
from @abc
order by i,price

4.
Using a cursor to handle
Method and my3Similar methods, you can try them yourself

I have written4Types, I don't know if everyone has any other good methods, you can communicate, thank you.

It was used today, and this method can solve the problem of deleting duplicate records

Of course, the table must have a unique index, carefully look at the following code

Delete From dbo.TB_WorkflowTask a
 WHERE ItemID Not in(select top 1 ItemID from TB_WorkflowTask where TaskName=a.TaskName And EmpID = a.EmpID And BillTypeID =a.BillTypeID And BillID = a.BillID And Status =a.Status AND WFStatus =a.WFStatus )

Note: Only In or Not in can be used, not Exists or Not Exists. Why? Think about it?

Examples10How to get the value of the last record after grouping?
 

It's still confused, let's rearrange it again:

Let's rearrange the format:

There is an existing table Log: Day In Out Current
  2012.4.5 10 0 10
  2012.4.5 0 5 5
  2012.4.6 30 20 15
  2012.4.6 0 3 12

  ………………………………………………

I hope it can be displayed as
,
  2012.4.5 10 5 5
  2012.4.6 30 23 12

SQL code
with tb as(
select [day],sum([in]) as [in],sum(out) as out,sum([in])-sum(out) as [current],rank() over( order by [day]) as row from [log] group by [day]
)
select [day],[in],out,(select sum([current]) from tb b where b.row<=a.row)[current] from tb a
 
 
SQL code
2012.4.5    10    5    5
2012.4.6    30    23    12
 
SQL code
 
--> Test data: [Log]
if object_id('[Log]') is not null drop table [Log]
create table [Log]([Day] date,[In] int,[Out] int,[Current] int)
insert [Log]
select '2012.4.5',10,0,10 union all
select '2012.4.5',0,5,5 union all
select '2012.4.6',30,20,15 union all
select '2012.4.6',0,3,12
 
select
[Day],sum([In]) [In],sum([Out]) [Out],min([Current]) as [Current]
from [Log] group by [Day]
 
/*
Day In Out Current
2012-04-05    10    5    5
2012-04-06    30    23    12
*/
 
Examples11After SQL grouping, a second summary

https://www.oldtoolbag.com/article/106074.htm

Examples12: SQL classification and grouping statistics
You need to understand how to use certain SQL clauses and operators to arrange SQL data for efficient analysis. The following suggestions tell you how to construct statements to obtain the results you want.
Arranging data in a meaningful way may be a challenge. Sometimes you just need to perform a simple classification. Usually, you need to do more processing - grouping to facilitate analysis and summarization. Fortunately, SQL provides a large number of clauses and operators for classification, grouping, and summarization. The following suggestions will help you understand when to classify, when to group, and when and how to summarize. To learn more about each clause and operator, please see
.
#1: Classification sorting
Usually, we do need to sort all the data. The SQL ORDER BY clause arranges the data in alphabetical or numerical order. Therefore, similar data is clearly classified into various groups. However, these groups are only the result of classification, they are not real groups. ORDER BY displays each record, while a group may represent multiple records.
#2: Reduce similar data in the group
The biggest difference between classification and grouping is that classification data displays all records (within any specified standard), while grouped data does not display these records. The GROUP BY clause reduces the similar data in a record. For example, GROUP BY can return a unique list of postal codes from a source file that repeats those values:
SELECT ZIP
FROM Customers
GROUP BY ZIP
Only include those columns that define the group in both the GROUP BY and SELECT column lists. In other words, the SELECT list must match the GROUP list. There is one exception: the SELECT list can include aggregate functions. (While GROUP BY does not support aggregate functions.)
Remember, GROUP BY does not classify the groups that are produced as a result. To sort the groups in alphabetical or numerical order, add an ORDER BY clause (#1). Additionally, you cannot reference a domain with an alias in the GROUP BY clause. Group columns must be in the fundamental data, but they do not have to appear in the result.
#3: Limit data before grouping
You can add a WHERE clause to limit the data grouped by GROUP BY. For example, the following statement only returns the list of postal codes for customers in the Kentucky area.
SELECT ZIP
FROM Customers
WHERE State = 'KY'
GROUP BY ZIP
Remember this is very important: WHERE filters the data before GROUP BY calculates the values of the data.
Like GROUP BY, WHERE does not support aggregate functions.
#4: Return all groups
When you use WHERE to filter data, the groups displayed only show those records you specify. Data that meets the group definition but does not satisfy the clause condition will not appear in the group. Regardless of the WHERE condition, if you want to include all data, add an ALL clause. For example, adding an ALL clause to the previous statement will return all postal code groups, not just those in Kentucky.
SELECT ZIP
FROM Customers
WHERE State = 'KY'
GROUP BY ALL ZIP
In this way, these two clauses may cause conflicts, and you may not use the ALL clause in this way. It is convenient to use the ALL clause when you calculate the value of a column with aggregation. For example, the following statement calculates the number of customers for each Kentucky postal code while displaying other postal code values.
SELECT ZIP, Count(ZIP) AS KYCustomersByZIP
FROM Customers
WHERE State = 'KY'
GROUP BY ALL ZIP
The resulting groups consist of all postal code values from the underlying data. However, the aggregated column (KYCustomerByZIP) is displayed as 0 because there are no other groups except the Kentucky postal code group.
Remote queries do not support GROUP BY ALL.
#5:Limit data after grouping
WHERE clause (#3)Calculate the value of the data before the GROUP BY clause. Use HAVING when you want to limit the data after grouping. Usually, the results obtained with WHERE or HAVING are the same. But remember, these two clauses cannot be interchanged, which is very important. If you have any doubts, here is a guide to application: use WHERE to filter records; use HAVING to filter groups.
Generally, you would use HAVING, using aggregation to calculate the value of a group. For example, the following statement returns a list of postal codes, but this table may not contain all postal codes from the underlying data source:
SELECT ZIP, Count(ZIP) AS CustomersByZIP
FROM Customers
GROUP BY ZIP
HAVING Count(ZIP) = 1
Only those groups with only one customer appear in the results.
#6:Understand WHERE and HAVING in detail
If you are still confused about the usage of WHERE and HAVING, apply the following guidance method:
WHERE appears before GROUP BY; SQL calculates the value of the WHERE clause before it groups the records.
HAVING appears after GROUP BY; SQL calculates the value of the HAVING clause after it groups the records.
#7:Total grouped values by aggregation
Grouped data is helpful for analyzing data, but sometimes you also need information other than the group itself. You can add an aggregate function to total the grouped data. For example, the following statement displays a subtotal for each sort:
SELECT OrderID, Sum(Cost * Quantity) AS OrderTotal
FROM Orders
GROUP BY OrderID
Like other groups, the SELECT and GROUP BY lists must match. The only exception to this rule is including an aggregate in the SELECT clause.
#8:Total the aggregation
You can further total the data by displaying each group's subtotal. The SQL ROLLUP operator displays an additional record for each group, a subtotal. This record is the result of calculating the value of all records in each group using an aggregate function. The following statement totals the OrderTotal column for each combination.
SELECT Customer, OrderNumber, Sum(Cost * Quantity) AS OrderTotal
FROM Orders
GROUP BY Customer, OrderNumber
WITH ROLLUP
one containing20 and25The ROLLUP row for the group of these two OrderTotal values will display the OrderTotal value45. The first value in the ROLLUP result is unique because it calculates the value of all group records. That value is the sum of the entire record set.
ROLLUP does not support the 'DISTINCT' or 'GROUP BY ALL' clauses in aggregate functions.
#9:Total each column
The CUBE operator goes further than ROLLUP, returning the total of each value in each group. The result is similar to ROLLUP, but CUBE includes an additional record for each column in the group. The following statement shows the subtotals for each group and an additional total for each customer.
SELECT Customer, OrderNumber, Sum(Cost * Quantity) AS OrderTotal
FROM Orders
GROUP BY Customer, OrderNumber
WITH CUBE
The totals obtained with CUBE are the most complex. Not only do they complete the aggregation and ROLLUP work, but they also calculate the values of other columns in the defined groups. That is, CUBE totals every possible combination of columns.
CUBE does not support 'GROUP BY ALL'.
#10:Sort the totals
When the result of CUBE is disorganized (usually it is), you can add a GROUPING function as shown below:
SELECT GROUPING(Customer), OrderNumber, Sum(Cost * Quantity) AS OrderTotal
FROM Orders
GROUP BY Customer, OrderNumber
WITH CUBE
Its result includes two additional values for each row.
Value1Indicates that the value on the left is a total value—the result of the 'ROLLUP' or 'CUBE' operator.
Value 0 indicates that the value on the left is a detailed record generated by a primitive 'GROUP BY' clause.

In grouping queries, the 'HAVING' clause can also be used in conjunction to define query conditions.

Use 'GROUP BY' for grouping queries

When using the 'GROUP BY' keyword, the items that can be specified in the 'SELECT' list are limited, and only the following items are allowed in the 'SELECT' statement:

〉The columns that are grouped
〉Returns a worth expression for each group, for example, an aggregate function with a column name as a parameter


The principle of 'GROUP BY' is that all columns after 'SELECT' that do not use aggregate functions must appear after 'GROUP BY' (important)

GROUP BY instance


Example One

Data Table:

Name Subject Score
Zhang San Chinese 80
Zhang San Math 98
Zhang San English 65
Li Si Chinese 70
Li Si Math 80
Li Si English 90

Expected Query Result:

Name Chinese Math English
Zhang San 80 98 65
Li Si 70 80 90

Code
 
CREATE TABLE testScore   
(   
   tid int primary key identity(1,1,   
   tname varchar30) null,   
   ttype varchar10) null,   
   tscor int null  
)   
GO   

---Insert Data   
INSERT INTO testScore VALUES ('Zhang San', 'Chinese',80)   
INSERT INTO testScore VALUES ('Zhang San', 'Math',98)   
INSERT INTO testScore VALUES ('Zhang San', 'English',65)   
INSERT INTO testScore VALUES ('Li Si', 'Chinese',70)   
INSERT INTO testScore VALUES ('Li Si', 'Math',80)   
INSERT INTO testScore VALUES ('Li Si', 'English',90)   


SELECT tname AS 'Name',    
MAX(CASE ttype WHEN 'Chinese' THEN tscor ELSE 0 END) 'Chinese',    
MAX(CASE ttype WHEN 'Math' THEN tscor ELSE 0 END) 'Math',    
MAX(CASE ttype WHEN 'English' THEN tscor ELSE 0 END) 'English'    
FROM testScore    
GROUP BY tname

Example Two


Here is the data: (For clarity, I have not used country codes and have used country names directly as the Primary Key)

Country (country) Population (population)
China 600
United States 100
Canada 100
United Kingdom 200
France 300
Japan 250
Germany 200
Mexico 50
India 250

Based on the population data of this country, count the population of Asia and North America. The result should be as follows.

Continent Population
Asia 1100
North America 250
Other 700

Code

SELECT SUM(population),
    CASE country
        WHEN 'China'   THEN 'Asia'
        WHEN 'India'   THEN 'Asia'
        WHEN 'Japan'   THEN 'Asia'
        WHEN 'United States'   THEN 'North America'
        WHEN 'Canada' THEN 'North America'
        WHEN 'Mexico' THEN 'North America'
    ELSE 'Other' END
FROM  Table_A
GROUP BY CASE country
        WHEN 'China'   THEN 'Asia'
        WHEN 'India'   THEN 'Asia'
        WHEN 'Japan'   THEN 'Asia'
        WHEN 'United States'   THEN 'North America'
        WHEN 'Canada' THEN 'North America'
        WHEN 'Mexico' THEN 'North America'
    ELSE 'Other' END;

Similarly, we can also use this method to judge the salary level and count the number of people in each level. The SQL code is as follows;

SELECT
    CASE WHEN salary <= 500 THEN '1'
       WHEN salary > 500 AND salary <= 600 THEN '2'
       WHEN salary > 600 AND salary <= 800 THEN '3'
       WHEN salary > 800 AND salary <= 1000 THEN '4'
    ELSE NULL END salary_class,
    COUNT(*)
FROM  Table_A
GROUP BY
    CASE WHEN salary <= 500 THEN '1'
       WHEN salary > 500 AND salary <= 600 THEN '2'
       WHEN salary > 600 AND salary <= 800 THEN '3'
       WHEN salary > 800 AND salary <= 1000 THEN '4'
    ELSE NULL END;

Generally, after groupby, it is followed by a column name, but in this example, the grouping is made more powerful through the case statement.

Example 3

There are the following data

Country (country) Gender (sex) Population (population)
China 1 340
China 2 260
United States 1 45
United States 2 55
Canada 1 51
Canada 2 49
United Kingdom 1 40
United Kingdom 2 60

Grouping by country and gender, the results are as follows

Country Male Female
China 340 260
United States 45 55
Canada 51 49
United Kingdom 40 60

Code

SELECT country,
    SUM( CASE WHEN sex = '1' THEN 
           population ELSE 0 END), --Male population
    SUM( CASE WHEN sex = '2' THEN 
           population ELSE 0 END)  --Female population
FROM Table_A
GROUP BY country;

Handling NULL values in the GROUP BY clause
What if there are NULL values in the columns used for grouping in the GROUP BY clause? In SQL, NULL is not equal to NULL (as introduced in the WHERE clause). However, in the GROUP BY clause, all NULL values are grouped together, i.e., they are considered 'equal'.

HAVING clause
The GROUP BY clause groups data based on the selected columns, simply grouping rows that have the same value in the column. However, in practical applications, it is often necessary to remove groups of rows that do not meet the conditions. To achieve this functionality, SQL provides the HAVING clause. The syntax is as follows.

SELECT column, SUM(column)
FROM table
GROUP BY column
HAVING SUM(column) condition value

Note: HAVING is usually used with the GROUP BY clause. Of course, the SUM() function in the syntax can also be any other aggregate function. The DBMS applies the search conditions in the HAVING clause to the row groups generated by the GROUP BY clause, and if the row group does not meet the search conditions, it is deleted from the result table.

Application of HAVING clause
Query the department and the number of teachers from the TEACHER table who have at least two teachers.

Implementation code:

SELECT DNAME, COUNT(*) AS num_teacher
FROM TEACHER
GROUP BY DNAME
HAVING COUNT(*)>=2

Difference between HAVING clause and WHERE clause

The similarity between the HAVING clause and the WHERE clause is that it also defines search conditions. However, unlike the WHERE clause, the HAVING clause is related to groups rather than individual rows.
1If the GROUP BY clause is specified, the search conditions defined by the HAVING clause will act on the groups created by this GROUP BY clause.
2If the WHERE clause is specified but the GROUP BY clause is not specified, the search conditions defined by the HAVING clause will act on the output of the WHERE clause and treat this output as a group.
3If neither the GROUP BY clause nor the WHERE clause is specified, the search conditions defined by the HAVING clause will act on the output of the FROM clause and treat this output as a group.
4In the SELECT statement, the execution order of the WHERE and HAVING clauses is different. In this book's5.1.2From the execution steps of the SELECT statement introduced in this section, it can be known that the WHERE clause can only accept input from the FROM clause, while the HAVING clause can accept input from the GROUP BY clause, WHERE clause, and FROM clause.

You May Also Like