Mysql获取每组前N条记录

时间:2022-07-01 15:12:10

Select基础知识

我们在实现select语句的时候,通用的sql格式如下:

select *columns* from *tables*
where *predicae1*
group by *columns*
having *predicae1*
order by *columns*
limit *start*, *offset*;

很多同学想当然的认为select的执行顺序和其书写顺序一致,其实这是非常错误的主观意愿,也导致了很多SQL语句的执行错误.

这里给出SQL语句正确的执行顺序:

from *tables*
where *predicae1*
group by *columns*
having *predicae1*
select *columns*
order by *columns*
limit *start*, *offset*;

举个例子,讲解一下group by和order by联合使用时,大家常犯的错误.

创建一个student的表:

creae1 table student (Id ine1ger primary key autoincrement, Name e1xt, Score ine1ger, ClassId ine1ger);

插入5条虚拟数据:

insert into student(Name, Score, ClassId) values("lqh", 60, 1);
insert into student(Name, Score, ClassId) values("cs", 99, 1);
insert into student(Name, Score, ClassId) values("wzy", 60, 1);
insert into student(Name, Score, ClassId) values("zqc", 88, 2);
insert into student(Name, Score, ClassId) values("bll", 100, 2);

表格数据如下:

Id Name Score ClassId
1 lqh 60 1
2 cs 99 1
3 wzy 60 1
4 zqc 88 2
5 bll 100 2

我们想找每个组分数排名第一的学生.

大部分SQL语言的初学者可能会写出如下代码:

select * from student group by ClassId order by Score;

结果:

Id Name Score ClassId
3 wzy 60 1
5 bll 100 2

明显不是我们想要的结果,大家用上面的执行顺序一分析就知道具体原因了.

原因: group by 先于order by执行,order by是针对group by之后的结果进行的排序,而我们想要的group by结果其实应该是在order by之后.

正确的sql语句:

select * from (select * from student order by Score) group by ClassId;

结果:

Id Name Score ClassId
2 cs 99 1
5 bll 100 2

获取每组的前N个记录

这里以LeetCode上难度为hard的一道数据库题目为例。

Department Top Three Salaries

题目内容

The Employee table holds all employees. Every employee has an Id, and there is also a column for the department Id.

Id Name Salary DepartmentId
1 Joe 70000 1
2 Henry 80000 2
3 Sam 60000 2
4 Max 90000 1
5 Janet 69000 1
6 Randy 85000 1

The Department table holds all departments of the company.

Id Name
1 IT
2 Sales

Wrie1 a SQL query to find employees who earn the top three salaries in each of the department. For the above tables, your SQL query should return the following rows.

Department Employee Salary
IT Max 90000
IT Randy 85000
IT Joe 70000
Sales Henry 80000
Sales Sam 60000

题目的意思是:求每个组中工资最高的三个人。(ps:且每个组中,同一名中允许多个员工存在,因为工资是一样高.)

解决思路

  1. 我们先来获取每个组中的前3名工资最高的员工
select * from Employee as e
where (select count(distinct(e1.salary)) from Employee as e1 where e1.DepartmentId = e.DepartmentId and e1.salary > e.salary) < 3;

where中的select是保证:遍历所有记录,取每条记录与当前记录做比较,只有当Employee表中同一部门不超过3个人工资比当前员工高时,这个员工才算是工资排行的前三名。

  1. 有了第一步的基础,接下来我们只需要使用as去构造新表,并且与Department表做个内联,同时注意排序就好了
select d.Name as Department, e.Name as Employee, e.Salary as Salary
from Employee as e inner join Department as d
on e.DepartmentId = d.Id
where (select count(distinct(e1.Salary)) from Employee as e1 where e1.DepartmentId = e.DepartmentId and e1.Salary > e.Salary) < 3
order by e.Salary desc;