排名前20位的查询
我正在创建一个报告结构,我需要为每个独特的公司 - 区域输出前20天的汇总统计信息。 我已经完成了这项任务,但觉得我的代码过于复杂,我正在请求帮助优化它。
我有2个表参与这个过程。 第一个列出了所有可能的公司 - 区域 - 组 - 子 - 组。 第二个是小组 - 小组的小时统计。
SQL小提琴链接:http://sqlfiddle.com/#!9/29a7b/1
注意 :当前在我的SQL小提琴上SELECT command denied to user '<user>'@'<ip>' for table 'table_stats'
错误的SELECT command denied to user '<user>'@'<ip>' for table 'table_stats'
的SELECT command denied to user '<user>'@'<ip>' for table 'table_stats'
会感谢帮助解决此问题。
table_companies声明和虚拟数据:
CREATE TABLE `table_companies` (
`pk_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`company` varchar(45) NOT NULL,
`region` varchar(45) NOT NULL,
`group` varchar(45) NOT NULL,
`subgroup` varchar(45) NOT NULL,
PRIMARY KEY (`pk_id`),
UNIQUE KEY `pk_id_id_UNIQUE` (`pk_id`)
);
INSERT INTO table_companies
(`pk_id`, `company`, `region`, `group`, `subgroup`)
VALUES
(1, 'company1', 'region1', 'group1', 'subgroup1'),
(2, 'company1', 'region1', 'group1', 'subgroup2'),
(3, 'company1', 'region2', 'group2', 'subgroup3'),
(4, 'company1', 'region3', 'group3', 'subgroup4'),
(5, 'company2', 'region1', 'group4', 'subgroup5'),
(6, 'company2', 'region3', 'group5', 'subgroup6'),
(7, 'company2', 'region3', 'group6', 'subgroup7'),
(8, 'company2', 'region4', 'group7', 'subgroup8'),
(9, 'company2', 'region5', 'group8', 'subgroup9'),
(10, 'company3', 'region6', 'group9', 'subgroup10'),
(11, 'company3', 'region7', 'group10', 'subgroup11'),
(12, 'company3', 'region8', 'group11', 'subgroup12'),
(13, 'company4', 'region9', 'group12', 'subgroup13'),
(14, 'company4', 'region10', 'group13', 'subgroup14'),
(15, 'company5', 'region11', 'group14', 'subgroup15'),
(16, 'company5', 'region12', 'group15', 'subgroup16')
;
table_stats声明:
简化为仅包含每天只有1组 - 子组的几个小时。
CREATE TABLE `table_stats` (
`pk_id` int(10) unsigned NOT NULL,
`date_time` datetime NOT NULL,
`group` varchar(45) NOT NULL,
`subgroup` varchar(45) NOT NULL,
`stat` int(10) unsigned NOT NULL,
PRIMARY KEY (`pk_id`),
UNIQUE KEY `pk_id_UNIQUE` (`pk_id`),
UNIQUE KEY `om_unique` (`date_time`,`group`,`subgroup`)
);
INSERT INTO table_stats
(`pk_id`, `date_time`, `group`, `subgroup`, `stat`)
VALUES
(1, '2015-12-01 06:00:00', 'group9', 'subgroup10', 14),
(2, '2015-12-01 12:00:00', 'group9', 'subgroup10', 14),
(3, '2015-12-02 06:00:00', 'group9', 'subgroup10', 2),
(4, '2015-12-02 12:00:00', 'group9', 'subgroup10', 51),
(5, '2015-12-03 06:00:00', 'group9', 'subgroup10', 30),
(6, '2015-12-03 12:00:00', 'group9', 'subgroup10', 6),
(7, '2015-12-04 06:00:00', 'group9', 'subgroup10', 9),
(8, '2015-12-04 12:00:00', 'group9', 'subgroup10', 77),
(9, '2015-12-05 06:00:00', 'group9', 'subgroup10', 70),
(10, '2015-12-05 12:00:00', 'group9', 'subgroup10', 7),
(11, '2015-12-06 06:00:00', 'group9', 'subgroup10', 38),
(12, '2015-12-06 12:00:00', 'group9', 'subgroup10', 5),
(13, '2015-12-07 06:00:00', 'group9', 'subgroup10', 86),
(14, '2015-12-07 12:00:00', 'group9', 'subgroup10', 73),
(15, '2015-12-08 06:00:00', 'group9', 'subgroup10', 45),
(16, '2015-12-08 12:00:00', 'group9', 'subgroup10', 14),
(17, '2015-12-09 06:00:00', 'group9', 'subgroup10', 66),
(18, '2015-12-09 12:00:00', 'group9', 'subgroup10', 38),
(19, '2015-12-10 06:00:00', 'group9', 'subgroup10', 12),
(20, '2015-12-10 12:00:00', 'group9', 'subgroup10', 77),
(21, '2015-12-11 06:00:00', 'group9', 'subgroup10', 21),
(22, '2015-12-11 12:00:00', 'group9', 'subgroup10', 18),
(23, '2015-12-12 06:00:00', 'group9', 'subgroup10', 28),
(24, '2015-12-12 12:00:00', 'group9', 'subgroup10', 74),
(25, '2015-12-13 06:00:00', 'group9', 'subgroup10', 20),
(26, '2015-12-13 12:00:00', 'group9', 'subgroup10', 37),
(27, '2015-12-14 06:00:00', 'group9', 'subgroup10', 66),
(28, '2015-12-14 12:00:00', 'group9', 'subgroup10', 59),
(29, '2015-12-15 06:00:00', 'group9', 'subgroup10', 26),
(30, '2015-12-15 12:00:00', 'group9', 'subgroup10', 0),
(31, '2015-12-16 06:00:00', 'group9', 'subgroup10', 77),
(32, '2015-12-16 12:00:00', 'group9', 'subgroup10', 31),
(33, '2015-12-17 06:00:00', 'group9', 'subgroup10', 59),
(34, '2015-12-17 12:00:00', 'group9', 'subgroup10', 71),
(35, '2015-12-18 06:00:00', 'group9', 'subgroup10', 7),
(36, '2015-12-18 12:00:00', 'group9', 'subgroup10', 73),
(37, '2015-12-19 06:00:00', 'group9', 'subgroup10', 72),
(38, '2015-12-19 12:00:00', 'group9', 'subgroup10', 28),
(39, '2015-12-20 06:00:00', 'group9', 'subgroup10', 50),
(40, '2015-12-20 12:00:00', 'group9', 'subgroup10', 11),
(41, '2015-12-21 06:00:00', 'group9', 'subgroup10', 71),
(42, '2015-12-21 12:00:00', 'group9', 'subgroup10', 4),
(43, '2015-12-22 06:00:00', 'group9', 'subgroup10', 78),
(44, '2015-12-22 12:00:00', 'group9', 'subgroup10', 69),
(45, '2015-12-23 06:00:00', 'group9', 'subgroup10', 83),
(46, '2015-12-23 12:00:00', 'group9', 'subgroup10', 55),
(47, '2015-12-24 06:00:00', 'group9', 'subgroup10', 71),
(48, '2015-12-24 12:00:00', 'group9', 'subgroup10', 20),
(49, '2015-12-25 06:00:00', 'group9', 'subgroup10', 90),
(50, '2015-12-25 12:00:00', 'group9', 'subgroup10', 26),
(51, '2015-12-26 06:00:00', 'group9', 'subgroup10', 1),
(52, '2015-12-26 12:00:00', 'group9', 'subgroup10', 73),
(53, '2015-12-27 06:00:00', 'group9', 'subgroup10', 4),
(54, '2015-12-27 12:00:00', 'group9', 'subgroup10', 18),
(55, '2015-12-28 06:00:00', 'group9', 'subgroup10', 4),
(56, '2015-12-28 12:00:00', 'group9', 'subgroup10', 30),
(57, '2015-12-29 06:00:00', 'group9', 'subgroup10', 56),
(58, '2015-12-29 12:00:00', 'group9', 'subgroup10', 53),
(59, '2015-12-30 06:00:00', 'group9', 'subgroup10', 33),
(60, '2015-12-31 12:00:00', 'group9', 'subgroup10', 8)
;
查询优化:
SELECT * FROM
(
SELECT t3.company,t3.region,t3.day, t3.day_stat,COUNT(*) as rank
FROM
(
SELECT t2.company,t2.region,DAY(t1.date_time) as day,SUM(t1.stat) as day_stat
FROM schema1.table_stats t1
INNER JOIN table_companies t2
ON t1.group=t2.group AND t1.subgroup=t2.subgroup
WHERE
MONTH(t1.date_time)=12 AND
YEAR(t1.date_time)=2015
group by t2.company,t2.region,DAY(t1.date_time)
ORDER BY t2.company,t2.region,day_stat DESC
) t3
JOIN
(
SELECT t2.company,t2.region,DAY(t1.date_time) as day,SUM(t1.stat) as day_stat
FROM schema1.table_stats t1
INNER JOIN table_companies t2
ON t1.group=t2.group AND t1.subgroup=t2.subgroup
WHERE
MONTH(t1.date_time)=12 AND
YEAR(t1.date_time)=2015
group by t2.company,t2.region,DAY(t1.date_time)
ORDER BY t2.company,t2.region,day_stat DESC
) t4
ON
t4.day_stat >= t3.day_stat AND
t4.company = t3.company AND
t4.region = t3.region
GROUP BY t3.company,t3.region,t3.day_stat
ORDER BY t3.company,t3.region,rank
) t5
WHERE t5.rank<=20
;
查询摘要:从两个最深的子查询开始,通过加入这两个表,按公司,地区和日期对stat
进行分组和汇总。 这也是限制月份和年份的地方。 然后它将这个结果加入到它自己的副本中以便能够生成排名。 每个子组的最后选择限制结果为前20位。
预期结果:
作为SQL声明呈现的道歉
INSERT INTO results
(`company`, `region`, `day`, `day_stat`, `rank`)
VALUES
('company3', 'region6', 7, 159, 1),
('company3', 'region6', 22, 147, 2),
('company3', 'region6', 23, 138, 3),
('company3', 'region6', 17, 130, 4),
('company3', 'region6', 14, 125, 5),
('company3', 'region6', 25, 116, 6),
('company3', 'region6', 29, 109, 7),
('company3', 'region6', 16, 108, 8),
('company3', 'region6', 9, 104, 9),
('company3', 'region6', 12, 102, 10),
('company3', 'region6', 19, 100, 11),
('company3', 'region6', 24, 91, 12),
('company3', 'region6', 10, 89, 13),
('company3', 'region6', 4, 86, 14),
('company3', 'region6', 18, 80, 15),
('company3', 'region6', 5, 77, 16),
('company3', 'region6', 21, 75, 17),
('company3', 'region6', 26, 74, 18),
('company3', 'region6', 20, 61, 19),
('company3', 'region6', 8, 59, 20)
;
tl;博士:对长期职位道歉。 要求优化http://sqlfiddle.com/#!9/29a7b/1。
我所做的修改:
group,subgroup
table_companies
表中添加了一个复合索引 group, subgroup
table_stats
表中添加了一个复合索引 修改的查询:
SELECT
C.company,
C.region,
DAY(S.date_time) day,
SUM(S.stat) day_stat
FROM table_companies C
INNER JOIN table_stats S
ON C.`group` = S.`group` AND C.subgroup = S.subgroup
WHERE MONTH(S.date_time) = 12 AND YEAR(S.date_time) = 2015
GROUP BY C.company, C.region, DAY(S.date_time)
ORDER BY day_stat DESC
LIMIT 20;
工作演示
结果集中没有等级列。 由于结果按降序排序,因此您可以将结果集中行的位置隐式视为排名。 尽管如此,如果你真的需要排名列,那么这里是一个工作演示
综合指数( table_companies)
:
ALTER TABLE `table_companies` ADD INDEX `idx_table_compnaies_group_subgroup` (
`group`,
`subgroup`
);
组合索引( table_stats)
:
ALTER TABLE `table_stats` ADD INDEX `idx_table_stats_group_subgroup` (
`group`,
`subgroup`
);
解释结果:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE S ALL idx_table_compnaies_group_subgroup 60 Using where; Using temporary; Using filesort
1 SIMPLE C ref idx_table_companies_group_subgroup idx_table_companies_group_subgroup 57 schema1.S.group,schema1.S.subgroup 1 Using index condition
好消息是MySQL可以使用这些indexes
(因为这些indexes
是可能的 )。 虽然它显示ALL
类型为table_companies
。 我只能说这是一小组数据。 你不能根据一小组数据来判断性能。
更多:
我猜你在这些表中有主键。 如果你没有任何创建。
编辑:
SELECT
C.company,
C.region,
tt.day,
tt.total AS day_stat,
tt.rank
FROM table_companies C
INNER JOIN
(
SELECT
t.*,
IF(t.businessUnit = @sameBusinessUnit, @rn := @rn + 1, @rn := 1) AS rank,
@sameBusinessUnit := t.businessUnit
FROM
(
SELECT
S1.`group`,
S1.subgroup,
CONCAT(S1.`group`,S1.subgroup) AS businessUnit,
DAY(S1.date_time) AS day,
SUM(S1.stat) total
FROM table_stats S1
GROUP BY S1.group,S1.subgroup,DAY(S1.date_time)
ORDER BY total DESC
)AS t
CROSS JOIN (SELECT @rn := 1, @sameBusinessUnit := '') var
) AS tt
ON C.`group`=tt.`group` AND C.subgroup = tt.subgroup
WHERE tt.rank <= 20
ORDER BY tt.`group`,tt.`subgroup`,tt.rank;
工作演示(2.0版)
只需包含一个用于组的索引,以便连接变得更加高效
CREATE TABLE table_companies
(`pk_id` int, `company` varchar(8),
`region` varchar(8), `group` varchar(7), `subgroup` varchar(10),
PRIMARY KEY (`pk_id`),
UNIQUE KEY `pk_id_id_UNIQUE` (`pk_id`),
INDEX idx_group (`group`, `subgroup`)
)
;
链接地址: http://www.djcxy.com/p/62459.html
上一篇: Top 20 Group Ranking Query
下一篇: Query optimization in MYSQL for a query with sub queries?