排名前20位的查询

我正在创建一个报告结构,我需要为每个独特的公司 - 区域输出前20天的汇总统计信息。 我已经完成了这项任务,但觉得我的代码过于复杂,我正在请求帮助优化它。

我有2个表参与这个过程。 第一个列出了所有可能的公司 - 区域 - 组 - 子 - 组。 第二个是小组 - 小组的小时统计。

SQL小提琴链接:http://sqlfiddle.com/#!9/29a7b/1
注意 :当前在我的SQL小提琴上SELECT command denied to user '<user>'@'<ip>' for table 'table_stats'错误的SELECT command denied to user '<user>'@'<ip>' for table 'table_stats'SELECT command denied to user '<user>'@'<ip>' for table 'table_stats'会感谢帮助解决此问题。

table_companies声明和虚拟数据:

CREATE TABLE `table_companies` (
  `pk_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `company` varchar(45) NOT NULL,
  `region` varchar(45) NOT NULL,
  `group` varchar(45) NOT NULL,
  `subgroup` varchar(45) NOT NULL,
  PRIMARY KEY (`pk_id`),
  UNIQUE KEY `pk_id_id_UNIQUE` (`pk_id`)
);

INSERT INTO table_companies
    (`pk_id`, `company`, `region`, `group`, `subgroup`)
VALUES
    (1, 'company1', 'region1', 'group1', 'subgroup1'),
    (2, 'company1', 'region1', 'group1', 'subgroup2'),
    (3, 'company1', 'region2', 'group2', 'subgroup3'),
    (4, 'company1', 'region3', 'group3', 'subgroup4'),
    (5, 'company2', 'region1', 'group4', 'subgroup5'),
    (6, 'company2', 'region3', 'group5', 'subgroup6'),
    (7, 'company2', 'region3', 'group6', 'subgroup7'),
    (8, 'company2', 'region4', 'group7', 'subgroup8'),
    (9, 'company2', 'region5', 'group8', 'subgroup9'),
    (10, 'company3', 'region6', 'group9', 'subgroup10'),
    (11, 'company3', 'region7', 'group10', 'subgroup11'),
    (12, 'company3', 'region8', 'group11', 'subgroup12'),
    (13, 'company4', 'region9', 'group12', 'subgroup13'),
    (14, 'company4', 'region10', 'group13', 'subgroup14'),
    (15, 'company5', 'region11', 'group14', 'subgroup15'),
    (16, 'company5', 'region12', 'group15', 'subgroup16')
;

table_stats声明:
简化为仅包含每天只有1组 - 子组的几个小时。

CREATE TABLE `table_stats` (
  `pk_id` int(10) unsigned NOT NULL,
  `date_time` datetime NOT NULL,
  `group` varchar(45) NOT NULL,
  `subgroup` varchar(45) NOT NULL,
  `stat` int(10) unsigned NOT NULL,
  PRIMARY KEY (`pk_id`),
  UNIQUE KEY `pk_id_UNIQUE` (`pk_id`),
  UNIQUE KEY `om_unique` (`date_time`,`group`,`subgroup`)
);

INSERT INTO table_stats
    (`pk_id`, `date_time`, `group`, `subgroup`, `stat`)
VALUES
    (1, '2015-12-01 06:00:00', 'group9', 'subgroup10', 14),
    (2, '2015-12-01 12:00:00', 'group9', 'subgroup10', 14),
    (3, '2015-12-02 06:00:00', 'group9', 'subgroup10', 2),
    (4, '2015-12-02 12:00:00', 'group9', 'subgroup10', 51),
    (5, '2015-12-03 06:00:00', 'group9', 'subgroup10', 30),
    (6, '2015-12-03 12:00:00', 'group9', 'subgroup10', 6),
    (7, '2015-12-04 06:00:00', 'group9', 'subgroup10', 9),
    (8, '2015-12-04 12:00:00', 'group9', 'subgroup10', 77),
    (9, '2015-12-05 06:00:00', 'group9', 'subgroup10', 70),
    (10, '2015-12-05 12:00:00', 'group9', 'subgroup10', 7),
    (11, '2015-12-06 06:00:00', 'group9', 'subgroup10', 38),
    (12, '2015-12-06 12:00:00', 'group9', 'subgroup10', 5),
    (13, '2015-12-07 06:00:00', 'group9', 'subgroup10', 86),
    (14, '2015-12-07 12:00:00', 'group9', 'subgroup10', 73),
    (15, '2015-12-08 06:00:00', 'group9', 'subgroup10', 45),
    (16, '2015-12-08 12:00:00', 'group9', 'subgroup10', 14),
    (17, '2015-12-09 06:00:00', 'group9', 'subgroup10', 66),
    (18, '2015-12-09 12:00:00', 'group9', 'subgroup10', 38),
    (19, '2015-12-10 06:00:00', 'group9', 'subgroup10', 12),
    (20, '2015-12-10 12:00:00', 'group9', 'subgroup10', 77),
    (21, '2015-12-11 06:00:00', 'group9', 'subgroup10', 21),
    (22, '2015-12-11 12:00:00', 'group9', 'subgroup10', 18),
    (23, '2015-12-12 06:00:00', 'group9', 'subgroup10', 28),
    (24, '2015-12-12 12:00:00', 'group9', 'subgroup10', 74),
    (25, '2015-12-13 06:00:00', 'group9', 'subgroup10', 20),
    (26, '2015-12-13 12:00:00', 'group9', 'subgroup10', 37),
    (27, '2015-12-14 06:00:00', 'group9', 'subgroup10', 66),
    (28, '2015-12-14 12:00:00', 'group9', 'subgroup10', 59),
    (29, '2015-12-15 06:00:00', 'group9', 'subgroup10', 26),
    (30, '2015-12-15 12:00:00', 'group9', 'subgroup10', 0),
    (31, '2015-12-16 06:00:00', 'group9', 'subgroup10', 77),
    (32, '2015-12-16 12:00:00', 'group9', 'subgroup10', 31),
    (33, '2015-12-17 06:00:00', 'group9', 'subgroup10', 59),
    (34, '2015-12-17 12:00:00', 'group9', 'subgroup10', 71),
    (35, '2015-12-18 06:00:00', 'group9', 'subgroup10', 7),
    (36, '2015-12-18 12:00:00', 'group9', 'subgroup10', 73),
    (37, '2015-12-19 06:00:00', 'group9', 'subgroup10', 72),
    (38, '2015-12-19 12:00:00', 'group9', 'subgroup10', 28),
    (39, '2015-12-20 06:00:00', 'group9', 'subgroup10', 50),
    (40, '2015-12-20 12:00:00', 'group9', 'subgroup10', 11),
    (41, '2015-12-21 06:00:00', 'group9', 'subgroup10', 71),
    (42, '2015-12-21 12:00:00', 'group9', 'subgroup10', 4),
    (43, '2015-12-22 06:00:00', 'group9', 'subgroup10', 78),
    (44, '2015-12-22 12:00:00', 'group9', 'subgroup10', 69),
    (45, '2015-12-23 06:00:00', 'group9', 'subgroup10', 83),
    (46, '2015-12-23 12:00:00', 'group9', 'subgroup10', 55),
    (47, '2015-12-24 06:00:00', 'group9', 'subgroup10', 71),
    (48, '2015-12-24 12:00:00', 'group9', 'subgroup10', 20),
    (49, '2015-12-25 06:00:00', 'group9', 'subgroup10', 90),
    (50, '2015-12-25 12:00:00', 'group9', 'subgroup10', 26),
    (51, '2015-12-26 06:00:00', 'group9', 'subgroup10', 1),
    (52, '2015-12-26 12:00:00', 'group9', 'subgroup10', 73),
    (53, '2015-12-27 06:00:00', 'group9', 'subgroup10', 4),
    (54, '2015-12-27 12:00:00', 'group9', 'subgroup10', 18),
    (55, '2015-12-28 06:00:00', 'group9', 'subgroup10', 4),
    (56, '2015-12-28 12:00:00', 'group9', 'subgroup10', 30),
    (57, '2015-12-29 06:00:00', 'group9', 'subgroup10', 56),
    (58, '2015-12-29 12:00:00', 'group9', 'subgroup10', 53),
    (59, '2015-12-30 06:00:00', 'group9', 'subgroup10', 33),
    (60, '2015-12-31 12:00:00', 'group9', 'subgroup10', 8)
;

查询优化:

SELECT * FROM
    (
    SELECT t3.company,t3.region,t3.day, t3.day_stat,COUNT(*) as rank
    FROM
        (
            SELECT t2.company,t2.region,DAY(t1.date_time) as day,SUM(t1.stat) as day_stat
            FROM schema1.table_stats t1
            INNER JOIN table_companies t2
            ON t1.group=t2.group AND t1.subgroup=t2.subgroup
            WHERE
                MONTH(t1.date_time)=12 AND
                YEAR(t1.date_time)=2015
            group by t2.company,t2.region,DAY(t1.date_time)
            ORDER BY t2.company,t2.region,day_stat DESC
        ) t3
    JOIN
    (
            SELECT t2.company,t2.region,DAY(t1.date_time) as day,SUM(t1.stat) as day_stat
            FROM schema1.table_stats t1
            INNER JOIN table_companies t2
            ON t1.group=t2.group AND t1.subgroup=t2.subgroup
            WHERE
                MONTH(t1.date_time)=12 AND
                YEAR(t1.date_time)=2015
            group by t2.company,t2.region,DAY(t1.date_time)
            ORDER BY t2.company,t2.region,day_stat DESC
        ) t4
    ON
        t4.day_stat >= t3.day_stat AND
        t4.company = t3.company AND
        t4.region = t3.region
    GROUP BY t3.company,t3.region,t3.day_stat
    ORDER BY t3.company,t3.region,rank
    ) t5
WHERE t5.rank<=20
;

查询摘要:从两个最深的子查询开始,通过加入这两个表,按公司,地区和日期对stat进行分组和汇总。 这也是限制月份和年份的地方。 然后它将这个结果加入到它自己的副本中以便能够生成排名。 每个子组的最后选择限制结果为前20位。

预期结果:
作为SQL声明呈现的道歉

INSERT INTO results
    (`company`, `region`, `day`, `day_stat`, `rank`)
VALUES
    ('company3', 'region6', 7, 159, 1),
    ('company3', 'region6', 22, 147, 2),
    ('company3', 'region6', 23, 138, 3),
    ('company3', 'region6', 17, 130, 4),
    ('company3', 'region6', 14, 125, 5),
    ('company3', 'region6', 25, 116, 6),
    ('company3', 'region6', 29, 109, 7),
    ('company3', 'region6', 16, 108, 8),
    ('company3', 'region6', 9, 104, 9),
    ('company3', 'region6', 12, 102, 10),
    ('company3', 'region6', 19, 100, 11),
    ('company3', 'region6', 24, 91, 12),
    ('company3', 'region6', 10, 89, 13),
    ('company3', 'region6', 4, 86, 14),
    ('company3', 'region6', 18, 80, 15),
    ('company3', 'region6', 5, 77, 16),
    ('company3', 'region6', 21, 75, 17),
    ('company3', 'region6', 26, 74, 18),
    ('company3', 'region6', 20, 61, 19),
    ('company3', 'region6', 8, 59, 20)
;

tl;博士:对长期职位道歉。 要求优化http://sqlfiddle.com/#!9/29a7b/1。


我所做的修改:

  • 完全修改您的查询
  • group,subgroup table_companies表中添加了一个复合索引
  • group, subgroup table_stats表中添加了一个复合索引

  • 修改的查询:

    SELECT 
        C.company,
        C.region,
        DAY(S.date_time) day,
        SUM(S.stat) day_stat
    FROM table_companies C
    INNER JOIN table_stats S
    ON C.`group` = S.`group` AND C.subgroup = S.subgroup
    WHERE MONTH(S.date_time) = 12 AND YEAR(S.date_time) = 2015
    GROUP BY C.company, C.region, DAY(S.date_time)
    ORDER BY day_stat DESC
    LIMIT 20;
    

    工作演示

    结果集中没有等级列。 由于结果按降序排序,因此您可以将结果集中行的位置隐式视为排名。 尽管如此,如果你真的需要排名列,那么这里是一个工作演示

    综合指数( table_companies)

    ALTER TABLE `table_companies` ADD INDEX `idx_table_compnaies_group_subgroup` (
        `group`,
        `subgroup`
    );
    

    组合索引( table_stats)

    ALTER TABLE `table_stats` ADD INDEX `idx_table_stats_group_subgroup` (
    `group`,
    `subgroup`
    );
    

    解释结果:

    id  select_type table   type    possible_keys   key key_len ref rows    Extra
    1   SIMPLE  S   ALL idx_table_compnaies_group_subgroup              60  Using where; Using temporary; Using filesort
    1   SIMPLE  C   ref idx_table_companies_group_subgroup  idx_table_companies_group_subgroup  57  schema1.S.group,schema1.S.subgroup  1   Using index condition
    

    好消息是MySQL可以使用这些indexes (因为这些indexes可能的 )。 虽然它显示ALL类型为table_companies 。 我只能说这是一小组数据。 你不能根据一小组数据来判断性能。

    更多:

    我猜你在这些表中有主键。 如果你没有任何创建。

    编辑:

    SELECT 
        C.company,
        C.region,
        tt.day,
        tt.total AS day_stat,
        tt.rank
    FROM table_companies C 
    INNER JOIN 
    (
    SELECT 
    t.*,
    IF(t.businessUnit = @sameBusinessUnit, @rn := @rn + 1, @rn := 1) AS rank,
    @sameBusinessUnit := t.businessUnit
    FROM 
    (
        SELECT 
           S1.`group`,
           S1.subgroup,
           CONCAT(S1.`group`,S1.subgroup) AS businessUnit,
           DAY(S1.date_time) AS day,
           SUM(S1.stat) total
        FROM table_stats S1
        GROUP BY S1.group,S1.subgroup,DAY(S1.date_time)
        ORDER BY total DESC
    )AS t
    CROSS JOIN (SELECT @rn := 1, @sameBusinessUnit := '') var
    ) AS tt
    ON C.`group`=tt.`group` AND C.subgroup = tt.subgroup
    WHERE tt.rank <= 20
    ORDER BY tt.`group`,tt.`subgroup`,tt.rank;
    

    工作演示(2.0版)


    只需包含一个用于组的索引,以便连接变得更加高效

    CREATE TABLE table_companies
        (`pk_id` int, `company` varchar(8), 
         `region` varchar(8), `group` varchar(7), `subgroup` varchar(10),
         PRIMARY KEY (`pk_id`),
         UNIQUE KEY `pk_id_id_UNIQUE` (`pk_id`),  
    
         INDEX idx_group (`group`, `subgroup`)
        )
    ;
    
    链接地址: http://www.djcxy.com/p/62459.html

    上一篇: Top 20 Group Ranking Query

    下一篇: Query optimization in MYSQL for a query with sub queries?