Improving PostgreSQL query performance
This query is very slow when I run it on my server, and I can't work out why. Can anyone help me figure it out?
Query:
SELECT
"t_dat"."t_year" AS "c0",
"t_dat"."t_month" AS "c1",
"t_dat"."t_week" AS "c2",
"t_dat"."t_day" AS "c3",
"t_purs"."p_id" AS "c4",
sum("t_purs"."days") AS "m0",
sum("t_purs"."timecreated") AS "m1"
FROM "t_dat", "t_purs"
WHERE "t_purs"."created" = "t_dat"."t_key"
AND "t_dat"."t_year" = 2013
AND "t_dat"."t_month" = 3
AND "t_dat"."t_week" = 9
AND "t_dat"."t_day" IN (1,2)
AND "t_purs"."p_id" IN (
'4','15','18','19','20','29',
'31','35','46','56','72','78')
GROUP BY
"t_dat"."t_year",
"t_dat"."t_month",
"t_dat"."t_week",
"t_dat"."t_day",
"t_purs"."p_id"
EXPLAIN ANALYZE:
HashAggregate  (cost=12252.04..12252.04 rows=1 width=28) (actual time=10212.374..10212.384 rows=10 loops=1)
  ->  Nested Loop  (cost=0.00..12252.03 rows=1 width=28) (actual time=3016.006..10212.249 rows=14 loops=1)
        Join Filter: (t_dat.t_key = t_purs.created)
        ->  Seq Scan on t_dat  (cost=0.00..129.90 rows=1 width=20) (actual time=0.745..2.040 rows=48 loops=1)
              Filter: ((t_day = ANY ('{1,2}'::integer[])) AND (t_year = 2013) AND (t_month = 3) AND (t_week = 9))
        ->  Seq Scan on t_purs  (cost=0.00..12087.49 rows=9900 width=16) (actual time=0.018..201.630 rows=14014 loops=48)
              Filter: (p_id = ANY ('{4,15,18,19,20,29,31,35,46,56,72,78}'::integer[]))
Total runtime: 10212.470 ms
It's hard to say what you're missing, but if I were you, I'd make sure the following index exists on t_dat:
CREATE INDEX t_dat_id_date_idx
ON t_dat (t_key, t_year, t_month, t_week, t_day);
For t_purs, create this index:
CREATE INDEX t_purs_created_p_id_idx
ON t_purs (created, p_id);
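Once the indexes exist, it is probably worth refreshing the planner statistics and looking at the plan again. The posted plan estimates rows=1 for the scan on t_dat while 48 rows actually come back, which might point to stale statistics or to the four correlated filter columns; that is a guess from the plan, not something I can confirm from here. A minimal sketch, using the same filters as the question:

```sql
-- Refresh planner statistics (assumption: they may be out of date).
ANALYZE t_dat;
ANALYZE t_purs;

-- Re-run the query under EXPLAIN ANALYZE and compare against the posted plan.
-- The two Seq Scan nodes under the Nested Loop are what we hope to see replaced.
EXPLAIN ANALYZE
SELECT d.t_year, d.t_month, d.t_week, d.t_day, p.p_id,
       sum(p.days)        AS m0,
       sum(p.timecreated) AS m1
FROM   t_dat  d
JOIN   t_purs p ON p.created = d.t_key
WHERE  d.t_year  = 2013
AND    d.t_month = 3
AND    d.t_week  = 9
AND    d.t_day  IN (1, 2)
AND    p.p_id   IN (4, 15, 18, 19, 20, 29, 31, 35, 46, 56, 72, 78)
GROUP  BY d.t_year, d.t_month, d.t_week, d.t_day, p.p_id;
```

In the posted plan the scan on t_purs runs 48 times at roughly 200 ms each, which accounts for nearly all of the 10 seconds, so that node is the one to watch.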
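Consider using a single column in the table:
t_date date
instead of (t_year, t_month, t_week, t_day). The data type date occupies 4 bytes. That would shrink your table a bit, make the index smaller and faster, and make grouping much easier.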
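Adding the column could go roughly like this. It is only a sketch: it assumes t_year, t_month and t_day are integers that always form a valid calendar date (with t_day being the day of the month), and it needs PostgreSQL 9.4 or later for make_date().

```sql
-- Add the new date column; existing rows start out as NULL.
ALTER TABLE t_dat ADD COLUMN t_date date;

-- Fill it from the existing parts.
-- make_date(year, month, day) raises an error if the parts are not a valid date.
UPDATE t_dat
SET    t_date = make_date(t_year, t_month, t_day);

-- Sanity check before relying on the column: this should return 0.
SELECT count(*) AS rows_without_date
FROM   t_dat
WHERE  t_date IS NULL;
```

The old columns can stay in place until everything that reads them has been switched over.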
Year, month, week and day can be extracted from a date easily and quickly with extract(). Your query could then look like this, and would be faster:
SELECT extract (year FROM t_date) AS c0
,extract (month FROM t_date) AS c1
,extract (week FROM t_date) AS c2
,extract (day FROM t_date) AS c3
,p.p_id AS c4
,sum(p.days) AS m0
,sum(p.timecreated) AS m1
FROM t_dat d
JOIN t_purs p ON p.created = d.t_key
WHERE d.t_date IN ('2013-03-01'::date, '2013-03-02'::date)
AND p.p_id IN (4,15,18,19,20,29,31,35,46,56,72,78)
GROUP BY d.t_date, p.p_id;
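To see what extract() returns for one of the dates in the query, you can run it against a literal. Note that the week field follows ISO 8601 week numbering, so it is worth checking that it agrees with whatever your t_week column currently stores.

```sql
-- extract() on a literal date; no tables needed.
SELECT extract(year  FROM date '2013-03-01') AS c0,   -- 2013
       extract(month FROM date '2013-03-01') AS c1,   -- 3
       extract(week  FROM date '2013-03-01') AS c2,   -- 9 (ISO 8601 week)
       extract(day   FROM date '2013-03-01') AS c3;   -- 1
```

For this date the ISO week is 9, which matches the t_week = 9 filter in the original query.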
More important for performance is the index, which would then simply be:
CREATE INDEX t_dat_date_idx ON t_dat (t_key, t_date);
Or, depending on data distribution:
CREATE INDEX t_dat_date_idx ON t_dat (t_date, t_key);
The order of the columns matters. You could even create both, as long as you give them different names.
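If you are unsure which column order suits your data, one option is to create both variants for a while and see which one the planner actually uses. The index names below are made up for this example, and the approach assumes the statistics collector is enabled (it is by default).

```sql
-- Both column orders, under distinct (hypothetical) names.
CREATE INDEX t_dat_key_date_idx ON t_dat (t_key, t_date);
CREATE INDEX t_dat_date_key_idx ON t_dat (t_date, t_key);

-- After running your usual workload for a while,
-- check how often each index on t_dat has been scanned.
SELECT indexrelname, idx_scan
FROM   pg_stat_user_indexes
WHERE  relname = 't_dat'
ORDER  BY idx_scan DESC;
```

An index whose idx_scan stays at 0 is a candidate for DROP INDEX, since every extra index adds cost to writes.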
Link: http://www.djcxy.com/p/86069.html