嵌套不在中提高PostgreSQL中的查询性能

我试图将应用程序从MySQL 5.6移植到PostgreSQL 9.2原始应用程序使用我已经设法至少运行的视图,但查询时间太可怕了。

我想知道PostgreSQL中最好的方法来优化“不在”查询。

我的第一个想法是创建一个临时表,但由于这是一个视图,我不认为这是一个选项。

create VIEW ready_ports AS 
    SELECT ports.id AS id, 
           ports.run AS run,
           ports.name AS name, 
           ports.pkgname AS pkgname,
           ports.version AS version,
           ports.description AS description,
           ports.license AS license,
           ports.www AS www, 
           ports.status AS status, 
           ports.updated AS updated,
          (SELECT count(0) AS COUNT 
           FROM depends 
           WHERE depends.dependency = ports.id) AS priority 
    FROM ports 
    WHERE (ports.status = 'untested' and 
          (not(ports.id in 
                 (SELECT locks.port AS port 
                  FROM locks 
                  WHERE locks.port = ports.id)
               )
          ) and 
          (
             (not(ports.id in (SELECT depends.port AS port 
                               FROM depends 
                               WHERE depends.port = ports.id))) or 
             (not(ports.id in 
                  (SELECT depends.port AS port 
                   FROM depends
                   WHERE ((not(depends.dependency in 
                     (SELECT ports.id AS dep_id 
                      FROM ports   
                      WHERE (ports.id = depends.dependency 
                             and (ports.status = 'pass' 
                                  or ports.status = 'warn')
                             )
                     ))) or 
    depends.dependency in 
    (SELECT locks.port AS port 
     FROM locks 
     WHERE locks.port = ports.id)))))))
ORDER BY priority desc
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Sort  (cost=367498265655.68..367498265763.29 rows=43047 width=136)
   Sort Key: ((SubPlan 1))
   ->  Index Scan using ports_1_idx on ports  (cost=0.00..367498259398.93 rows=43047 width=136)
         Index Cond: ((status)::text = 'untested'::text)
         Filter: ((NOT (SubPlan 2)) AND ((NOT (SubPlan 3)) OR (NOT (SubPlan 6))))
         SubPlan 1
           ->  Aggregate  (cost=9.62..9.63 rows=1 width=0)
                 ->  Index Only Scan using depends_dependency_idx on depends  (cost=0.00..9.47 rows=60 width=0)
                       Index Cond: (dependency = public.ports.id)
         SubPlan 2
           ->  Index Only Scan using locks_port_key on locks  (cost=0.00..8.27 rows=1 width=4)
                 Index Cond: (port = public.ports.id)
         SubPlan 3
           ->  Index Only Scan using depends_pkey on depends  (cost=0.00..8.72 rows=14 width=4)
                 Index Cond: (port = public.ports.id)
         SubPlan 6
           ->  Seq Scan on depends  (cost=8.27..6399946.81 rows=1150079 width=4)
                 Filter: ((NOT (SubPlan 4)) OR (hashed SubPlan 5))
                 SubPlan 4
                   ->  Index Scan using ports_pkey on ports  (cost=0.00..8.31 rows=1 width=4)
                         Index Cond: (id = public.depends.dependency)
                         Filter: (((status)::text = 'pass'::text) OR ((status)::text = 'warn'::text))
                 SubPlan 5
                   ->  Index Only Scan using locks_port_key on locks  (cost=0.00..8.27 rows=1 width=4)
                         Index Cond: (port = public.ports.id)

您可以尝试使用NOT EXISTS和反连接版本,因为NOT IN不能使用很多索引(因为NULL处理问题):

SELECT *
FROM table1
WHERE table1.id NOT IN (SELECT id FROM table2)

-- vs NOT EXISTS

SELECT *
FROM table1
WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.id = table2.id)

-- vs anti-join

SELECT *
FROM table1
LEFT JOIN table2 ON table1.id = table2.id
WHERE table2.id IS NULL

我结束了使用连接和不存在查询的组合来获得最终工作查询。

create VIEW ready_ports AS 
SELECT ports.id AS id, 
       ports.run AS run,
       ports.name AS name, 
       ports.pkgname AS pkgname,
       ports.version AS version,
       ports.description AS description,
       ports.license AS license,
       ports.www AS www, 
       ports.status AS status, 
       ports.updated AS updated,
      (SELECT count(0) AS COUNT 
       FROM depends 
       WHERE depends.dependency = ports.id) AS priority 
FROM ports 
LEFT JOIN locks on locks.port = ports.id
LEFT JOIN depends on depends.port = ports.id
WHERE ports.status = 'untested' and locks.id is null and 
      (depends.port is null or 
         not exists
              (SELECT depends.port AS port 
               FROM depends WHERE ports.id = depends.port and not exists              
                 (SELECT ports.id as dep_id
                  FROM ports   
                  WHERE ports.id = depends.dependency and 
                  (ports.status = 'pass' or ports.status = 'warn'))
                  or 
depends.dependency = locks.port))
ORDER BY priority desc, ports.name asc

您需要使用连接重写查询:

select ...
from ports
left join locks ...
left join depends ...
where criteria

那样的话,你将会制作一套大套装,这是三套的结果,而不是半套套装。

将计数从您的视图中移出也是一个好的选择。 使用单独的查询或加入您的视图来获取该部分。 (视图中的聚合很少是一个好主意,除了在报告中。)

链接地址: http://www.djcxy.com/p/86073.html

上一篇: Improving query performance in PostgreSQL with nested not in

下一篇: Postgres slow query (slow index scan)