PostgreSQL: Terribly slow ORDER BY with primary key as the ordering key

I have a model like this

[image: data model diagram]

with the following table sizes:

+------------------+-------------+
| Table            |    Records  |
+------------------+-------------+
| JOB              |         8k  |
| DOCUMENT         |       150k  |
| TRANSLATION_UNIT |      14.5m  |
| TRANSLATION      |      18.3m  |
+------------------+-------------+

Now the following query

select translation.id
from "TRANSLATION" translation
   inner join "TRANSLATION_UNIT" unit
     on translation.fk_id_translation_unit = unit.id
   inner join "DOCUMENT" document
     on unit.fk_id_document = document.id     
where document.fk_id_job = 11698
order by translation.id asc
limit 50 offset 0

takes about 90 seconds to finish. When I remove the ORDER BY and LIMIT clauses, it takes 19.5 seconds. ANALYZE had been run on all tables just before executing the query.

For this particular query, these are the numbers of records satisfying the criteria:

+------------------+-------------+
| Table            |     Records |
+------------------+-------------+
| JOB              |          1  |
| DOCUMENT         |       1200  |
| TRANSLATION_UNIT |    210,000  |
| TRANSLATION      |    210,000  |
+------------------+-------------+

The query plan:

[image: query plan]

The query plan for the modification without ORDER BY and LIMIT is here.

Database parameters:

PostgreSQL 9.2

shared_buffers = 2048MB
effective_cache_size = 4096MB
work_mem = 32MB

Total memory: 32GB
CPU: Intel Xeon X3470 @ 2.93 GHz, 8MB cache

Can anyone see what is wrong with this query?

UPDATE: Query plan for the same query without ORDER BY (but still with the LIMIT clause).


This is a bit too long for a comment. You are comparing apples and oranges when you remove the ORDER BY clause. Without the ORDER BY, the processing part of the query only needs to come up with 50 rows and can stop there.

With the ORDER BY, all the matching rows need to be generated before they are sorted and the top few chosen. How long does the query take if you remove the ORDER BY and the LIMIT clause?

The fact that translation.id is a primary key does not make a difference, because the processing requires going through several joins (which filter the results).
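One way to see where the time goes is to compare the two plans directly. The following is only a sketch, assuming a psql session against the same database. It uses EXPLAIN (ANALYZE, BUFFERS), which is available in PostgreSQL 9.2. Keep in mind that ANALYZE really executes each statement, so each run takes as long as the query itself.

```sql
-- Variant 1: the original query, with ORDER BY and LIMIT.
EXPLAIN (ANALYZE, BUFFERS)
select translation.id
from "TRANSLATION" translation
   inner join "TRANSLATION_UNIT" unit
     on translation.fk_id_translation_unit = unit.id
   inner join "DOCUMENT" document
     on unit.fk_id_document = document.id
where document.fk_id_job = 11698
order by translation.id asc
limit 50 offset 0;

-- Variant 2: same joins and filter, LIMIT only.
-- Without ORDER BY the executor may stop as soon as 50 rows are found.
EXPLAIN (ANALYZE, BUFFERS)
select translation.id
from "TRANSLATION" translation
   inner join "TRANSLATION_UNIT" unit
     on translation.fk_id_translation_unit = unit.id
   inner join "DOCUMENT" document
     on unit.fk_id_document = document.id
where document.fk_id_job = 11698
limit 50 offset 0;
```

In the first plan, check whether a Sort node sits above the joins (all matching rows are produced first), or whether the planner instead walks the primary key index in id order and filters rows through the joins. Either shape can be slow for a selective fk_id_job value.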

EDIT:

I wonder how this would work with a CTE that first builds the joined result set, followed by an outer query that sorts it and fetches the first rows:

with CTE as (
     select translation.id
     from "TRANSLATION" translation
          inner join "TRANSLATION_UNIT" unit
          on translation.fk_id_translation_unit = unit.id
          inner join "DOCUMENT" document
          on unit.fk_id_document = document.id
     where document.fk_id_job = 11698
    )
select *
from CTE
-- the alias "translation" is not visible outside the CTE, so order by the output column
order by id asc
limit 50 offset 0;

Do you have a composite index in place on translation(fk_id_translation_unit, id)? It seems to me that that would help by avoiding the need to access the translation.id via the table.
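If that index does not exist yet, it could be created roughly as follows. This is a sketch rather than a tested recommendation. The index name is made up for illustration, and the table name is quoted because the query in the question uses the upper-case identifier "TRANSLATION". Index-only scans are new in PostgreSQL 9.2 and only skip the table for pages marked all-visible, hence the VACUUM afterwards.

```sql
-- Hypothetical index name; columns as suggested above.
-- CONCURRENTLY avoids blocking writes, but cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY translation_unit_id_idx
    ON "TRANSLATION" (fk_id_translation_unit, id);

-- Update the visibility map and planner statistics so that
-- an index-only scan becomes an option for the planner.
VACUUM ANALYZE "TRANSLATION";
```

Whether the planner actually picks the new index is best confirmed by re-running EXPLAIN on the original query.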
