Oracle SQL:检测连续跨度的中断

我有下面的表格,我试图检测其跨度中断的产品。

Product     | unit_Cost | price start date |    price end date
--------------------------------------------------------------------------
product 1     15.00         01/01/2011      03/31/2011
product 1     15.00         04/01/2011      06/31/2011
product 1     15.00         07/01/2011      09/31/2011
product 1     15.00         10/01/2011      12/31/2011

product 2     10.00         01/01/2011      12/31/2011

product 3     25.00         01/01/2011      06/31/2011
product 3     25.00         10/01/2011      12/31/2011

所以在这里,我希望它报告产品3,因为我们缺少跨度

07/01/2011 - 09/31/2011

关于如何做到这一点的任何想法?

编辑:Oracle Ver:10g

Create Table Statement

CREATE TABLE Sandbox.TBL_PRODUCT
(
  PRODUCT_ID        VARCHAR2(13 BYTE),   
  PRODUCT           VARCHAR2(64 BYTE),
  UNIT_COST         NUMBER,
  PRICE_START_DATE  DATE,
  PRICE_END_DATE    DATE
)

编辑2开始日期和结束日期不能重叠

编辑3跨度可以是任何两个日期,只要price_end_date> = price_start_date。 由于产品可以在一天内出售,因此包含Equal。


试试这个(使用LEAD分析函数):

SELECT *
  FROM (
                SELECT a.*, LEAD(price_start_date,1,NULL) OVER(PARTITION BY product ORDER BY price_end_date) next_start_date 
         FROM Product a
       )
WHERE (price_end_date + 1)<> next_start_date

安装示例

        CREATE TABLE PRODUCT
          (
            PRODUCT   VARCHAR2(100 BYTE),
            UNIT_COST NUMBER,
            START_DATE DATE,
            END_DATE DATE
          );

        INSERT INTO Product VALUES('product 1','15.00',TO_DATE('01/01/2011','MM/DD/RRRR'),TO_DATE('03/31/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 1','15.00',TO_DATE('04/01/2011','MM/DD/RRRR'),TO_DATE('06/30/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 1','15.00',TO_DATE('07/01/2011','MM/DD/RRRR'),TO_DATE('09/30/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 1','15.00',TO_DATE('10/01/2011','MM/DD/RRRR'),TO_DATE('12/31/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 2','10.00',TO_DATE('01/01/2011','MM/DD/RRRR'),TO_DATE('12/31/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 3','25.00',TO_DATE('01/01/2011','MM/DD/RRRR'),TO_DATE('06/30/2011','MM/DD/RRRR'));
        INSERT INTO Product VALUES('product 3','25.00',TO_DATE('10/01/2011','MM/DD/RRRR'),TO_DATE('12/31/2011','MM/DD/RRRR'));

SELECT *
  FROM (
                SELECT a.*, LEAD(start_date,1,NULL) OVER(PARTITION BY product ORDER BY start_date) next_start_date 
                 FROM Product a
              )
WHERE (end_date + 1)<> next_start_date

编辑 :更新查询以考虑下一个start_date和当前end_date以避免数据分布问题。


你也可以使用这种技术。 它使用内部查询( chronological_record )为TBL_PRODUCT表中的每条记录(每个product start_date上排序的排名)分配一个排名。

WITH
  chronological_record AS
  (
    SELECT
      product,
      unit_cost,
      start_date,
      end_date,
      (DENSE_RANK() OVER (PARTITION BY product ORDER BY start_date))
          AS chronological_order
    FROM
      TBL_PRODUCT
  )

SELECT
  earlier.product,
  (earlier.end_date + 1) AS missing_period_start_date,
  (later.start_date - 1) as missing_period_end_date
FROM
  CHRONOLOGICAL_RECORD earlier
  INNER JOIN
  CHRONOLOGICAL_RECORD later
    ON
        earlier.product = later.product
      AND
        (earlier.chronological_order + 1) = later.chronological_order
WHERE
  (earlier.end_date + 1) <> later.start_date

在你的例子中,子查询( chronological_record )会产生这样的结果:

Product   | unit_Cost | start date | end date   | chronological_order
--------------------------------------------------------------------------
product 1    15.00      01/01/2011   03/31/2011    1
product 1    15.00      04/01/2011   06/31/2011    2
product 1    15.00      07/01/2011   09/31/2011    3
product 1    15.00      10/01/2011   12/31/2011    4

product 2    10.00      01/01/2011   12/31/2011    1

product 3    25.00      01/01/2011   06/31/2011    1
product 3    25.00      10/01/2011   12/31/2011    2

主要查询的INNER JOIN有效地匹配较早的记录和它们的下一个(按时间顺序)记录。


假设您的表称为products ,您的开始日期列命名为s ,您的结束日期列命名为e

create view max_interval as 
select product, 
max(e) - min(s) as max_interval 
from products group by product;


create view total_days as 
select product, 
sum( e - s ) + count(product) - 1 as total_days 
from products group by product  ;

然后这个查询为您提供所有“缺失”跨度的产品:

select a.*, b.*
from max_interval a 
left outer join total_days b 
on (a.product = b.product)
where a.max_interval <> b.total_days;

由于在两个视图中group by是相同的,因此这当然可以组合成单个查询,尽管使解决方案稍微不太清晰:

select product, 
max(e) - min(s) as max_interval, 
sum( e - s ) + count(product) - 1 as total_days 
from products group by product  
having max(e) - min(s) <> sum( e - s ) + count(product) - 1;

但正如Stephanie Page指出的那样,这是一个不成熟的优化; 您不太可能会频繁地扫描连续跨度的中断。

链接地址: http://www.djcxy.com/p/50267.html

上一篇: Oracle SQL: Detecting breaks in continual spans

下一篇: glsl