Create a query to fill the gaps in a table due to bad data
I have a table with the following schema:
DateTime [Creation Date] PK
int [Data]
The column data has values coming from a sensor, something like this:
123
225
354
578
0
2151
2331
0
2555
2678
As you can see the value always increments.
Due to a problem in the sensor, we occasionally get a 0 between valid values. This causes several problems when we try to use the data, so we want to fill these 0 gaps with something. Ideally we would use the average of the previous and the following value; if that is not possible, we want to repeat the previous value.
Is this something doable with just a query?
Thanks in advance.
Maybe not the most efficient one, but should work:
WITH cte
     AS (SELECT [Creation Date],
                Data,
                rn = Row_number() OVER (ORDER BY [Creation Date])
         FROM   dbo.Table)
UPDATE cte
SET    Data = ( ( (SELECT c2.Data
                   FROM   cte c2
                   WHERE  c2.rn = cte.rn - 1)
                  + (SELECT c2.Data
                     FROM   cte c2
                     WHERE  c2.rn = cte.rn + 1) ) / 2 )
WHERE  Data = 0;
I use Row_Number in a CTE to get consecutive numbers ordered by Creation Date. Each row with Data = 0 is then replaced by the average of the rows numbered rn - 1 and rn + 1, i.e. its previous and next value.
Here's a demo with a similar schema (I have used an int instead of the datetime).
Update
Nice one, but it does not handle gaps with multiple 0s.
Good catch, here's the modified SQL, which takes that into account by looking for the nearest non-zero row on each side:
WITH cte
     AS (SELECT [Creation Date],
                Data,
                rn = Row_number() OVER (ORDER BY [Creation Date])
         FROM   dbo.Table)
UPDATE cte
SET    Data = ( ( (SELECT c2.Data
                   FROM   cte c2
                   WHERE  c2.rn = (SELECT MAX(rn)
                                   FROM   cte c3
                                   WHERE  c3.rn < cte.rn
                                          AND c3.Data <> 0))
                  + (SELECT c2.Data
                     FROM   cte c2
                     WHERE  c2.rn = (SELECT MIN(rn)
                                     FROM   cte c3
                                     WHERE  c3.rn > cte.rn
                                            AND c3.Data <> 0)) ) / 2 )
WHERE  Data = 0;
Demo (with consecutive zeros on 5,6)
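For readers who want to check the fill rule outside the database, here is a minimal Python sketch of the same idea: each 0 becomes the average of the nearest non-zero readings on either side. It also adds the fallback the question asks for (repeat the previous value when there is no later non-zero reading), which the queries above do not cover: in that case the "next" subquery returns no row, so the sum is NULL. The function name fill_zero_gaps and the choice to leave a leading 0 unchanged are assumptions made for illustration, not part of the original answer.

```python
from typing import List, Optional


def fill_zero_gaps(values: List[int]) -> List[int]:
    """Replace each 0 with the average of the nearest non-zero readings
    on either side; repeat the previous reading when no later one exists."""
    n = len(values)

    # next_valid[i] holds the nearest non-zero reading after position i.
    next_valid: List[Optional[int]] = [None] * n
    upcoming: Optional[int] = None
    for i in range(n - 1, -1, -1):
        next_valid[i] = upcoming
        if values[i] != 0:
            upcoming = values[i]

    filled: List[int] = []
    previous: Optional[int] = None  # nearest non-zero reading seen so far
    for i, value in enumerate(values):
        following = next_valid[i]
        if value != 0:
            previous = value
            filled.append(value)
        elif previous is not None and following is not None:
            # Integer division, matching int / int in T-SQL for positive values.
            filled.append((previous + following) // 2)
        elif previous is not None:
            # No later valid reading: repeat the previous one.
            filled.append(previous)
        else:
            # Assumption: a leading 0 has nothing to base a fill on, so keep it.
            filled.append(value)
    return filled


if __name__ == "__main__":
    readings = [123, 225, 354, 578, 0, 2151, 2331, 0, 2555, 2678]
    print(fill_zero_gaps(readings))
```

As in the UPDATE statement, the averages are computed from the original readings, so every 0 in a run of consecutive zeros receives the same value.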
I have another variation:
SELECT
    BadDate,
    T1.Data AS PrevData,
    T2.Data AS NextData,
    (T1.Data + T2.Data) / 2 AS AvgValue
FROM
    (
        SELECT
            T1.CreationDate AS BadDate,
            MAX(T2.CreationDate) AS PrevDate,
            MIN(T3.CreationDate) AS NextDate
        FROM
            TestData T1,
            TestData T2,
            TestData T3
        WHERE
            T1.Data = 0
            AND T2.Data <> 0
            AND T2.CreationDate < T1.CreationDate
            AND T3.Data <> 0
            AND T3.CreationDate > T1.CreationDate
        GROUP BY T1.CreationDate
    ) DateData
    INNER JOIN TestData T1
        ON DateData.PrevDate = T1.CreationDate
    INNER JOIN TestData T2
        ON DateData.NextDate = T2.CreationDate
If you are not worried about getting the average, this method replaces each 0 with the previous value plus 1.
Also, please note that I am not sure whether this method has any issues (other than updating all records); I am showing it merely as a different and simple approach.
declare @new int = 1
update mytable
set @new = val = case when val = 0 then @new + 1 else val end
Fiddle demo
| D | VAL |
---------------------
| 2013-01-01 | 123 |
| 2013-01-02 | 225 |
| 2013-01-03 | 354 |
| 2013-01-04 | 578 |
| 2013-01-05 | 579 |--Updated
| 2013-01-06 | 2151 |
| 2013-01-07 | 2331 |
| 2013-01-08 | 2332 |--Updated
| 2013-01-09 | 2555 |
| 2013-01-10 | 2678 |
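To make the running-variable trick easier to follow, here is a small Python sketch of what the single UPDATE statement does row by row: the variable carries the last written value forward, and a 0 is replaced by that value plus 1. The function name bump_zeros is hypothetical, and the sketch assumes the rows are visited in date order, which the UPDATE statement itself does not state.

```python
from typing import List


def bump_zeros(values: List[int], start: int = 1) -> List[int]:
    """Mimic `set @new = val = case when val = 0 then @new + 1 else val end`."""
    running = start  # plays the role of @new
    result: List[int] = []
    for val in values:
        # A 0 becomes the last written value plus 1; anything else is kept.
        running = running + 1 if val == 0 else val
        result.append(running)
    return result


if __name__ == "__main__":
    readings = [123, 225, 354, 578, 0, 2151, 2331, 0, 2555, 2678]
    print(bump_zeros(readings))
```

With the sample readings this yields 579 and 2332 in place of the two zeros, matching the rows marked as updated in the table above.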