Find the one matching set in a number of sets
I have a table (@t1) that contains a number of sets. I want to find a perfect match for @t2 in @t1.
In this example, the desired result is 1.
(Set 1 matches perfectly, set 2 contains three elements while @t2 only contains two elements, set 3 contains less elements than @t2, set 4 contains NULL elements which are not allowed in @t2 and set 5 contains the right number of elements but one of the elements is not equal.)
DECLARE @t1 TABLE (id INT, data INT);
DECLARE @t2 TABLE (data INT PRIMARY KEY);
INSERT INTO @t1 (id, data)
VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 2),
(2, 3),
(3, 1),
(4, NULL),
(4, NULL),
(5, 1),
(5, 3);
INSERT @t2 (data)
VALUES
(1),
(2);
I have a query that may be getting the job done, but it looks somehow wretched too me.
WITH t1 AS
(
SELECT id, data
FROM @t1
WHERE data IS NOT NULL
),
t1_count AS
(
SELECT id, RCount = COUNT(*)
FROM @t1
WHERE data IS NOT NULL
GROUP BY id
)
SELECT t1.id
FROM t1
JOIN t1_count ON t1.id = t1_count.id
FULL JOIN @t2 t2 ON t1.data = t2.data
WHERE t1_count.RCount = (SELECT RCount = COUNT(*) FROM @t2)
GROUP BY t1.id
HAVING COUNT(t1.data) = COUNT(t2.data);
EDIT (GarethD's comment):
WITH t1 AS
(
SELECT
id,
data,
RCount = COUNT(*) OVER(PARTITION BY id)
FROM @t1
WHERE data IS NOT NULL
)
SELECT t1.id
FROM t1
FULL JOIN @t2 t2 ON t1.data = t2.data
WHERE t1.RCount = (SELECT RCount = COUNT(*) FROM @t2)
GROUP BY t1.id
HAVING COUNT(t1.data) = COUNT(t2.data);
What you want is called Exact Relational Division . Unfortunately, SQL Server has no native operator for this, but it is a well-documented problem. One possible solution (idea taken from an article by Joe Celko) is to compare counts, similar to what you are already doing:
SELECT t1.id
FROM @t1 AS t1 LEFT JOIN @t2 AS t2 ON t1.data = t2.data
GROUP BY t1.id
HAVING COUNT(t1.data) = (SELECT COUNT(data) FROM @t2)
AND COUNT(t2.data) = (SELECT COUNT(data) FROM @t2);
Note that both HAVING
comparisons are necessary:
t2.data
would be NULL via the LEFT JOIN. Recall that COUNT(x) only counts non-null values of x). One way to solve this is to concatenate the values for each ID
and the values in the second table and compare them. You can apply ordering of the concatenation as well.
For example, the following code will concatenate the values from the second table:
DECLARE @Test VARCHAR(MAX) = (
SELECT ',' + data
FROM @t2
ORDER BY data
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
SELECT @test -- 12
and the following will do this for the first table:
SELECT id
,(
SELECT ',' + data
FROM @t1
WHERE id = t1.id
ORDER BY data
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)')
FROM @t1 t1
GROUP BY id
You can filter the values easily with where
clause:
SELECT *
FROM
(
SELECT id
,(
SELECT ',' + data
FROM @t1
WHERE id = t1.id
ORDER BY data
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)') data
FROM @t1 t1
GROUP BY id
) DS
WHERE data = @test
Concatenating values in T-SQL
does not look nice
, but you can apply an aggregation concatenation function using SQL CLR
(check this article) and you will have something like this then:
DECLARE @Test VARCHAR(MAX);
SELECT @Test = [dbo].[Concatenate] (data)
FROM @t2
SELECT id
FROM @t1
GROUP BY id
HAVING [dbo].[Concatenate] (data) = @Test
Maybe this one seems simpler for you?
SELECT dat.id
FROM (
SELECT t1.id as id, t2.data as data2, sourcecount.cnt as scnt, dataCount.cnt as dcnt, COUNT(*) OVER(PARTITION BY t1.id) as mcnt
FROM @t1 as t1
INNER JOIN (SELECT t.id, COUNT(*) as cnt FROM @t1 as t GROUP BY t.id) as sourceCount
ON t1.id = sourceCount.id
INNER JOIN @t2 as t2
ON t1.data = t2.data
CROSS JOIN (SELECT COUNT(*) as cnt FROM @t2) as dataCount
) as dat
WHERE dat.scnt = dat.dcnt AND dat.mcnt = dat.dcnt
GROUP BY dat.id
Which produces by the way the same execution plan as yours, but is maybe a bit more readable.
Best regards, Ionic
链接地址: http://www.djcxy.com/p/85672.html上一篇: 如何使用Drive.API从Google Drive下载文件?
下一篇: 找到多个组中的一个匹配组