Update statement causes fields to be updated with NULL or maximum value
If you had to pick one of the two following queries, which would you choose and why:
UPDATE `table1` AS e
SET e.points = e.points+(
SELECT points FROM `table2` AS ep WHERE e.cardnbr=ep.cardnbr);
or:
UPDATE `table1` AS e
INNER JOIN
(
SELECT points, cardnbr
FROM `table2`
) AS ep ON (e.cardnbr=ep.cardnbr)
SET e.points = e.points+ep.points;
Tables' definitions:
CREATE TABLE `table1` (
`cardnbr` int(10) DEFAULT NULL,
`name` varchar(50) DEFAULT NULL,
`points` decimal(7,3) DEFAULT '0.000',
`email` varchar(50) NOT NULL DEFAULT 'user@company.com',
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=25205 DEFAULT CHARSET=latin1$$
CREATE TABLE `table2` (
`cardnbr` int(10) DEFAULT NULL,
`id` int(11) NOT NULL AUTO_INCREMENT,
`points` decimal(7,3) DEFAULT '0.000',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci$$
UPDATE: BOTH are causing problems the first is causing non matched rows to update into NULL. The second is causing them to update into the max value 999.9999 (decimal 7,3). PS the cardnbr field is NOT a key
First, the two statements are not equivalent, as you found out yourself. The first one will update all rows of table1
, putting NULL
values for those rows that have no related rows in table2
.
So the second query looks better because it doesn't update all rows of table1
. It could be written in a more simpel way, like this though:
UPDATE table1 AS e
INNER JOIN table2 AS ep
ON e.cardnbr = ep.cardnbr
SET e.points = e.points + ep.points ;
So, the 2nd query would be the best to use, if cardnbr
was the primary key of table2
. Is it?
If it isn't, then which values from table2 should be used for the update of table1 (added to points
)? All of them? You could use this:
UPDATE table1 AS e
INNER JOIN
( SELECT SUM(points) AS points, cardnbr
FROM table2
GROUP BY cardnbr
) AS ep ON e.cardnbr = ep.cardnbr
SET
e.points = e.points + ep.points ;
Just one of them? That would require some other derived table, depending on what you want.
I prefer the second one..reason for that is
When using JOIN
the databse can create an execution plan that is better for your query and save time whereas subqueries (like your first one ) will run all the queries and load all the datas which may take time.
i think subqueries is easy to read but performance wise JOIN
is faster...
上一篇: R中的多核和data.table
下一篇: 更新语句会使字段更新为NULL或最大值