Arel causing infinite loop on aggregation
I'm having trouble using Arel to aggregate two columns in the same query. When I run this, the whole server freezes for a minute before the Rails dev server crashes. I suspect an infinite loop :).
Maybe I have misunderstood the whole concept of Arel, and I would be grateful if anybody could have a look at it.
The expected result of this query is something like this: [{:user_id => 1, :sum_account_charges => 300, :sum_paid_debts => 1000},...]
a_account_charges = Table(:account_charges)
a_paid_debts = Table(:paid_debts)
a_participants = Table(:expense_accounts_users)
account_charge_sum = a_account_charges
  .where(a_account_charges[:expense_account_id].eq(id))
  .group(a_account_charges[:user_id])
  .project(a_account_charges[:user_id], a_account_charges[:cost].sum)
paid_debts_sum = a_paid_debts
  .where(a_paid_debts[:expense_account_id].eq(id))
  .group(a_paid_debts[:from_user_id])
  .project(a_paid_debts[:from_user_id], a_paid_debts[:cost].sum)
charges = a_participants
  .where(a_participants[:expense_account_id].eq(id))
  .join(account_charge_sum)
  .on(a_participants[:user_id].eq(account_charge_sum[:user_id]))
  .join(paid_debts_sum)
  .on(a_participants[:user_id].eq(paid_debts_sum[:from_user_id]))
I'm new to Arel, but after banging on this for several days and really digging, I don't think it can be done. Here's an outline of what I've done; if anyone has additional insight, it would be welcome.
First, these scripts will create the test tables and populate them with test data. I have set up 9 expense_accounts_users, each with a different combination of charges and paid_debts. In id order they are: 1 charge/1 payment, 2 charges/2 payments, 2 charges/1 payment, 1 charge/2 payments, 2 charges/0 payments, 0 charges/2 payments, 1 charge/0 payments, 0 charges/1 payment, 0 charges/0 payments.
CREATE TABLE IF NOT EXISTS `expense_accounts_users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `expense_account_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=10 ;
INSERT INTO `expense_accounts_users` (`id`, `expense_account_id`) VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1);
CREATE TABLE IF NOT EXISTS `account_charges` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `expense_account_id` int(11) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  `cost` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=10 ;
INSERT INTO `account_charges` (`id`, `expense_account_id`, `user_id`, `cost`) VALUES (1, 1, 1, 1), (2, 1, 2, 1), (3, 1, 2, 2), (4, 1, 3, 1), (5, 1, 3, 2), (6, 1, 4, 1), (7, 1, 5, 1), (8, 1, 5, 2), (9, 1, 7, 1);
CREATE TABLE IF NOT EXISTS `paid_debts` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `expense_account_id` int(11) DEFAULT NULL,
  `user_id` int(11) DEFAULT NULL,
  `cost` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=10 ;
INSERT INTO `paid_debts` (`id`, `expense_account_id`, `user_id`, `cost`) VALUES (1, 1, 1, 1), (2, 1, 2, 1), (3, 1, 2, 2), (4, 1, 3, 1), (5, 1, 4, 1), (6, 1, 4, 2), (7, 1, 6, 1), (8, 1, 6, 2), (9, 1, 8, 1);
Ultimately, to get the data that you are after in one fell swoop, this is the SQL statement you'd use:
SELECT user_charges.user_id,
       user_charges.sum_cost,
       COALESCE(SUM(paid_debts.cost), 0) AS 'sum_paid'
FROM (
    SELECT expense_accounts_users.id AS 'user_id',
           COALESCE(SUM(account_charges.cost), 0) AS 'sum_cost'
    FROM expense_accounts_users
    LEFT OUTER JOIN account_charges ON expense_accounts_users.id = account_charges.user_id
    GROUP BY expense_accounts_users.id
) AS user_charges
LEFT OUTER JOIN paid_debts ON user_charges.user_id = paid_debts.user_id
GROUP BY user_charges.user_id
You have to do a LEFT OUTER JOIN between users and charges first so that you get a row for every user, then you have to LEFT OUTER JOIN the result to debts in order to avoid multiplying your results with the two joins inside the same construct.
(Note the use of COALESCE to convert NULL values from the LEFT OUTER JOINs to zeroes; a convenience, perhaps.)
The result of this statement is this:
user_id sum_cost sum_paid
1 1 1
2 3 3
3 3 1
4 1 3
5 3 0
6 0 3
7 1 0
8 0 1
9 0 0
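As a sanity check on the table above, the same idea (sum each table per user on its own, so charge rows and payment rows never meet before they are aggregated) can be mimicked in plain Ruby. This is only an illustrative sketch, not Arel: the rows are copied from the INSERT statements as [user_id, cost] pairs, and the method names sum_by_user and two_stage_totals are made up for this example.

```ruby
# Rows copied from the INSERT statements as [user_id, cost] pairs.
user_ids = (1..9).to_a
charges  = [[1, 1], [2, 1], [2, 2], [3, 1], [3, 2], [4, 1], [5, 1], [5, 2], [7, 1]]
payments = [[1, 1], [2, 1], [2, 2], [3, 1], [4, 1], [4, 2], [6, 1], [6, 2], [8, 1]]

# Sum cost per user. The Hash default of 0 plays the role COALESCE plays in
# the SQL: users without any rows get 0 instead of NULL.
def sum_by_user(rows)
  totals = Hash.new(0)
  rows.each { |user_id, cost| totals[user_id] += cost }
  totals
end

# Aggregate each table separately, then combine the two totals per user.
def two_stage_totals(user_ids, charges, payments)
  sum_cost = sum_by_user(charges)
  sum_paid = sum_by_user(payments)
  user_ids.map do |id|
    { user_id: id, sum_cost: sum_cost[id], sum_paid: sum_paid[id] }
  end
end

two_stage_totals(user_ids, charges, payments).each do |row|
  puts row.values.join(' ')
end
```

Each printed line has the form "user_id sum_cost sum_paid", which makes it easy to compare against the SQL result.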
After many attempts, I found that this arel code came the closest to what we are after:
c = Arel::Table.new(:account_charges)
d = Arel::Table.new(:paid_debts)
p = Arel::Table.new(:expense_accounts_users)
user_charges = p
  .where(p[:expense_account_id].eq(1))
  .join(c, Arel::Nodes::OuterJoin)
  .on(p[:id].eq(c[:user_id]))
  .project(p[:id], c[:cost].sum.as('sum_cost'))
  .group(p[:id])
charges = user_charges
  .join(d, Arel::Nodes::OuterJoin)
  .on(p[:id].eq(d[:user_id]))
  .project(d[:cost].sum.as('sum_paid'))
Essentially, I am joining users to charges in the first construct with a LEFT OUTER JOIN, then trying to take the result of that and LEFT OUTER JOIN it back to debts. This arel code produces the following SQL statement:
SELECT `expense_accounts_users`.`id`,
       SUM(`account_charges`.`cost`) AS sum_cost,
       SUM(`paid_debts`.`cost`) AS sum_paid
FROM `expense_accounts_users`
LEFT OUTER JOIN `account_charges` ON `expense_accounts_users`.`id` = `account_charges`.`user_id`
LEFT OUTER JOIN `paid_debts` ON `expense_accounts_users`.`id` = `paid_debts`.`user_id`
WHERE `expense_accounts_users`.`expense_account_id` = 1
GROUP BY `expense_accounts_users`.`id`
Which, when run, produces this output:
id sum_cost sum_paid
1 1 1
2 6 6
3 3 2
4 2 3
5 3 NULL
6 NULL 3
7 1 NULL
8 NULL 1
9 NULL NULL
Very close, but not quite. First, the lack of COALESCE gives us NULL values instead of zeroes - I'm not sure how to effect the COALESCE function call from within arel.
More importantly, though, combining both LEFT OUTER JOINs into one statement without the internal subselect multiplies the totals for users 2, 3 and 4: sum_cost is inflated for users 2 and 4, and sum_paid for users 2 and 3. This happens any time there is more than one of either charge or payment and at least one of the other.
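The multiplication can be reproduced without a database. The plain Ruby sketch below emulates what the flat double LEFT OUTER JOIN does for a single user: every charge row is paired with every payment row before the sums are taken. The helper names sql_sum and flat_join_sums are hypothetical, and the rows are the [user_id, cost] pairs from the test data.

```ruby
charges  = [[1, 1], [2, 1], [2, 2], [3, 1], [3, 2], [4, 1], [5, 1], [5, 2], [7, 1]]
payments = [[1, 1], [2, 1], [2, 2], [3, 1], [4, 1], [4, 2], [6, 1], [6, 2], [8, 1]]

# SQL-style SUM: NULLs (nil) are skipped, and an all-NULL input yields NULL.
def sql_sum(values)
  present = values.compact
  present.empty? ? nil : present.sum
end

# Emulate both LEFT OUTER JOINs in one statement for a single user.
# A side with no matching rows contributes one NULL row, as an outer join does.
def flat_join_sums(user_id, charges, payments)
  costs = charges.select { |uid, _| uid == user_id }.map(&:last)
  paid  = payments.select { |uid, _| uid == user_id }.map(&:last)
  costs = [nil] if costs.empty?
  paid  = [nil] if paid.empty?
  joined = costs.product(paid) # one [cost, paid] pair per joined row
  [sql_sum(joined.map(&:first)), sql_sum(joined.map(&:last))]
end

(1..9).each do |id|
  sum_cost, sum_paid = flat_join_sums(id, charges, payments)
  puts [id, sum_cost || 'NULL', sum_paid || 'NULL'].join(' ')
end
```

For user 2, the two charges and two payments produce four joined rows, so each cost and each payment is counted twice. That is why both of that user's totals double.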
Based on some online readings, I had hoped that changing the arel slightly would solve the problem:
charges = user_charges
  .join(d, Arel::Nodes::OuterJoin)
  .on(user_charges[:id].eq(d[:user_id]))
  .project(d[:cost].sum.as('sum_paid'))
But any time I used user_charges[] in the second Arel construct, I got an undefined method error for SelectManager#[]. This may be a bug, or it may be the intended behavior; I really can't tell.
I just don't see that arel has a way to make use of the SQL from the first construct as a queryable object in the second construct with the necessary subquery alias, as is needed to make this happen in one SQL statement.