dc.js and crossfilter second level aggregation to average count per hour
I am trying to slightly extend the problem described in this question:
dc.js and crossfilter reduce average counts per day of week
I would like to chart average counts per hour of the day. I have followed the solution above, counting values by day in the custom reduce, with the only change being to dimension by hour of day. This seems to work well and can be seen in the following fiddle:
http://jsfiddle.net/dolomite/6eeahs6z/73/
The top bar chart shows the average counts by hour, the lower chart the total counts by hour. So hour 22 has a total count of 47 and average count of 4.2727... There are 11 days in the data so this is correct.
However, when I click on the weekday row chart and filter for Sunday, I get a total count for hour 22 of 4 but an average of 0.3636... The denominator used to calculate the average still includes all days in the data, irrespective of which weekday I filter by. So while the total count has correctly filtered down to 4 for Sunday, it is being divided by the total number of days in the data (11), whereas it should be divided by the number of days matching whichever day(s) are selected in the filter.
I know the solution lies in modifying the custom reduce but I am stuck! Any pointers on where I am going wrong would be gratefully received.
hourAvgGroup = hourDim.group().reduce(
    function (p, v) { // add
        var day = d3.time.day(v.EventDate).getTime();
        p.map.set(day, p.map.has(day) ? p.map.get(day) + 1 : 1);
        p.avg = average_map(p.map);
        return p;
    },
    function (p, v) { // remove
        var day = d3.time.day(v.EventDate).getTime();
        p.map.set(day, p.map.has(day) ? p.map.get(day) - 1 : 0);
        p.avg = average_map(p.map);
        return p;
    },
    function () { // init
        return { map: d3.map(), avg: 0 };
    }
);
function average_map(m) {
    var sum = 0;
    m.forEach(function (k, v) {
        sum += v;
    });
    return m.size() ? sum / m.size() : 0;
}
m.size() counts the number of keys in the map. The problem is that even if a day has 0 records assigned to it, its key is still there, so m.size() counts it in the denominator. The solution is to remove the key when its count reaches 0. There are probably more efficient ways to do this, but the simplest fix is to add one line to the remove function in your custom reducer, so that the function looks like this:
function (p, v) { // remove
    var day = d3.time.day(v.EventDate).getTime();
    p.map.set(day, p.map.has(day) ? p.map.get(day) - 1 : 0);
    // If the day has 0 records, remove the key
    if (p.map.has(day) && p.map.get(day) == 0) p.map.remove(day);
    p.avg = average_map(p.map);
    return p;
},
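To see why that one line matters, here is a minimal sketch of the reducer logic using a plain ES Map in place of d3.map (assumption: keys are day timestamps, values are per-day counts), showing that deleting a zeroed key shrinks the denominator:

```javascript
// Sketch of the fixed add/remove/average logic, independent of crossfilter.
function add(map, day) {
  map.set(day, (map.get(day) || 0) + 1);
}
function remove(map, day) {
  map.set(day, (map.get(day) || 0) - 1);
  // The fix: drop the key entirely once its count reaches 0,
  // so an empty day no longer inflates the denominator.
  if (map.get(day) === 0) map.delete(day);
}
function average(map) {
  var sum = 0;
  map.forEach(function (count) { sum += count; });
  return map.size ? sum / map.size : 0;
}

var m = new Map();
add(m, 'sun'); add(m, 'sun'); add(m, 'mon'); add(m, 'mon');
console.log(average(m)); // 4 records over 2 days -> 2
remove(m, 'mon'); remove(m, 'mon');
console.log(m.size);     // 'mon' key deleted -> 1
console.log(average(m)); // 2 / 1 = 2, not 2 / 2 = 1
```

Without the `delete`, the second average would still divide by 2 and report 1, which is exactly the filtered-Sunday symptom described in the question.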
By the way, I would also recommend not keeping the running average (and the average calculation) in your group. Calculate it in the dc.js chart's valueAccessor instead: the reducer runs once for every record added or removed, whereas the valueAccessor runs only once per filter operation.
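Following that advice, the group would store only the per-day map, and the chart would compute the average on demand. A sketch, using a plain ES Map for the standalone check (the chart and group names below are assumptions based on the fiddle, not confirmed identifiers):

```javascript
// Average helper, computed at read time rather than inside the reducer.
function averageOf(map) {
  var sum = 0;
  map.forEach(function (count) { sum += count; });
  return map.size ? sum / map.size : 0;
}

// With dc.js, the wiring would look something like:
// hourChart
//     .group(hourAvgGroup)
//     .valueAccessor(function (d) { return averageOf(d.value.map); });

// Standalone check: 11 days holding 47 records in total, as in the question.
var days = new Map();
for (var i = 0; i < 11; i++) days.set(i, i < 10 ? 4 : 7); // 10*4 + 7 = 47
console.log(averageOf(days)); // 47 / 11 = 4.2727...
```

This keeps the reducer cheap (it only maintains counts) and defers the division to the one place it is actually needed for display.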