Efficient way to plot area normalized bar chart using crossfilter / dc.js?

I am looking to plot a couple of histograms for continuous variables in a multidimensional dataset using dc.js. While this is rather easy to achieve with the dc.barChart component, I would like these histograms to be area normalized. In my case the bin widths are uniform, so this reduces to dividing the count in each bin/group by (binWidth * totCounts).
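To make the target concrete, here is a minimal plain-JavaScript sketch of that normalization, without crossfilter or dc.js. The sample values and the bin width are made up for illustration; only the binning expression and the (binWidth * total) divisor come from the description above.

```javascript
// Hypothetical sample values and bin width, chosen only for illustration.
var values = [0.2, 0.7, 1.1, 1.4, 1.9, 2.5, 2.6, 3.3];
var binWidth = 0.5;

// Count how many values fall into each bin, keyed by the bin's lower edge.
var counts = {};
values.forEach(function (v) {
  var bin = Math.floor(v / binWidth) * binWidth;
  counts[bin] = (counts[bin] || 0) + 1;
});

// Divide each count by (binWidth * total) to turn it into a density.
var total = values.length;
var density = {};
Object.keys(counts).forEach(function (bin) {
  density[bin] = counts[bin] / (binWidth * total);
});

// Sum of bar areas (height * width); this should come out as 1, up to rounding.
var area = Object.keys(density).reduce(function (sum, bin) {
  return sum + density[bin] * binWidth;
}, 0);

console.log(density, area);
```

The bar heights can exceed the raw fractions of the data (a bin holding a quarter of the values gets height 0.5 here) because it is the area, not the height, that is normalized.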

I was able to plot an initial view for these histograms that was area normalized using something along the following lines:

var cf = crossfilter(data);
var totCounts = cf.groupAll().value();
var histDimension = cf.dimension(function(d) {
  return Math.floor(d.fieldOfInterest / binWidth) * binWidth;
});
var histGroup = histDimension.group().reduceSum(function(d) {
  return 1 / (binWidth * totCounts);
});

Coupling this approach with dc.js does result in an area normalized bar chart. However, when I start filtering data, the filtered data is not re-normalized. Instead the view always presents the data through the lens of the original area normalization on the unfiltered dataset.

I understand why this is the case: the reducer functions in crossfilter are still using the initial normalization. What I don't understand is whether there is any plausible and performant way to achieve what I am looking for, namely having the dc.js plot always be re-normalized with respect to the filtered dataset. Since the normalization for any single bin/group requires information from across all groups (totCounts), it seems to me that there is no incremental, performant way of defining the reduce functions.
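The effect can be reproduced without crossfilter at all. The following is only a sketch in plain JavaScript: the records are made up, and the keep flag is a hypothetical stand-in for a filter applied on some other dimension.

```javascript
// Hypothetical records: `x` is the binned field, `keep` mimics a filter on another dimension.
var records = [
  { x: 0.5, keep: true },
  { x: 1.5, keep: true },
  { x: 1.5, keep: false },
  { x: 2.5, keep: false }
];
var binWidth = 1;

// Captured once from the unfiltered data, like totCounts in the snippet above.
var totCounts = records.length;

// Total histogram area when every record contributes the fixed weight
// 1 / (binWidth * totCounts), as the reduceSum function does.
function totalArea(rows) {
  var summedWeights = rows.reduce(function (sum) {
    return sum + 1 / (binWidth * totCounts);
  }, 0);
  return summedWeights * binWidth;
}

var areaBefore = totalArea(records);
var areaAfter = totalArea(records.filter(function (r) { return r.keep; }));

console.log(areaBefore, areaAfter); // the second value is no longer 1
```

Because totCounts is frozen at 4, removing half of the records leaves a total area of 0.5 instead of 1.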

Am I missing some obvious way to achieve what I am looking to do or is this something I should abandon hope of being able to achieve in crossfilter/dc.js? I'd appreciate any inputs that might point me in the right direction.


What we need to do here is factor totCounts out of the reduce calculation, so that it can adapt to a changing total. Since crossfilter calculates reductions incrementally, there is no way for it to re-apply the total as it changes.

Luckily, the bar chart's valueAccessor is perfect for this. In fact, it's almost always a better choice for any reduction that involves division, since it's more efficient to do divisions as the values are read (once) versus while aggregation and reduction are happening (many times).

Here we just need a way of calculating the total dynamically, and this is what groupAll is great for. In this case we probably want dimension.groupAll() rather than cf.groupAll(): like the group built on the same dimension, it does not observe that dimension's own filters. We wouldn't want filtering on the current chart to stop its area from summing to one.

Putting these together:

var cf = crossfilter(data);
var histDimension = cf.dimension(function(d) {
  return Math.floor(d.fieldOfInterest / binWidth) * binWidth;
});
var totCounter = histDimension.groupAll();
var histGroup = histDimension.group(); // default reduceCount

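// barChart is assumed to be an existing dc.barChart instance
barChart
    .dimension(histDimension)
    .group(histGroup)
    .valueAccessor(function(kv) {
        // read the live total each time a bar value is requested
        var total = totCounter.value();
        // returns 0 instead of dividing by zero when nothing passes the filters
        return total && (kv.value / (binWidth * total));
    });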
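
If several histograms need the same treatment, the accessor above could be wrapped in a small factory. This is only a sketch: makeDensityAccessor is a hypothetical helper name, and the fakeTotal object is a stand-in used to exercise it without crossfilter. The only thing it assumes about the total is that it exposes value(), as the result of dimension.groupAll() does.

```javascript
// Hypothetical helper: builds a value accessor that divides a bin's count
// by (binWidth * current total). `totalGroup` is anything exposing value().
function makeDensityAccessor(totalGroup, binWidth) {
  return function (kv) {
    var total = totalGroup.value();
    // Return 0 rather than dividing by zero when everything is filtered out.
    return total ? kv.value / (binWidth * total) : 0;
  };
}

// Stand-in for a groupAll whose total changes as filters are applied.
var currentTotal = 8;
var fakeTotal = { value: function () { return currentTotal; } };
var accessor = makeDensityAccessor(fakeTotal, 0.5);

// Same bin read before and after a filter elsewhere halves the data.
var before = accessor({ key: 1, value: 2 });
currentTotal = 4;
var after = accessor({ key: 1, value: 1 });

// Nothing left after filtering.
currentTotal = 0;
var empty = accessor({ key: 1, value: 0 });

console.log(before, after, empty);
```

With the real objects from the snippet above, this would be used as barChart.valueAccessor(makeDensityAccessor(histDimension.groupAll(), binWidth)).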