Lagging in data.table (R)
Currently I have a utility function that lags things in data.table by group. The function is simple:
    panel_lag <- function(var, k) {
      if (k > 0) {
        # Lag: bring past values forward by k positions
        return(c(rep(NA, k), head(var, -k)))
      } else if (k < 0) {
        # Lead: bring future values backward by -k positions
        return(c(tail(var, k), rep(NA, -k)))
      }
      # k == 0: nothing to shift
      var
    }
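The function relies on how `head()` and `tail()` treat a negative `n`, which is easy to misread. A minimal sketch on a plain vector (the vector `v` and the names `lag2` and `lead2` are made up for illustration):

```r
v <- 1:5

# head() with a negative n drops the last |n| elements
head(v, -2)                           # 1 2 3

# tail() with a negative n drops the first |n| elements
tail(v, -2)                           # 3 4 5

# Padding with NA on the appropriate side gives a lag or a lead
lag2  <- c(rep(NA, 2), head(v, -2))   # NA NA 1 2 3
lead2 <- c(tail(v, -2), rep(NA, 2))   # 3 4 5 NA NA
```

Both results have the same length as the input, which is what lets them be assigned back as a column.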
I can then call this from a data.table:
    library(data.table)

    x <- data.table(a = 1:10,
                    dte = sample(seq.Date(from = as.Date("2012-01-20"),
                                          to = as.Date("2012-01-30"), by = 1),
                                 10))
    x[, L1_a := panel_lag(a, 1)]  # This won't work correctly as `x` isn't keyed by date
    setkey(x, dte)
    x[, L1_a := panel_lag(a, 1)]  # This will
This requires that I check inside panel_lag whether x is keyed. Is there a better way to do lagging? The tables tend to be large, so they should really be keyed. I just call setkey before I lag, but I would like to make sure I don't forget to key them, so I would like to know if there is a standard way people do this.
If you want to ensure that you lag in order of some other column, you could use the order function:

    x[order(dte), L1_a := panel_lag(a, 1)]
Though if you're doing a lot of things in date order it would make sense to key it that way.
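Since the question mentions lagging by group, here is a hedged sketch that combines `order()` in `i` with `by`. It swaps the hand-written `panel_lag` for data.table's built-in `shift()`, which is available in recent versions of the package. The example data and the `id` column are hypothetical and not from the original post.

```r
library(data.table)

# Hypothetical data: two groups, rows deliberately out of date order
x <- data.table(
  id  = c("a", "a", "a", "b", "b", "b"),
  dte = as.Date("2012-01-20") + c(2, 0, 1, 1, 2, 0),
  a   = c(30, 10, 20, 50, 60, 40)
)

# order() in i applies the date ordering only for this assignment;
# by = id restarts the lag within each group.
# shift() defaults to a lag of one position, filled with NA.
x[order(dte), L1_a := shift(a, 1), by = id]

print(x)
```

Because `:=` assigns by reference, the rows of `x` stay in their original order and only the lag is computed in date order. A lead can be requested with `shift(a, 1, type = "lead")`.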