Changing time frequency in Pandas Dataframe
I have a pandas DataFrame as shown below.
df
A B
date_time
2014-07-01 06:03:59.614000 62.1250 NaN
2014-07-01 06:03:59.692000 62.2500 NaN
2014-07-01 06:13:34.524000 62.2500 241.0625
2014-07-01 06:13:34.602000 62.2500 241.5000
2014-07-01 06:15:05.399000 62.2500 241.3750
2014-07-01 06:15:05.399000 62.2500 241.2500
2014-07-01 06:15:42.004000 62.2375 241.2500
2014-07-01 06:15:42.082000 62.2375 241.3750
2014-07-01 06:15:42.082000 62.2375 240.2500
I want to change this to a regular frequency of 1-minute intervals, but I get the error below:
new = df.asfreq('1Min')
>>error: cannot reindex from a duplicate axis
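Now, I understand why this happens. My time granularity is high (milliseconds) but irregular, so I get multiple readings per minute and even per second. So I tried truncating these millisecond readings to the minute and dropping the duplicates, as below.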
# try to convert the index to minutes and drop duplicates
df['index'] = df.index
df['minute_index']= df['index'].apply( lambda x: x.strftime('%Y-%m-%d %H:%M'))
df.drop_duplicates(subset='minute_index', keep='last', inplace=True)
df_by_minute = df.set_index('minute_index')
df_by_minute
A B index
minute_index
2014-07-01 06:03 62.2500 NaN 2014-07-01 06:03:59.692000
2014-07-01 06:13 62.2500 241.50 2014-07-01 06:13:34.602000
2014-07-01 06:15 62.2375 240.25 2014-07-01 06:15:42.082000
# now change the frequency to 1 minute but I just get NaNs (!)
df_by_minute.asfreq('1Min')
A B index
2014-07-01 06:03:00 NaN NaN NaT
2014-07-01 06:04:00 NaN NaN NaT
2014-07-01 06:05:00 NaN NaN NaT
2014-07-01 06:06:00 NaN NaN NaT
2014-07-01 06:07:00 NaN NaN NaT
2014-07-01 06:08:00 NaN NaN NaT
2014-07-01 06:09:00 NaN NaN NaT
2014-07-01 06:10:00 NaN NaN NaT
2014-07-01 06:11:00 NaN NaN NaT
2014-07-01 06:12:00 NaN NaN NaT
2014-07-01 06:13:00 NaN NaN NaT
2014-07-01 06:14:00 NaN NaN NaT
2014-07-01 06:15:00 NaN NaN NaT
As you can see, it doesn't work. Can someone help? What I'm trying to achieve is a function that returns A or B as of a given DateTime, where DateTime moves in 1Min increments.
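I think resample, rather than asfreq, fits your needs:
new = df.resample('1min').mean()
Instead of mean() you can also use last() or first(). (Older pandas versions spelled this df.resample('T', how='mean'); the how argument has since been removed.)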
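To make that concrete, here is a minimal sketch that rebuilds the sample data from the question and resamples it to 1-minute bins. The variable names by_minute and as_of, and the forward-fill step, are illustrative additions, assuming "as of" means "the most recent reading available".

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (duplicate timestamps included).
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
df = pd.DataFrame(
    {
        "A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25, 62.2375, 62.2375, 62.2375],
        "B": [np.nan, np.nan, 241.0625, 241.5, 241.375, 241.25, 241.25, 241.375, 240.25],
    },
    index=idx,
)
df.index.name = "date_time"

# Last non-null reading inside each 1-minute bin; empty minutes become NaN.
by_minute = df.resample("1min").last()

# Assumption: "as of" means the most recent reading, so carry values forward.
as_of = by_minute.ffill()

print(as_of)
```

Note that each bin is labeled by its start by default, so the row at 06:03:00 reflects readings taken during 06:03:00 to 06:03:59. If the row should instead summarize the minute that just ended, the label and closed parameters of resample are worth a look.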