pandas DataFrame conditional change
I'm working with CSV time series data that records a step count per time frame. Once the step count exceeds 65535, the counter wraps around and starts again from 0. However, not every dataset contains the value 65535 itself (some go from 65530 straight to 5, if several steps were taken within one time frame), so I can't find a good way to handle it such that every 0 after 6553x becomes 65536, and so on.
step realstep
65531 65531
65533 65533
65534 65534
2 65538
4 65540
I'm trying to compute the real step count so that I can take differences between rows (e.g. steps per minute).
Find where the counter resets, i.e. where diff is negative, and add the counter range (65536, since the count starts at 0) to every row from that point on. This stays correct if the counter resets multiple times (I added some extra data to show that).
df['real_step'] = df.step + df.step.diff(1).lt(0).cumsum()*65536
step real_step
0 65531 65531
1 65533 65533
2 65534 65534
3 2 65538
4 4 65540
5 65434 130970
6 2 131074
7 4 131076
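For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, extended with the row-to-row difference the question asks about. The names `unwrap_steps`, `COUNTER_RANGE` and `step_delta` are my own, not part of pandas. It assumes the counter never legitimately decreases and advances by fewer than 65536 steps between two consecutive rows; a wrap that lands on a value at or above the previous one cannot be detected this way.

```python
import pandas as pd

COUNTER_RANGE = 65536  # assumed 16-bit counter: values run from 0 to 65535


def unwrap_steps(step: pd.Series, counter_range: int = COUNTER_RANGE) -> pd.Series:
    """Turn a wrapping counter into a monotonically increasing one."""
    # A negative row-to-row difference marks a row where the counter wrapped.
    wrapped = step.diff().lt(0)
    # cumsum() counts how many wraps have occurred up to and including each row.
    return step + wrapped.cumsum() * counter_range


# Same sample data as in the output above, including two resets.
df = pd.DataFrame({"step": [65531, 65533, 65534, 2, 4, 65434, 2, 4]})
df["real_step"] = unwrap_steps(df["step"])

# Steps taken between consecutive rows (NaN for the first row).
df["step_delta"] = df["real_step"].diff()

print(df)
```

If the rows are not evenly spaced in time, dividing `step_delta` by the elapsed time between rows would give a rate such as steps per minute.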