Subtract a year if the previous month is January

Problem description:

After all the processing, I was able to create the dataframe below. The only problem is that the year is incorrect. The dates are in decreasing order for each Location, so after 2015-01-15 the next date should be 2014-12-15, not 2015-12-15.

+----------+------------+-------+
| Location | Date       | Value |
+----------+------------+-------+
| India    | 2015-03-15 |  -200 |
| India    | 2015-02-15 |   140 |
| India    | 2015-01-15 |   155 |
| India    | 2015-12-15 |    85 |
| India    | 2015-11-15 |    45 |
| China    | 2015-03-15 |   199 |
| China    | 2015-02-15 |   164 |
| China    | 2015-01-15 |   209 |
| China    | 2015-12-15 |    24 |
| China    | 2015-11-15 |    11 |
| Russia   | 2015-03-15 |    48 |
| Russia   | 2015-02-15 |   104 |
| Russia   | 2015-01-15 |   106 |
| Russia   | 2015-12-15 |   -20 |
| Russia   | 2015-11-15 |    10 |
+----------+------------+-------+

Making the strong assumption that these are monthly dates ending on the 15th of every month and that the first value for a given Location is correct, we can step backwards monthly by Location.

import pandas as pd

# Create original dataframe.
df = pd.DataFrame({'Location': ['India'] * 5 + ['China'] * 5 + ['Russia'] * 5,
                   'Date': ['2015-03-15', '2015-02-15', '2015-01-15', '2015-12-15', '2015-11-15'] * 3,
                   'Value': [-200, 140, 155, 85, 45, 199, 164, 209, 24, 11, 48, 104, 106, -20, 10]})[
    ['Location', 'Date', 'Value']
]
# Convert dates to pandas Timestamps.
df['Date'] = pd.DatetimeIndex(df['Date'])

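# sort=False keeps the groups in order of first appearance (India, China, Russia)
# so the generated dates line up with the rows of df. This relies on the rows of
# each Location being contiguous.
gb = df.groupby(['Location'], sort=False)['Date']
# For each Location, start at the month of its first date and step back one
# month per row, then re-attach the 15th as the day.
df['Date'] = [
    str(first_period - months) + '-15'
    for location_months, first_period in zip(
        gb.count(), gb.first().apply(lambda date: pd.Period(date, 'M')))
    for months in range(location_months)
]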
>>> df
   Location        Date  Value
0     India  2015-03-15   -200
1     India  2015-02-15    140
2     India  2015-01-15    155
3     India  2014-12-15     85
4     India  2014-11-15     45
5     China  2015-03-15    199
6     China  2015-02-15    164
7     China  2015-01-15    209
8     China  2014-12-15     24
9     China  2014-11-15     11
10   Russia  2015-03-15     48
11   Russia  2015-02-15    104
12   Russia  2015-01-15    106
13   Russia  2014-12-15    -20
14   Russia  2014-11-15     10

The final dates are strings, which you may wish to convert back to Timestamps via:

df['Date'] = pd.DatetimeIndex(df['Date'])
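
If you would rather avoid the string round trip, the same stepping-back idea can be sketched in vectorized form. This is only a sketch under the same strong assumptions as above (monthly dates, first date per Location correct, rows of each Location contiguous). The helper name `step_back_monthly` is hypothetical, and it also assumes the day of month exists in every month, as the 15th does.

```python
import pandas as pd

# Sample data from the question, with the incorrect years.
df = pd.DataFrame({
    'Location': ['India'] * 5 + ['China'] * 5 + ['Russia'] * 5,
    'Date': pd.to_datetime(
        ['2015-03-15', '2015-02-15', '2015-01-15', '2015-12-15', '2015-11-15'] * 3),
})


def step_back_monthly(frame):
    """Hypothetical helper: rebuild dates by stepping back one month per row."""
    groups = frame.groupby('Location', sort=False)
    # Row position within each Location: 0 for the first row, 1 for the next, ...
    steps = groups.cumcount()
    # First (assumed correct) date of each Location, repeated on every row.
    first_date = groups['Date'].transform('first')
    # Count months since year 0, subtract the step, then split back into year/month.
    month_index = first_date.dt.year * 12 + (first_date.dt.month - 1) - steps
    return pd.to_datetime(pd.DataFrame({
        'year': month_index // 12,
        'month': month_index % 12 + 1,
        'day': first_date.dt.day,  # assumes this day exists in every month
    }))


df['Date'] = step_back_monthly(df)
```

Because the result comes from `pd.to_datetime`, the Date column stays as Timestamps and no final conversion is needed.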