Converting items from a pandas Series to datetime

Problem description:

I have a Pandas Series ("timeSeries") that includes a time of day. Some of the items are blank, some are actual times (08:00; 13:00), some are indications of time (morning, early afternoon).

The times of day I have are in New York time, and I would like to convert the items that are in a time format to London time. Using pd.to_datetime(timeSeries, errors='ignore') does not work when I also add timedelta(hours=5). So I tried adding an if condition, but that does not work either.

Sample initial DataFrame:

dfNY = pd.DataFrame({'TimeSeries': ['13:00', np.nan, '06:00', 'Morning', 'Afternoon', np.nan, np.nan, '01:30']})

Desired result:

dfLondon = pd.DataFrame({'TimeSeries': ['18:00', np.nan, '11:00', 'Morning', 'Afternoon', np.nan, np.nan, '06:30']})

Any help or simplification of my code would be great.

london = dt.datetime.now(timezone("America/New_York"))
newYork = dt.datetime.now(timezone("Europe/London"))
timeDiff = (london - dt.timedelta(hours = newYork.hour)).hour

for dayTime in timeSeries: 
     if dayTime == "%%:%%": 
        print(dayTime)
        dayTime = pd.to_datetime(dayTime) + dt.timedelta(hours=timeDiff)
return timeSeries

Update: using the pytz method from the comment below yields a timezone that is off by 5 minutes. How do we fix this?

Using the .dt accessor, you can set a timezone on your values and then convert them to another one, using tz_localize and tz_convert.

import pandas as pd
import numpy as np

pd.options.display.max_columns = 5

df = pd.DataFrame({'TimeSeries': ["13:00", np.nan, "06:00", 'Morning', 'Afternoon', np.nan, np.nan, "01:30"]})

#   Convert your data to datetime. Values that cannot be parsed become NaT, which is fine here.
#   We also explicitly state which timezone the datetimes are in.
df['TimeSeries_TZ'] = pd.to_datetime(df['TimeSeries'], errors='coerce', format='%H:%M')\
                     .dt.tz_localize('America/New_York')
print(df['TimeSeries_TZ'])
# 0   1900-01-01 13:00:00-04:56
# 1                         NaT
# 2   1900-01-01 06:00:00-04:56
# 3                         NaT
# 4                         NaT
# 5                         NaT
# 6                         NaT
# 7   1900-01-01 01:30:00-04:56

#   Then, we can use the datetime accessor to convert the timezone.
df['Converted_time'] = df['TimeSeries_TZ'].dt.tz_convert('Europe/London').dt.strftime('%H:%M')
print(df['Converted_time'])
# 0    17:55
# 1      NaT
# 2    10:55
# 3      NaT
# 4      NaT
# 5      NaT
# 6      NaT
# 7    06:25

#   To keep the original values wherever conversion failed, start from the original column and
#   substitute the converted time only in rows where parsing succeeded (i.e. TimeSeries_TZ is
#   not NaT, "not a time").
df['TimeSeries_result'] = df['TimeSeries'].where(df['TimeSeries_TZ'].isna(), df['Converted_time'])


print(df[['TimeSeries', 'TimeSeries_result']])
#   TimeSeries TimeSeries_result
# 0      13:00             17:55
# 1        NaN               NaN
# 2      06:00             10:55
# 3    Morning           Morning
# 4  Afternoon         Afternoon
# 5        NaN               NaN
# 6        NaN               NaN
# 7      01:30             06:25
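
Regarding the 5-minute discrepancy: parsing with format='%H:%M' puts every value on the default date 1900-01-01, and the -04:56 offset in the output above suggests the timezone library is applying a local-mean-time style offset for a date that far back rather than the modern -05:00 / -04:00. One possible way around it, sketched below, is to move the parsed times onto a modern calendar date before localizing. The names reference_date, stamped and TimeSeries_London are made up for this illustration, and the date itself is an assumption: New York and London are usually 5 hours apart, but not during the few weeks each year when their daylight-saving changes are out of step, so the date you pick matters.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'TimeSeries': ["13:00", np.nan, "06:00", 'Morning',
                                  'Afternoon', np.nan, np.nan, "01:30"]})

# Assumption: the times belong to this calendar date (hypothetical, pick your own).
reference_date = pd.Timestamp("2021-01-15")

# Parse the times; unparseable values become NaT and land on the default date 1900-01-01.
parsed = pd.to_datetime(df['TimeSeries'], format='%H:%M', errors='coerce')

# Keep only the time of day (as a timedelta since midnight) and re-attach the modern date.
time_of_day = parsed - pd.Timestamp("1900-01-01")
stamped = reference_date + time_of_day

# Localize to New York, convert to London, and format back to HH:MM strings.
london = (stamped.dt.tz_localize('America/New_York')
                 .dt.tz_convert('Europe/London')
                 .dt.strftime('%H:%M'))

# Keep the original value where parsing failed, otherwise use the converted time.
df['TimeSeries_London'] = df['TimeSeries'].where(stamped.isna(), london)

print(df[['TimeSeries', 'TimeSeries_London']])
```

With the assumed January date, 13:00, 06:00 and 01:30 come out as 18:00, 11:00 and 06:30, while 'Morning', 'Afternoon' and the missing values are left as they were.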