How to add the date from the filename to a time column to create a datetime column? Python pandas
I have multiple files that are named like this:
2018-08-31-logfile-device1
2018-09-01-logfile-device1
In these files the data is laid out like this:
00:00:00.283672analogue values:[2511, 2383, 2461, 2472]
00:00:00.546165analogue values:[2501, 2395, 2467, 2465]
I append all these files into one big dataframe with this code (which I got from here: Import multiple excel files into python pandas and concatenate them into one dataframe):
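import glob
import pandas as pd

# Collect every device1 log file in the current directory
file_logs = glob.glob('*device1*')
frames = []
for file_log in file_logs:
    # Split each line into the time and the bracketed list of values
    data = pd.read_csv(file_log, sep='analogue values:',
                       names=['time', 'col'], engine='python')
    frames.append(data)
# Combine the per-file frames into one dataframe
df = pd.concat(frames, ignore_index=True)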
I transform the data and then it looks like this:
analog1  analog2  analog3  analog4  time
2511     2383     2461     2472     00:00:00.283672
2501     2395     2467     2465     00:00:00.546165
2501     2395     2467     2465     00:00:00.807846
2497     2381     2461     2467     00:00:01.070540
2485     2391     2458     2475     00:00:01.332163
But the problem is, I want the time column to be a datetime, where the date is the date from the filename the row came from:
analog1  analog2  analog3  analog4  datetime
2511     2383     2461     2472     2018-08-31 00:00:00.283672
2501     2395     2467     2465     2018-08-31 00:00:00.546165
2501     2395     2467     2465     2018-08-31 00:00:00.807846
2497     2381     2461     2467     2018-08-31 00:00:01.070540
2485     2391     2458     2475     2018-08-31 00:00:01.332163
You can convert the first 10 characters of the filename, file[:10], to a datetime and add it to the time column converted by to_timedelta.
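As a quick check of that addition, here is a minimal sketch on in-memory data. It assumes the filename and time strings shown in the question; the variable names are only illustrative.

```python
import pandas as pd

# Filename and time strings copied from the question
filename = '2018-08-31-logfile-device1'
times = pd.Series(['00:00:00.283672', '00:00:00.546165'])

# The first 10 characters hold the date part: '2018-08-31'
date = pd.to_datetime(filename[:10])

# A Timestamp plus a Series of timedeltas gives a Series of datetimes
datetimes = date + pd.to_timedelta(times)
print(datetimes)
```

The slice only works when the name really starts with a YYYY-MM-DD date, so a path with a directory prefix would need the bare filename first.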
Then append each DataFrame to a list and finally use concat:
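import glob
import pandas as pd

dfs = []
for file in glob.glob('*device1*'):
    data = pd.read_csv(file, sep='analogue values:', names=['time', 'col'], engine='python')
    # Date from the first 10 characters of the filename plus the time of day
    data['datetime'] = pd.to_datetime(file[:10]) + pd.to_timedelta(data['time'])
    data = data.drop('time', axis=1)
    dfs.append(data)
# Combine the per-file frames into one dataframe
df = pd.concat(dfs, ignore_index=True)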
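If the files live in another directory, file[:10] would pick up the directory prefix instead of the date. The sketch below is one possible variation, not the code from the question: load_logs is a hypothetical helper name, os.path.basename is used to get the bare filename, and the column names analog1 to analog4 are taken from the desired output above. It also assumes every row holds exactly four integer values.

```python
import glob
import os

import pandas as pd


def load_logs(pattern):
    """Hypothetical helper: read all matching log files into one DataFrame."""
    dfs = []
    for path in sorted(glob.glob(pattern)):
        data = pd.read_csv(path, sep='analogue values:',
                           names=['time', 'col'], engine='python')
        # Use the bare filename so a directory prefix does not break the slice
        date = pd.to_datetime(os.path.basename(path)[:10])
        datetimes = date + pd.to_timedelta(data['time'])
        # Pull the four integers out of '[2511, 2383, 2461, 2472]'
        values = pd.DataFrame(
            data['col'].str.findall(r'\d+').tolist(),
            columns=['analog1', 'analog2', 'analog3', 'analog4'],
            index=data.index,
        ).astype(int)
        values['datetime'] = datetimes
        dfs.append(values)
    # pd.concat raises ValueError if no file matched the pattern
    return pd.concat(dfs, ignore_index=True)
```

Calling it as df = load_logs('*device1*') from the log directory should behave like the loop above, with the values already split into columns.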