Calculate string "days, hours, minutes, seconds" to numeric total days

Question:

I have seen a lot of questions about formatting times, but none that deal with the particular format I have imported:

Time <- c(
"22 hours 3 minutes 22 seconds", 
"170 hours 15 minutes 20 seconds", 
"39 seconds", 
"2 days 6 hours 44 minutes 17 seconds", 
"9 hours 54 minutes 36 seconds", 
"357 hours 23 minutes 28 seconds", 
"464 hours 30 minutes 7 seconds", 
"51 seconds", 
"31 hours 39 minutes 2 seconds", 
"355 hours 29 minutes 10 seconds")

Some times contain only "seconds", others "minutes and seconds", "days, hours, minutes and seconds", "days and seconds", etc. There are also NA values that I need to keep. How can I convert this character vector (i.e., add up the days, hours, minutes and seconds) into numeric total days?

For example:

Time
8.10
19.3
0.68
2.28
48.1
0.00
0.70
0.1
3.2
13.9

Thanks!

Edit

Old question, but a simple lubridate call does the trick now:

# Needs lubridate (and magrittr for %>%); Time is the character vector from the question
(period_to_seconds(period(Time)) / 86400) %>% round(2)

This also does the trick, with no packages other than the one supplying %>% (used only for readability):

# For each unit, pull out its number (0 where the unit is absent) and convert it to days
Time_vec <- mapply(function(tt, to_days) {
  ifelse(grepl(tt, Time), gsub(paste0("^.*?(\\d+) ", tt, ".*$"), "\\1", Time), 0) %>%
    as.numeric() / to_days
  },
  c("day", "hour", "minute", "second"),
  c(1, 24, 1440, 86400)
) %>%
  apply(1, sum) %>% 
  round(2)

In my actual data, only one value differed from the lubridate solution: 0.96 vs 0.97.
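
One thing to watch with the grepl()/ifelse() approach: grepl() returns FALSE for an NA input, so a missing time is counted as 0 days rather than kept as NA, which the question asks for. Below is a minimal base R sketch of the same idea that restores the NAs at the end. The function name `to_days` and the three-element sample vector are my own, purely for illustration, and it assumes every unit is written as a number, a space, then the unit name.

```r
# Hypothetical helper (not from the answers above): convert strings such as
# "2 days 6 hours 44 minutes 17 seconds" to numeric total days, keeping NA.
to_days <- function(x) {
  # How many of each unit make up one day
  divisors <- c(day = 1, hour = 24, minute = 1440, second = 86400)
  total <- numeric(length(x))
  for (u in names(divisors)) {
    pat <- paste0("^.*?(\\d+) ", u, ".*$")
    has_unit <- grepl(u, x, fixed = TRUE)   # FALSE for NA inputs
    value <- rep(0, length(x))              # absent units contribute 0
    value[has_unit] <- as.numeric(sub(pat, "\\1", x[has_unit], perl = TRUE))
    total <- total + value / divisors[[u]]
  }
  total[is.na(x)] <- NA                     # keep missing inputs missing
  total
}

sample_times <- c("2 days 6 hours 44 minutes 17 seconds", "39 seconds", NA)
round(to_days(sample_times), 2)
# [1] 2.28 0.00   NA
```

The first value is 2 + 6/24 + 44/1440 + 17/86400, which rounds to 2.28.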

Again, without packages and with a little regex:

Time <- c(
  "22 hours 3 minutes 22 seconds", 
  "170 hours 15 minutes 20 seconds", 
  "39 seconds", 
  "6 hours 44 minutes 17 seconds", 
  "9 hours 54 minutes 36 seconds", 
  "357 hours 23 minutes 28 seconds", 
  "464 hours 30 minutes 7 seconds", 
  "51 seconds", 
  "31 hours 39 minutes 2 seconds", 
  "355 hours 29 minutes 10 seconds")

pat <- '(?:(\\d+) hours )?(?:(\\d+) minutes )?(?:(\\d+) seconds)?'
m <- regexpr(pat, Time, perl = TRUE)

m_st <- attr(m, 'capture.start')
m_ln <- attr(m, 'capture.length')

(mm <- mapply(function(x, y) as.numeric(substr(Time, x, y)),
              data.frame(m_st), data.frame(m_st + m_ln - 1)))

(dd <- setNames(data.frame(mm), c('h','m','s')))
#      h  m  s
# 1   22  3 22
# 2  170 15 20
# 3   NA NA 39
# 4    6 44 17
# 5    9 54 36
# 6  357 23 28
# 7  464 30  7
# 8   NA NA 51
# 9   31 39  2
# 10 355 29 10

round(rowSums(dd / data.frame(h = rep(24, nrow(dd)), m = 24 * 60, s = 24 * 60 * 60),
        na.rm = TRUE), 3)
# [1]  0.919  7.094  0.000  0.281  0.413 14.891 19.354  0.001  1.319 14.812
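
The pattern above only covers hours, minutes and seconds, and the sample vector has the "2 days" prefix removed from the fourth element. If days are needed, the same capture-group technique can probably be extended with one more optional group. The following is a sketch under that assumption. The names `Time2`, `pat2`, `st`, `ln` and `parts` are mine, and NA inputs are left out because I have not checked how the capture attributes behave for them.

```r
# Hypothetical extension: an optional leading "days" group, and "s?" so that
# singular units ("1 hour") would match as well.
Time2 <- c("2 days 6 hours 44 minutes 17 seconds",
           "22 hours 3 minutes 22 seconds",
           "39 seconds")

pat2 <- '(?:(\\d+) days? )?(?:(\\d+) hours? )?(?:(\\d+) minutes? )?(?:(\\d+) seconds?)?'
m2 <- regexpr(pat2, Time2, perl = TRUE)

st <- attr(m2, 'capture.start')
ln <- attr(m2, 'capture.length')

# One column per capture group; groups that did not match come out as NA
parts <- sapply(seq_len(ncol(st)), function(j)
  as.numeric(substr(Time2, st[, j], st[, j] + ln[, j] - 1)))

# Divide each column by the number of that unit in a day, then add up the rows
round(rowSums(sweep(parts, 2, c(1, 24, 1440, 86400), "/"), na.rm = TRUE), 3)
# [1] 2.281 0.919 0.000
```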