条件连接数据帧R
我有一个简单的问题,我无法正确把握.
I gotta a somewhat simple problem that I'm not being able to grasp correctly.
我有两个数据框,第一个仅包含日期(一整年的每个月),第二个也包含日期和一些其他数据,但是第二个变量中的月份有所变化.像下面这样:
I have two data frames, the first one containing just dates (every month for a bunch of years), the second one also with dates and some other data, but just the months for which there have been changes in the second variable. Like bellow:
df1 <- data.frame(Dates.1 = seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), 'month'))
Dates.2 <- c(seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), by = '5 months'))
Vals <- c(10, 20, 15, 44, 70, 50)
df2 <- data.frame(Dates.2, Vals)
我需要做的是将df1和df2连接起来,将小于或等于df2中日期的"Vals"中对应的值与df1中的每个日期相关联.输出应该像下面这样(我想找到一种矢量化的方式):
What I need to do is to join df1 and df2, associating the corresponding values in "Vals" for each date in df1 which is less than or eaqual to the dates in df2. The output should be as bellow (I want to find a way to do it in a vectorized fashion):
df3 <- cbind(df1,Vals3. = c(10,10,10,10,10,20,20,20,20,20,15,15,15,15,15,
44,44,44,44,44,70,70,70,70,70,50,50,50,50))
我尝试使用dplyr的joins和Fuzzyjoin包,但是我无法正确获取它(我是R语言的初学者).当然,如果任何人都可以使用这些软件包提出解决方案,我将非常高兴.谢谢!
I've tried using the dplyr's joins and the fuzzyjoin package, but I couldn't manage to get it properly (I'm a beginner in R). Of course, if anyone can come up with a solution using these packages I'll be more than glad. Tks!
dplyr
和 tidyr
的组合:
dplyr::left_join(df1,df2,by=c(Dates.1="Dates.2")) %>%
tidyr::fill(Vals,.direction="down")
结果:
Dates.1 Vals
1 1999-01-01 10
2 1999-02-01 10
3 1999-03-01 10
4 1999-04-01 10
5 1999-05-01 10
6 1999-06-01 20
7 1999-07-01 20
8 1999-08-01 20
9 1999-09-01 20
10 1999-10-01 20
(...)
顺便说一句,另一种方法是首先使用 complete
(来自 tidyr
)来跳过创建 df1
的操作:/p>
An alternative, by the way, will be to skip creating df1
in the first place by using complete
(from tidyr
):
tidyr::complete(df2,Dates.2=seq.Date(as.Date('1999/1/1'), as.Date('2001/5/1'), by = 'month')) %>%
tidyr::fill(Vals,.direction="down")
这将产生相同的结果.