Looping over a covariate in a regression using R

Problem description:

I'm trying to run 96 regressions and save the results as 96 different objects. To complicate things, I want the subscript on one of the covariates in the model to change with each of the 96 runs as well. I've almost solved the problem, but I've hit a wall. The code so far is:

for(i in 1:96){

  assign(paste("z.out", i,sep=""), lm(rMonExp_EGM~ TE_i + Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + 
  as.factor(LGA),data=Pokies))

}

This works on the object-creation side (I get z.out1 through z.out96), but I can't get the subscript on the covariate to change as well.

The dataset contains 96 variables called TE_1, TE_2 ... TE_96. The subscript on TE_ (the "i") therefore needs to change to match each object I create. That is, z.out1 should hold the results from this model:

z.out1 <- lm(rMonExp_EGM~ TE_1 + Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + as.factor(LGA),data=Pokies)

And z.out96 should be:

z.out96 <- lm(rMonExp_EGM~ TE_96+ Month2+Month3+Month4+Month5+Month6+Month7+Month8+Month9+
  Month10+Month11+Month12+Yrs_minus_2004 + as.factor(LGA),data=Pokies)

Hopefully this makes sense. I'm grateful for any tips or advice.

I would put the results in a list and avoid both the for loop and the assign statements.
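If you would rather keep the loop, here is a minimal sketch of one way to do it. A name such as TE_i written directly in a formula is taken literally and the loop index is never substituted into it, so the formula has to be built from text. The data frame toy and its columns are made up for illustration (3 TE_ columns instead of 96); they are not the Pokies data.

```r
# Hypothetical stand-in for the Pokies data: 3 TE_ columns instead of 96
set.seed(1)
toy <- data.frame(
  y      = rnorm(20),
  TE_1   = rnorm(20),
  TE_2   = rnorm(20),
  TE_3   = rnorm(20),
  Month2 = rbinom(20, 1, 0.5)
)

fits <- vector("list", 3)
for (i in 1:3) {
  # Build the formula as text so that the value of i ends up in the name
  f <- as.formula(paste0("y ~ TE_", i, " + Month2"))
  fits[[i]] <- lm(f, data = toy)
}
names(fits) <- paste0("z.out", 1:3)
```

Storing the fits in a list rather than in 96 separate objects means that later steps can run over all of them with lapply.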

You can use a combination of reformulate and update to create your formulas:

# Base formula without the TE_ term (response named as in the question)
orig_formula <- rMonExp_EGM ~ Month2 + Month3 + Month4 + Month5 + Month6 + Month7 +
  Month8 + Month9 + Month10 + Month11 + Month12 + Yrs_minus_2004 + as.factor(LGA)

te_variables <- paste0('TE_', 1:96)
# Or, on R older than 2.15.0 (which has no paste0)
# te_variables <- paste('TE', 1:96, sep = '_')

# One formula per TE_ variable: prepend it to the right-hand side of orig_formula
new_formula <- lapply(te_variables, function(x, orig = orig_formula) {
  new <- reformulate(c(x, '.'))
  update(orig, new)
})
## it works!
new_formula[[1]]
## rMonExp_EGM ~ TE_1 + Month2 + Month3 + Month4 + Month5 + Month6 +
##   Month7 + Month8 + Month9 + Month10 + Month11 + Month12 +
##   Yrs_minus_2004 + as.factor(LGA)
new_formula[[2]]
## rMonExp_EGM ~ TE_2 + Month2 + Month3 + Month4 + Month5 + Month6 +
##   Month7 + Month8 + Month9 + Month10 + Month11 + Month12 +
##   Yrs_minus_2004 + as.factor(LGA)
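To see the two steps separately, here is a small sketch on a toy formula. The names base, y, a and b are placeholders for illustration only: reformulate builds a one-sided formula from character strings, and update replaces the "." in it with the right-hand side of the original formula while keeping the original response.

```r
# Placeholder base formula (y, a and b are illustrative names)
base <- y ~ a + b

# Step 1: a one-sided formula whose "." stands for "whatever was there before"
rhs <- reformulate(c("TE_1", "."))

# Step 2: merge it into the base formula; the response y is carried over
f <- update(base, rhs)

# Inspect the terms of the resulting formula
term_labels <- attr(terms(f), "term.labels")
print(f)
print(term_labels)
```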


models <- lapply(new_formula, lm, data = Pokies)

There should now be 96 elements in the list models.

You can name them to reflect the names you originally planned:

names(models) <- paste0('z.out', 1:96)
# or if you don't have a current version of R
# names(models) <-paste('z.out', 1:96 ,sep = '' )  

Then access an individual model with

 models$z.out5

or create summaries of all the models

 summaries <- lapply(models, summary)

and so on:

 # just the coefficients
 coefficients <- lapply(models, coef)

 # the table with coefficient estimates and standard.errors

 coef_tables <- lapply(summaries, '[[', 'coefficients')
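
If what you mainly want is the TE_ estimate from each model, one possible follow-up is to pull that single row out of every coefficient table and stack the rows into one matrix. This is only a sketch on made-up data (toy, with two TE_ columns and a covariate x, all hypothetical), not the Pokies data.

```r
# Hypothetical toy data with two TE_ columns and one other covariate
set.seed(42)
toy <- data.frame(y = rnorm(30), TE_1 = rnorm(30), TE_2 = rnorm(30), x = rnorm(30))

te_variables <- paste0("TE_", 1:2)
models <- lapply(te_variables, function(v) {
  lm(reformulate(c(v, "x"), response = "y"), data = toy)
})
names(models) <- paste0("z.out", 1:2)

# From each model's coefficient table, keep only the row for its own TE_ term
te_rows <- Map(function(m, v) coef(summary(m))[v, ], models, te_variables)

# Stack the rows: one row per model, one column per statistic
te_table <- do.call(rbind, te_rows)
print(te_table)
```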