Looping subsets in plm

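I am trying to code something very simple (I think) in R, but I can't seem to get it to work. I have a dataset with 50 countries (numbered 1 to 50), 15 years per country, and about 20 variables per country. For now I am only testing one variable (OS) against my dependent variable (SMD). I would like to do this in a loop, country by country, so that I get output for each country rather than one overall output.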

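I thought it would be wise to create a subset first (to look at country 1 only; later my loop should increment the country number and test country 2, and so on). I believe the regression at the bottom of this page should then give me the output for country 1 rather than the overall result for the whole dataset. However, I keep getting these errors: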

> pdata <- plm.data(newdata, index=c("Country","Date"))
  series    are constants and have been removed
> pooling <- plm(Y ~ X, data=pdata, model= "pooling") 
  series Country, xRegion are constants and have been removed
  Error in model.matrix.pFormula(formula, data, rhs = 1, model = model,  : 
  NA in the individual index variable
> summary(pooling)
  Error in summary(pooling) : object 'pooling' not found

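I may be looking at this all wrong, but I believe there is no point in programming the loop itself until this works. Any advice on fixing my error, or on other ways to program the loop, is much appreciated.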

My code:

rm(list = ls())
mydata <- read.table(file = file.choose(), header = TRUE, dec = ",")
names(mydata)
attach(mydata)

Y <- cbind(SMD)
X <- cbind(OS)

newdata <- subset(mydata, Country %in% c(1))

newdata

pdata <- plm.data(newdata, index=c("Country","Date"))
pooling <- plm(Y ~ X, data=pdata, model= "pooling") 
summary(pooling)

Edit: a data sample for the first 2 countries, which results in the same error:

structure(list(Region = c(1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L, 1L,1L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L),
Country = c(1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L,1L, 1L,1L,1L,1L,1L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,2L,
Date = c(1995L,1996L,1997L,1998L,1999L,2000L,2001L,2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,2010L,2011L,2012L,2013L,2014L,1995L,1996L,1997L,1998L,1999L,2000L,2001L,2002L,2003L,2004L,2005L,2006L,2007L,2008L,2009L,2010L,2011L,2012L,2013L,2014L),
OS = structure(c(19L,25L,27L,15L, 22L,20L,23L,9L,7L,5L,2L,1L,4L,3L,6L,10L,11L,13L,11L,8L,26L,25L,31L,29L,28L,21L,30L,24L,24L, 16L,11L,14L,12L,17L,18L,29L,32L,32L,33L,34L), .Label = c("51.5","52.2","55.6","56.4","56.7","57.7","57.8","58.3","59","59.2","59.6","59.9","60.2","60.4","61.1","61.2","62.2","62.3","62.8","63.2","63.3","63.8","63.9","64.2","64.3","64.5","64.7","65.3","65.5","65.6","66.4","68"
(7L,12L,20L,21L,17L,15L,13L,10L,14L,22L,23L,33L,1L,32L,29L,34L,28L,25L,NA,NA,9L,6L,8L,4L,2L,35L,3L,36L,5L,11L,16L,18L,24L,19L,26L,31L,27L,30L, ), .Label = c("100.3565662","13.44788845","13.45858747","13.56815534","15.05892471","17.63789658","18.04088718","18.3101351","19.34226196","21.25530884","21.54423145","23.75898948","24.08770926","26.39817342","29.44079001","31.40605191","34.46667996","34.52913657","35.66070947","36.4419931","39.16875621","44.0126137","45.72949566","49.13062679","54.83730247","56.87886311","59.80971583","60.5658962","69.20148901","70.91362874","72.64845214","73.97139238","75.20140919","76.18378138","9.570435019","9.867635305"), class = "factor")),
.Names = c("Region","Country","Date","OS","SMD"), class = "data.frame", row.names = c(NA,-40L))


Are you sure you need to use plm? The following produces a list of summaries, one per country.

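# convert factors to numeric via as.character
# (as.numeric alone would return the internal level codes)
mydata$SMD <- as.numeric(as.character(mydata$SMD))
mydata$OS  <- as.numeric(as.character(mydata$OS))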

# Using lapply(...)
smry <- lapply(unique(mydata$Country),
               function(cntry)
                 summary(lm(SMD~OS,data=mydata[mydata$Country==cntry,])))
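# Same thing, using for loop
smry <- list()
for (cntry in unique(mydata$Country)) {
  # store each summary under its country id;
  # list(smry, ...) would nest the results instead of appending
  smry[[as.character(cntry)]] <-
    summary(lm(SMD~OS,data=mydata[mydata$Country==cntry,]))
}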
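
If only the coefficients are needed rather than full summaries, the same per-country idea can be sketched with split() and lapply(). This is a minimal sketch on a made-up toy panel: the data frame `toy`, its values, and the names `by_country`, `fits` and `coef_table` are assumptions for illustration, not the asker's data.

```r
# Hypothetical toy panel: 3 countries, 10 years each (not the asker's data)
set.seed(1)
toy <- data.frame(
  Country = rep(1:3, each = 10),
  Date    = rep(2001:2010, times = 3),
  OS      = runif(30, min = 50, max = 70)
)
toy$SMD <- 5 + 0.8 * toy$OS + rnorm(30)

# One data frame per country, then one regression per piece
by_country <- split(toy, toy$Country)
fits <- lapply(by_country, function(d) lm(SMD ~ OS, data = d))

# Collect intercept and slope into one row per country
coef_table <- do.call(rbind, lapply(fits, coef))
print(coef_table)
```

Because split() names the pieces by country id, the list `fits` and the rows of `coef_table` carry those ids, so an individual model can be retrieved with, for example, `summary(fits[["2"]])`.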

In your dataset, SMD and OS are factors, so they need to be converted to numeric first.
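
The conversion has a pitfall that a small example makes visible: calling as.numeric() directly on a factor returns its internal level codes, not the numbers shown in the labels. The vector `f` below is a made-up example using three of the OS labels from the sample. A plausible cause of the factors, though this is only a guess from the sample, is that the file uses "." as the decimal mark while it is read with dec = ",".

```r
# Made-up factor holding numbers as text labels
f <- factor(c("51.5", "68", "59.2"))

# Direct conversion yields the level codes (levels are sorted as text)
codes <- as.numeric(f)
print(codes)   # 1 3 2

# Going through character first recovers the actual values
vals <- as.numeric(as.character(f))
print(vals)    # 51.5 68.0 59.2
```

If the decimal mark is indeed the cause, reading the file with a matching `dec` argument should make the conversion step unnecessary.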
