This may be a very simple problem, but I can't seem to get past it. I have column names such as X100.4, X100.-4, X100.-5, and so on. I'm trying to run a linear regression, but when I do I get an error:
lm<-lm(X986~X241+X243+X280+X282+X987+X143.2+X239.0+X491.61+X350.-4,data=train)
Error in terms.formula(formula, data = data) :
invalid model formula in ExtractVars
It works fine without the variable X350.-4, so I'm assuming that is the problem. I tried 'X350.-4' and "X350.-4", but both yielded the same error. I also tried wrapping all of the variables in "", but that did not work either.
2 Answers
#1
4
You can use backticks:
DF <- data.frame(x=1:10, y=rnorm(10))
names(DF)[1] <- "x.-1"
lm(y~`x.-1`, data=DF)
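When there are many such columns, typing the backticks by hand gets tedious. Here is a minimal sketch that builds the formula from the column names instead; the data frame, the column names, and the variable names `preds` and `f` are made up for illustration.

```r
set.seed(1)  # only so the illustrative data is reproducible

# Toy data with two non-syntactic column names (hypothetical values).
DF <- data.frame(y = rnorm(10), a = 1:10, b = rnorm(10))
names(DF)[2:3] <- c("X350.-4", "X100.-5")

# Wrap every predictor name in backticks, then join them with "+".
preds <- setdiff(names(DF), "y")
f <- as.formula(paste("y ~", paste0("`", preds, "`", collapse = " + ")))

fit <- lm(f, data = DF)
print(f)
print(coef(fit))
```

This leaves the original column names untouched, which may matter if they are needed later for reporting.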
But it would be better to sanitize the names:
names(DF) <- make.names(names(DF))
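To see what that sanitizing would do to names like the ones in the question, here is a small sketch. The vector `nms` is an illustrative sample, not the asker's full set of columns.

```r
# Column names in the style of the question (illustrative sample).
nms <- c("X100.4", "X100.-4", "X350.-4")

# make.names() translates each invalid character (here "-") to a dot.
clean <- make.names(nms)
print(clean)

# Worth checking that the cleaned names are still distinct.
print(anyDuplicated(clean) == 0)
```

The already valid name `X100.4` is left alone, while the minus sign in the others becomes a second dot.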
#2
1
The problem is with the minus sign ("-"), not the decimals. So if you really need these column names, either use @Roland's approach, or replace the minus signs with something else:
colnames(data) <- gsub("-", "_", colnames(data), fixed = TRUE)
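A quick way to see why the minus sign is the culprit is to ask R which variables it finds in the formula. This is only a sketch: the formula `f` and the vectors `old` and `new` are illustrative, and the reading of the error (a stray constant 4 left in the formula) is my interpretation rather than something stated in the question.

```r
# Unquoted, the last term is parsed as the variable `X350.` minus the number 4.
f <- y ~ X241 + X350.-4
print(all.vars(f))

# Replacing the minus sign yields a name that needs no quoting.
old <- c("X241", "X350.-4")
new <- gsub("-", "_", old, fixed = TRUE)
print(new)

# A name is syntactic when make.names() leaves it unchanged.
print(new == make.names(new))
```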
Using make.names(...) is a little dicey because it can generate collisions (multiple columns with the same name). Consider:
DF <- data.frame(y=1:3,x.1=6:8,z=11:13)
colnames(DF)[3] <- "x-1"
DF
  y x.1 x-1
1 1   6  11
2 2   7  12
3 3   8  13
names(DF) <- make.names(names(DF))
DF
  y x.1 x.1
1 1   6  11
2 2   7  12
3 3   8  13
You may need to use:
names(DF) <- make.names(names(DF), unique=TRUE)
DF
  y x.1 x.1.1
1 1   6    11
2 2   7    12
3 3   8    13
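If you would rather know about a collision before renaming anything, one option is to compare the candidate names first. This sketch reuses the example data frame from above; `candidate` and `collides` are names I made up for illustration.

```r
DF <- data.frame(y = 1:3, x.1 = 6:8, z = 11:13)
colnames(DF)[3] <- "x-1"

# Compute the cleaned names without assigning them yet.
candidate <- make.names(names(DF))

# Flag every name that would end up shared with another column.
collides <- duplicated(candidate) | duplicated(candidate, fromLast = TRUE)
print(data.frame(original = names(DF), cleaned = candidate, collides = collides))

# Only then rename, letting make.names() disambiguate the clashes.
names(DF) <- make.names(names(DF), unique = TRUE)
print(names(DF))
```

The printed table makes it easy to spot which original columns map to the same cleaned name, so you can decide whether the automatic suffix is acceptable.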