Reading a CSV with dates and numbers

Time: 2021-06-25 20:08:01

I have a problem when I import a CSV file in R.

Example lines to import:

2010-07-27;91
2010-07-26;93
2010-07-23;88

I use this statement:

data <- read.csv2(file="...", sep=";", dec=".", header=FALSE)

When I try to combine this data with other results produced by statistical analysis using cbind, the date is shown as an integer because it was imported as a factor.

If I try to show it as a string using as.character, the numeric data are converted to character too, so they are unusable for statistical procedures.
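The symptom described above can be reproduced with a small sketch. The vectors `dates` and `values` are hypothetical stand-ins for the imported columns, not the asker's actual objects, and the sketch assumes the columns were passed to cbind as plain vectors.

```r
# Hypothetical stand-ins for the imported columns.
dates  <- factor(c("2010-07-27", "2010-07-26", "2010-07-23"))
values <- c(91, 93, 88)

# cbind() on plain vectors builds a matrix, so the factor is reduced
# to its internal integer codes and the date labels are lost.
m <- cbind(dates, values)
print(m)

# A data frame can hold a Date column next to a numeric column.
df <- data.frame(date = as.Date(dates), value = values)
str(df)
```

Keeping the columns in a data frame avoids forcing them into a single type, which is what a matrix requires.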

3 solutions

#1


25  

Use the colClasses argument:

data <- read.csv2(file="...", sep=";", dec=".", header=FALSE,
     colClasses=c("Date",NA))

Here NA means "use the default conversion for this column".
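As a self-contained sketch of the colClasses approach, the sample lines from the question can be supplied inline through the `text` argument instead of a file. The name `csv_text` is only an illustrative assumption.

```r
# Sample lines from the question, supplied inline instead of a file.
csv_text <- "2010-07-27;91
2010-07-26;93
2010-07-23;88"

# "Date" converts the first column with as.Date();
# NA leaves the second column to the default type detection.
data <- read.csv2(text = csv_text, dec = ".", header = FALSE,
                  colClasses = c("Date", NA))

str(data)
mean(data$V2)  # the second column is still usable as a number
```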

Alternatively, after the import you can convert the factor column to Date with:

data[[1]] <- as.Date(data[[1]])
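A minimal sketch of the after-import conversion follows. It sets `stringsAsFactors = TRUE` explicitly, as an assumption, to reproduce the factor import from the question; since R 4.0.0 strings are read as character by default, and as.Date handles that case as well.

```r
csv_text <- "2010-07-27;91
2010-07-26;93
2010-07-23;88"

# Force the factor import described in the question.
data <- read.csv2(text = csv_text, dec = ".", header = FALSE,
                  stringsAsFactors = TRUE)
was_factor <- is.factor(data[[1]])

# as.Date() converts through the factor labels, not the integer codes.
data[[1]] <- as.Date(data[[1]])
str(data)
```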

#2


9  

Perhaps you want to convert the character values to meaningful time values. In that case, POSIXt date-time objects are a good choice.

Given your data file, I would do something like this:

data <- read.table(file="...", sep = ";", as.is = TRUE)
data[,1] <- strptime(data[,1], "%Y-%m-%d")

See ?strptime for more details.
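One variation on this approach, offered as a sketch rather than as the answerer's own code: strptime returns a POSIXlt object, which is a list internally, so wrapping it in as.POSIXct gives a plain vector that stores cleanly in a data frame column. The `tz = "UTC"` setting and the name `csv_text` are assumptions made here for reproducibility.

```r
csv_text <- "2010-07-27;91
2010-07-26;93
2010-07-23;88"

# as.is = TRUE keeps the first column as character.
data <- read.table(text = csv_text, sep = ";", as.is = TRUE)

# Parse the strings, then store them as POSIXct.
data[[1]] <- as.POSIXct(strptime(data[[1]], "%Y-%m-%d", tz = "UTC"))

str(data)
difftime(data[[1]][1], data[[1]][2], units = "days")
```

If only calendar dates are needed, as.Date is usually the lighter choice; POSIXct becomes useful once times of day are involved.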

NOTE: If you are going to specify every property of the file, just use read.table. The other read.xxx variants exist only to shorten the call by presetting defaults. Presumably you chose read.csv2 because it defaults to sep = ";", so there is no need to specify it again; not having to do so is the entire reason the function exists. Personally, I only use read.table because I can never remember the names and defaults of all the variants. In your case it is also the shortest call, because its defaults (header = FALSE, dec = ".") already match your file.
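To illustrate the point about defaults, the sketch below reads the same sample lines both ways and compares the results. For this data the two calls should produce the same data frame; the variable names are illustrative assumptions.

```r
csv_text <- "2010-07-27;91
2010-07-26;93
2010-07-23;88"

# read.csv2 presets sep = ";" but also dec = "," and header = TRUE,
# so two of its defaults have to be overridden for this file.
via_csv2 <- read.csv2(text = csv_text, dec = ".", header = FALSE)

# read.table already defaults to header = FALSE and dec = ".",
# so only the separator has to be given.
via_table <- read.table(text = csv_text, sep = ";")

identical(via_csv2, via_table)
```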

#3


7  

Add as.is = TRUE to the read.csv2 call.
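A short sketch of this suggestion, using the sample lines from the question inline (the name `csv_text` is an assumption): with as.is = TRUE the date column arrives as character instead of factor, the numeric column is unaffected, and the dates can then be converted explicitly.

```r
csv_text <- "2010-07-27;91
2010-07-26;93
2010-07-23;88"

# as.is = TRUE suppresses the conversion of strings to factors.
data <- read.csv2(text = csv_text, dec = ".", header = FALSE, as.is = TRUE)
str(data)

# Convert the character column to Date when date arithmetic is needed.
data$V1 <- as.Date(data$V1)
range(data$V1)
```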
