How to read a text file into R?

Date: 2022-09-10 22:44:41

I'm having a problem reading a text file into R. The text file has 8 columns and a header, and looks exactly like this:

ID          1990    1991    1992    1993    1994    1995    1996
A           36.88   45.48   52.46   111.31  138.45  121.09  122.62
B           19.11   27.97   37.14   47.68   60.78   35.84   38.64
C           56.21   74.94   92.3    118.62  138.13  104.65  113.98
D           30.48   51.54   61.57   99.87   80.9    84.97   99.34

When I run the following, I get this error:

> extra<- read.table("extrab.txt", header=T, sep="\t")
Error in make.names(col.names, unique = TRUE) : 
  invalid multibyte string at '<ff><fe>I'

So I tried adding fileEncoding:

> extra<- read.table("extrab.txt", header=T, sep="\t", fileEncoding="UCS-2LE")

This worked, but I ended up with a data frame with a single variable: everything from ID through 1996 was treated as one column. Is there a way to solve this?

I'm adding a few more lines on this problem, because I found a different error when I tried to import the file through R.

2 solutions

#1


2  

As per this SO question, the error you're getting seems to be related to file encoding.

Option 1:

You likely just need to figure out the right file encoding to use.

Example:

extra<- read.table("extrab.txt", header=T, sep="\t", fileEncoding="latin1")
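
Rather than guessing, one way to narrow down the encoding is to look at the first bytes of the file. The `<ff><fe>I` in the error message looks like a UTF-16 little-endian byte-order mark followed by the `I` of `ID`, which would fit the fact that `UCS-2LE` was accepted in the question. The sketch below is only an illustration: `guess_bom_encoding` is a hypothetical helper, not a base R function, it only recognises three common byte-order marks, and it does not try to tell UTF-32LE apart from UTF-16LE.

```r
# Hypothetical helper (not part of base R): guess an encoding label
# from the byte-order mark at the start of a file.
guess_bom_encoding <- function(path) {
  bytes <- readBin(path, what = "raw", n = 4L)
  starts_with <- function(sig) {
    sig <- as.raw(sig)
    length(bytes) >= length(sig) && all(bytes[seq_along(sig)] == sig)
  }
  if (starts_with(c(0xEF, 0xBB, 0xBF))) return("UTF-8-BOM")
  if (starts_with(c(0xFF, 0xFE))) return("UTF-16LE")
  if (starts_with(c(0xFE, 0xFF))) return("UTF-16BE")
  NA_character_  # no recognised byte-order mark
}

# Demo on a temporary file that starts with FF FE followed by "ID" in UTF-16LE.
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xFF, 0xFE, 0x49, 0x00, 0x44, 0x00)), tmp)
guess_bom_encoding(tmp)  # "UTF-16LE"
```

If the helper returns `NA`, the file has no byte-order mark and the encoding has to be determined some other way.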

Option 2:

You can try opening the file in Notepad (or any other text editor) and then using "Save As" with a common format like ANSI, Unicode or UTF-8.

In Windows Notepad, the Save As dialog has an "Encoding" dropdown. ANSI should work fine.
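
The same re-save can also be done from R instead of a text editor. The following is a minimal sketch, assuming the source file really is UTF-16LE and tab-separated. The sample file is generated inside the example (without a byte-order mark, to keep it simple) and uses temporary paths, so `src` and `dst` are placeholders rather than the asker's actual file.

```r
# Build a small UTF-16LE sample file so the sketch is self-contained.
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".txt")
sample_text <- "ID\t1990\t1991\nA\t36.88\t45.48\n"
writeBin(iconv(sample_text, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], src)

# Read the lines through a connection that decodes UTF-16LE.
con_in <- file(src, open = "r", encoding = "UTF-16LE")
lines <- readLines(con_in)
close(con_in)

# Write them back out as UTF-8.
con_out <- file(dst, open = "w", encoding = "UTF-8")
writeLines(lines, con_out)
close(con_out)

# The converted copy can then be read without a fileEncoding argument.
extra <- read.table(dst, header = TRUE, sep = "\t")
```

After the conversion, the separator still has to match the file: `sep = "\t"` only works here because the sample was written with tabs.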

#2


1  

Now that you aren't getting the file encoding problem, it might just be that your separator is actually not a tab. Try:

extra<- read.table("extrab.txt", header=T, fileEncoding="UCS-2LE")

With no sep argument, read.table separates on any whitespace (one or more spaces, tabs or newlines).
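
To see the difference without touching the original file, the two calls can be compared on an in-memory copy of the data. This sketch assumes the columns are separated by runs of spaces, as they appear in the question. It uses the first three year columns of rows A and B, and passes them through the `text` argument of `read.table` instead of a file.

```r
# First rows of the question's data, with spaces between the columns.
sample_text <- "ID          1990    1991    1992
A           36.88   45.48   52.46
B           19.11   27.97   37.14"

# With sep = "\t" there is no tab to split on, so each line is one field.
with_tab <- read.table(text = sample_text, header = TRUE, sep = "\t")

# With the default sep, any run of whitespace separates fields.
with_ws <- read.table(text = sample_text, header = TRUE)

ncol(with_tab)  # 1
ncol(with_ws)   # 4
names(with_ws)  # "ID" "X1990" "X1991" "X1992"
```

The year columns get an `X` prefix because `read.table` turns the header into syntactically valid names by default; passing `check.names = FALSE` keeps them as `1990`, `1991`, and so on.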
