How do you convert a data frame column to a numeric type?

Date: 2022-01-03 16:17:40

15 answers

#1


217  

Since nobody has (still) received a check mark, I assume that you have some practical issue in mind, mostly because you haven't specified what type of vector you want to convert to numeric. I suggest applying the transform function to complete your task.

Now I'm about to demonstrate a certain "conversion anomaly":

# create dummy data.frame
d <- data.frame(char = letters[1:5], 
                fake_char = as.character(1:5), 
                fac = factor(1:5), 
                char_fac = factor(letters[1:5]), 
                num = 1:5, stringsAsFactors = FALSE)

Let us have a glance at the data.frame:

> d
  char fake_char fac char_fac num
1    a         1   1        a   1
2    b         2   2        b   2
3    c         3   3        c   3
4    d         4   4        d   4
5    e         5   5        e   5

and let us run:

> sapply(d, mode)
       char   fake_char         fac    char_fac         num 
"character" "character"   "numeric"   "numeric"   "numeric" 
> sapply(d, class)
       char   fake_char         fac    char_fac         num 
"character" "character"    "factor"    "factor"   "integer" 

Now you probably ask yourself "Where's the anomaly?" Well, I've bumped into quite peculiar things in R, and this is not the most confounding one, but it can confuse you, especially if you read this before rolling into bed.

Here goes: the first two columns are character. I've deliberately called the 2nd one fake_char. Spot the similarity of this character variable with the one that Dirk created in his reply. It's actually a numeric vector converted to character. The 3rd and 4th columns are factors, and the last one is "purely" numeric.

If you use the transform function, you can convert fake_char into numeric, but not the char variable itself.

> transform(d, char = as.numeric(char))
  char fake_char fac char_fac num
1   NA         1   1        a   1
2   NA         2   2        b   2
3   NA         3   3        c   3
4   NA         4   4        d   4
5   NA         5   5        e   5
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion

but if you do the same thing on fake_char and char_fac, you'll be lucky and get away with no NAs (for char_fac, the numbers you get are the factor's integer level codes):

> transform(d, fake_char = as.numeric(fake_char), 
               char_fac = as.numeric(char_fac))

  char fake_char fac char_fac num
1    a         1   1        1   1
2    b         2   2        2   2
3    c         3   3        3   3
4    d         4   4        4   4
5    e         5   5        5   5

If you save the transformed data.frame and check for mode and class, you'll get:

> D <- transform(d, fake_char = as.numeric(fake_char), 
                    char_fac = as.numeric(char_fac))

> sapply(D, mode)
       char   fake_char         fac    char_fac         num 
"character"   "numeric"   "numeric"   "numeric"   "numeric" 
> sapply(D, class)
       char   fake_char         fac    char_fac         num 
"character"   "numeric"    "factor"   "numeric"   "integer"

So, the conclusion is: yes, you can convert a character vector into a numeric one, but only if its elements are "convertible" to numeric. If there's just one non-numeric element in the vector, you'll get a warning (and an NA in its place) when trying to convert that vector to numeric.

And just to prove my point:

> err <- c(1, "b", 3, 4, "e")
> mode(err)
[1] "character"
> class(err)
[1] "character"
> char <- as.numeric(err)
Warning message:
NAs introduced by coercion 
> char
[1]  1 NA  3  4 NA
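
Building on the demonstration above, one way to avoid silently introducing NAs is to convert only when every non-missing element parses as a number. The following is a minimal sketch; the helper name `to_numeric_if_clean` is hypothetical and not part of the original answer.

```r
# Hypothetical helper (not from the original answer): convert only when it is safe.
to_numeric_if_clean <- function(x) {
  x_chr <- as.character(x)                      # factors go through their labels
  parsed <- suppressWarnings(as.numeric(x_chr))
  # a value "failed" if it was present before parsing but is NA afterwards
  failed <- !is.na(x_chr) & is.na(parsed)
  if (any(failed)) x else parsed
}

clean <- to_numeric_if_clean(c("1", "2", NA, "4.5"))
dirty <- to_numeric_if_clean(c("1", "b", "3"))
class(clean)  # "numeric"
class(dirty)  # "character"
```

The input is returned unchanged when any element fails to parse, so the caller can decide how to handle the offending values.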

And now, just for fun (or practice), try to guess the output of these commands:

> fac <- as.factor(err)
> fac
???
> num <- as.numeric(fac)
> num
???

Kind regards to Patrick Burns! =)

#2


109  

Something that has helped me: if you have ranges of variables to convert (or just more than one), you can use sapply.

A bit nonsensical, but just as an example:

data(cars)
cars[, 1:2] <- sapply(cars[, 1:2], as.factor)

Say columns 3, 6-15 and 37 of your data frame need to be converted to numeric; one could:

dat[, c(3,6:15,37)] <- sapply(dat[, c(3,6:15,37)], as.numeric)
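
One caveat worth sketching: sapply() simplifies its result to a matrix, so factor results are flattened to character, and factor inputs passed to as.numeric yield level codes. A variant with lapply() keeps each column's own type. The data frame `dat` and the column names below are made up for illustration.

```r
# Hypothetical data frame standing in for `dat`
dat <- data.frame(id = c("a", "b", "c"),
                  x  = c("1", "2", "3"),
                  y  = factor(c("10", "20", "30")),
                  stringsAsFactors = FALSE)

cols <- c("x", "y")
# lapply() returns a list, so each column keeps its own type on assignment;
# as.character() first makes the factor labels (not the level codes) get parsed
dat[cols] <- lapply(dat[cols], function(col) as.numeric(as.character(col)))

sapply(dat, class)
```

Columns can be selected by position as well, for example `dat[c(3, 6:15, 37)]`.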

#3


62  

If x is the name of a column in data frame dat, and x is a factor, use:

as.numeric(as.character(dat$x))
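
To see why the detour through as.character() matters, here is a small illustration on a made-up factor `f`. The third form is the alternative mentioned in R's ?factor help page.

```r
f <- factor(c("10", "20", "10", "35"))

as.numeric(f)                  # level codes: 1 2 1 3
as.numeric(as.character(f))    # labels parsed as numbers: 10 20 10 35
as.numeric(levels(f))[f]       # same result, parsing each level only once
```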

#4


17  

I would have added a comment (can't, low rating).

Just to add to user276042 and pangratz:

dat$x = as.numeric(as.character(dat$x))

This will overwrite the values of the existing column x.

#5


14  

Tim is correct, and Shane has an omission. Here are additional examples:

R> df <- data.frame(a = as.character(10:15))
R> df <- data.frame(df, num = as.numeric(df$a), 
                        numchr = as.numeric(as.character(df$a)))
R> df
   a num numchr
1 10   1     10
2 11   2     11
3 12   3     12
4 13   4     13
5 14   5     14
6 15   6     15
R> summary(df)
  a          num           numchr    
 10:1   Min.   :1.00   Min.   :10.0  
 11:1   1st Qu.:2.25   1st Qu.:11.2  
 12:1   Median :3.50   Median :12.5  
 13:1   Mean   :3.50   Mean   :12.5  
 14:1   3rd Qu.:4.75   3rd Qu.:13.8  
 15:1   Max.   :6.00   Max.   :15.0  
R> 

Our data.frame now has a summary of the factor column (counts), a numeric summary of as.numeric(), which is wrong because it picked up the integer factor level codes, and the (correct) summary of as.numeric(as.character()).
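
Note that since R 4.0.0, data.frame() no longer turns strings into factors by default, so the output above comes from an older default. A minimal sketch that reproduces the factor behaviour explicitly:

```r
# stringsAsFactors = TRUE reproduces the pre-4.0 default the output above relies on
df <- data.frame(a = as.character(10:15), stringsAsFactors = TRUE)
df$num    <- as.numeric(df$a)                # level codes 1..6
df$numchr <- as.numeric(as.character(df$a))  # the values 10..15
str(df)
```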

#6


12  

With the following code you can convert all data frame columns to numeric (X is the data frame whose columns we want to convert):

as.data.frame(lapply(X, as.numeric))

For converting a whole matrix into numeric you have two ways. Either:

mode(X) <- "numeric"

or:

X <- apply(X, 2, as.numeric)

Alternatively you can use the data.matrix function to convert everything into numeric, but be aware that factors are replaced by their internal integer codes rather than their labels, so it is safer to convert every column through character first:

# parse each column's labels as numbers, then build the numeric matrix
X[] <- lapply(X, function(col) as.numeric(as.character(col)))
X <- data.matrix(X)

I usually use this last one if I want to convert to matrix and numeric simultaneously.

#7


8  

If you run into problems with:

as.numeric(as.character(dat$x))

Take a look at your decimal marks. If they are "," instead of "." (e.g. "5,3"), the above won't work.

A potential solution is:

as.numeric(gsub(",", ".", dat$x))

I believe this is quite common in some non-English-speaking countries.
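
As a small illustration on a made-up vector `x`, the replacement can be made literal with fixed = TRUE, and type.convert() accepts the decimal mark through its dec argument. This sketch assumes the strings contain no thousands separators.

```r
x <- c("5,3", "12,75", "7")

# fixed = TRUE treats "," as a literal string rather than a regular expression
via_gsub <- as.numeric(gsub(",", ".", x, fixed = TRUE))

# type.convert() can also be told which decimal mark to expect
via_type_convert <- type.convert(x, dec = ",", as.is = TRUE)

via_gsub
via_type_convert
```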

#8


6  

Universal way using type.convert() and rapply():

convert_types <- function(x) {
    stopifnot(is.list(x))
    x[] <- rapply(x, utils::type.convert, classes = "character",
                  how = "replace", as.is = TRUE)
    return(x)
}
d <- data.frame(char = letters[1:5], 
                fake_char = as.character(1:5), 
                fac = factor(1:5), 
                char_fac = factor(letters[1:5]), 
                num = 1:5, stringsAsFactors = FALSE)
sapply(d, class)
#>        char   fake_char         fac    char_fac         num 
#> "character" "character"    "factor"    "factor"   "integer"
sapply(convert_types(d), class)
#>        char   fake_char         fac    char_fac         num 
#> "character"   "integer"    "factor"    "factor"   "integer"

#9


4  

While your question is strictly about numeric, there are many conversions that are difficult to understand when beginning R. I'll aim to address methods to help. This question is similar to This Question.

Type conversion can be a pain in R because (1) factors can't be converted directly to numeric; they need to be converted to character class first, (2) dates are a special case that you typically need to deal with separately, and (3) looping across data frame columns can be tricky. Fortunately, the "tidyverse" has solved most of the issues.

This solution uses mutate_each() to apply a function to all columns in a data frame (mutate_each() and funs() were deprecated in later dplyr releases in favour of across()). In this case, we want to apply the type.convert() function, which converts strings to numeric where it can. Because R loves factors (not sure why), character columns that should stay character get changed to factor. To fix this, the mutate_if() function is used to detect columns that are factors and change them to character. Last, I wanted to show how lubridate can be used to change a timestamp in character class to date-time, because this is also often a stumbling block for beginners.

library(tidyverse) 
library(lubridate)

# Recreate data that needs converted to numeric, date-time, etc
data_df
#> # A tibble: 5 × 9
#>             TIMESTAMP SYMBOL    EX  PRICE  SIZE  COND   BID BIDSIZ   OFR
#>                 <chr>  <chr> <chr>  <chr> <chr> <chr> <chr>  <chr> <chr>
#> 1 2012-05-04 09:30:00    BAC     T 7.8900 38538     F  7.89    523  7.90
#> 2 2012-05-04 09:30:01    BAC     Z 7.8850   288     @  7.88  61033  7.90
#> 3 2012-05-04 09:30:03    BAC     X 7.8900  1000     @  7.88   1974  7.89
#> 4 2012-05-04 09:30:07    BAC     T 7.8900 19052     F  7.88   1058  7.89
#> 5 2012-05-04 09:30:08    BAC     Y 7.8900 85053     F  7.88 108101  7.90

# Converting columns to numeric using "tidyverse"
data_df %>%
    mutate_each(funs(type.convert)) %>%
    mutate_if(is.factor, as.character) %>%
    mutate(TIMESTAMP = as_datetime(TIMESTAMP, tz = Sys.timezone()))
#> # A tibble: 5 × 9
#>             TIMESTAMP SYMBOL    EX PRICE  SIZE  COND   BID BIDSIZ   OFR
#>                <dttm>  <chr> <chr> <dbl> <int> <chr> <dbl>  <int> <dbl>
#> 1 2012-05-04 09:30:00    BAC     T 7.890 38538     F  7.89    523  7.90
#> 2 2012-05-04 09:30:01    BAC     Z 7.885   288     @  7.88  61033  7.90
#> 3 2012-05-04 09:30:03    BAC     X 7.890  1000     @  7.88   1974  7.89
#> 4 2012-05-04 09:30:07    BAC     T 7.890 19052     F  7.88   1058  7.89
#> 5 2012-05-04 09:30:08    BAC     Y 7.890 85053     F  7.88 108101  7.90
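
For readers on a current dplyr (1.0 or later is assumed here), the same idea can be sketched with across(). The tibble below is a cut-down stand-in for `data_df`, and the UTC time zone and the use of base as.POSIXct() instead of lubridate are assumptions made to keep the example self-contained.

```r
library(dplyr)

# Small stand-in for `data_df`; every column starts as character
data_df <- tibble(
  TIMESTAMP = c("2012-05-04 09:30:00", "2012-05-04 09:30:01"),
  SYMBOL    = c("BAC", "BAC"),
  PRICE     = c("7.8900", "7.8850"),
  SIZE      = c("38538", "288")
)

converted <- data_df %>%
  # as.is = TRUE keeps non-numeric strings as character instead of factor
  mutate(across(everything(), ~ type.convert(.x, as.is = TRUE))) %>%
  # base R date-time parsing; the UTC time zone is an assumption
  mutate(TIMESTAMP = as.POSIXct(TIMESTAMP, tz = "UTC"))

sapply(converted, function(col) class(col)[1])
```

With as.is = TRUE there is no need for the extra factor-to-character step.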

#10


3  

To convert a factor column of a data frame to numeric, you just have to do:

data_frame$column <- as.numeric(as.character(data_frame$column))

#11


2  

Though others have covered the topic pretty well, I'd like to add this quick thought/hint. You could use a regexp to check in advance whether the character values potentially consist only of numerics.

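# flag the columns of d whose values contain no letters
potential_numcol <- logical(ncol(d))
for (i in seq_along(d)) {
     potential_numcol[i] <- all(!grepl("[a-zA-Z]", d[, i]))
}
# and now convert only the flagged columns, in place
d[potential_numcol] <- lapply(d[potential_numcol],
                              function(x) as.numeric(as.character(x)))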

For more sophisticated regular expressions and a neat way to learn/experience their power, see this really nice website: http://regexr.com/

#12


0  

On my PC (R v3.2.3), apply or sapply gives an error; lapply works well.

dt[,2:4] <- lapply(dt[,2:4], function (x) as.factor(as.numeric(x)))

#13


0  

To convert a character column to numeric, you first have to convert it into a factor by applying:

BankFinal1 <- transform(BankLoan,   LoanApproval=as.factor(LoanApproval))
BankFinal1 <- transform(BankFinal1, LoanApp=as.factor(LoanApproval))

You have to make two columns with the same data, because the character column cannot be converted to numeric directly. If you try the conversion in one step, it gives the warning below:

transform(BankData, LoanApp=as.numeric(LoanApproval))
Warning message:
  In eval(substitute(list(...)), `_data`, parent.frame()) :
  NAs introduced by coercion

So, after creating the two columns with the same data, apply:

BankFinal1 <- transform(BankFinal1, LoanApp      = as.numeric(LoanApp),
                                    LoanApproval = as.numeric(LoanApproval))

This converts the character data to numeric without NAs; note that the resulting numbers are the integer codes of the factor levels.
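
A short sketch of what happens here, using made-up "Y"/"N" values as a stand-in for the LoanApproval column (the real values are not shown in the answer):

```r
# Hypothetical stand-in for the LoanApproval column
BankLoan <- data.frame(LoanApproval = c("Y", "N", "Y"), stringsAsFactors = FALSE)

# Direct conversion fails: "Y" and "N" are not numbers
direct <- suppressWarnings(as.numeric(BankLoan$LoanApproval))  # NA NA NA

# Via a factor, each label is replaced by the position of its level
codes <- as.numeric(factor(BankLoan$LoanApproval))             # 2 1 2 (levels: "N", "Y")

# Spelling out the mapping makes the intent explicit
flag <- as.numeric(BankLoan$LoanApproval == "Y")               # 1 0 1
```

If a specific coding such as 0/1 is wanted, an explicit comparison like the last line is usually easier to read than relying on level order.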

#14


0  

If the data frame has multiple types of columns, some character, some numeric, try the following to convert just the columns that contain numeric values to numeric:

for (i in seq_along(data)) {
  non_missing <- data[, i][!is.na(data[, i])]
  parsed <- suppressWarnings(as.numeric(non_missing))
  # convert the column if at least one value parses as a number
  if (any(!is.na(parsed))) {
    data[, i] <- as.numeric(data[, i])
  }
}

#15


0  

Considering there might be character columns, this is based on @Abdou's answer to "Get column types of excel sheet automatically":

makenumcols <- function(df) {
  df <- as.data.frame(df)
  # a column qualifies if every non-missing value parses as a number
  cond <- vapply(df, function(x) {
    x <- as.character(x[!is.na(x)])
    all(!is.na(suppressWarnings(as.numeric(x))))
  }, logical(1))
  numeric_cols <- names(df)[cond]
  # going through character first deals with factors
  df[numeric_cols] <- lapply(df[numeric_cols],
                             function(x) as.numeric(as.character(x)))
  return(df)
}
df <- makenumcols(df)
