Passing arguments to a function that uses dplyr

Time: 2021-06-18 20:36:31

I have the following function to describe a variable:

library(dplyr)
describe = function(.data, variable){
  args <- as.list(match.call())
  evalue = eval(args$variable, .data)
  summarise(.data,
            'n'= length(evalue),
            'mean' = mean(evalue),
            'sd' = sd(evalue))
}

I want to use dplyr to describe the variable.

set.seed(1)
df = data.frame(
  'g' = sample(1:3, 100, replace=T),
  'x1' = rnorm(100),
  'x2' = rnorm(100)
)
df %>% describe(x1)
#     n        mean        sd
# 1 100 -0.01757949 0.9400179

The problem is that when I try to compute the same descriptive statistics after group_by, the describe function is not applied within each group; every group shows the statistics for the whole data set:

df %>% group_by(g) %>% describe(x1)
# # A tibble: 3 x 4
#       g     n        mean        sd
#   <int> <int>       <dbl>     <dbl>
# 1     1   100 -0.01757949 0.9400179
# 2     2   100 -0.01757949 0.9400179
# 3     3   100 -0.01757949 0.9400179

How would you change the function, with as few modifications as possible, so that it returns the statistics for each group?

2 Answers

#1


7  

You need tidyeval:

describe = function(.data, variable){
  evalue = enquo(variable)
  summarise(.data,
            'n'= length(!!evalue),
            'mean' = mean(!!evalue),
            'sd' = sd(!!evalue))
}

df %>% group_by(g) %>% describe(x1)
# # A tibble: 3 x 4
#       g     n        mean        sd
#   <int> <int>       <dbl>     <dbl>
# 1     1    27 -0.23852862 1.0597510
# 2     2    38  0.11327236 0.8470885
# 3     3    35  0.01079926 0.9351509

The dplyr vignette 'Programming with dplyr' has a thorough description of how to use enquo and !!.
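On newer versions of the tidyverse, the enquo/!! pair can usually be written more compactly with the embrace operator {{ }}. The sketch below is not part of the original answer: it assumes dplyr 1.0 or later (with rlang 0.4.0 or later), and describe_embrace is a hypothetical name used only for illustration.

```r
library(dplyr)

# Hypothetical variant of describe(); assumes dplyr >= 1.0 and rlang >= 0.4.0.
describe_embrace <- function(.data, variable) {
  summarise(.data,
            n    = n(),                    # rows in the current group
            mean = mean({{ variable }}),   # {{ }} quotes and unquotes in one step
            sd   = sd({{ variable }}))
}

# Same data-generating code as in the question.
set.seed(1)
df <- data.frame(
  g  = sample(1:3, 100, replace = TRUE),
  x1 = rnorm(100),
  x2 = rnorm(100)
)

res <- df %>% group_by(g) %>% describe_embrace(x1)
res
```

Because the column is resolved inside summarise(), it is looked up once per group, so the group sizes should add up to 100 and each mean should match the per-group mean of x1.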

Edit:

In response to Axeman's comment, I'm not 100% sure why group_by and describe do not work together here. However, using debugonce with the function in its original form

debugonce(describe)

df %>% group_by(g) %>% describe(x1)

one can see that evalue is not grouped and is just a numeric vector of length 100.
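That observation is consistent with how the original function is written: eval(args$variable, .data) runs once, before summarise() is called, so x1 is looked up in the whole data frame and the grouping is ignored. summarise() then receives an ordinary length-100 vector and reuses it in every group. A minimal sketch of that behaviour follows; the names grouped, full_vec and res are introduced here for illustration only.

```r
library(dplyr)

set.seed(1)
df <- data.frame(
  g  = sample(1:3, 100, replace = TRUE),
  x1 = rnorm(100)
)
grouped <- group_by(df, g)

# eval() treats the grouped data frame as a plain list of columns,
# so the whole column comes back regardless of the grouping.
full_vec <- eval(quote(x1), grouped)
length(full_vec)

# Passing that precomputed vector to summarise() repeats the same
# statistics in every group, as reported in the question.
res <- summarise(grouped, n = length(full_vec), mean = mean(full_vec))
res
```

Deferring the lookup until summarise() evaluates the expression, as the tidyeval version does, is what lets dplyr resolve x1 separately for each group.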

#2


0  

Base NSE appears to work, too:

describe <- function(data, var){

  var_q <- substitute(var)
  data %>% 
    summarise(n = n(),
              mean = mean(eval(var_q)),
              sd = sd(eval(var_q)))
}


df %>% describe(x1)
#     n       mean       sd
# 1 100 -0.1266289 1.006795

df %>% group_by(g) %>% describe(x1)
# # A tibble: 3 x 4
#       g     n       mean       sd
#   <int> <int>      <dbl>    <dbl>
# 1     1    33 -0.1379206 1.107412
# 2     2    29 -0.4869704 0.748735
# 3     3    38  0.1581745 1.020831
