I have the following function to describe a variable:
library(dplyr)
describe = function(.data, variable){
  args <- as.list(match.call())
  evalue = eval(args$variable, .data)
  summarise(.data,
            'n' = length(evalue),
            'mean' = mean(evalue),
            'sd' = sd(evalue))
}
I want to use dplyr to describe the variable.
set.seed(1)
df = data.frame(
'g' = sample(1:3, 100, replace=T),
'x1' = rnorm(100),
'x2' = rnorm(100)
)
df %>% describe(x1)
# n mean sd
# 1 100 -0.01757949 0.9400179
The problem is that when I try to apply the same description after group_by, the describe function is not applied within each group; every group reports the statistics of the whole data frame:
df %>% group_by(g) %>% describe(x1)
# # A tibble: 3 x 4
# g n mean sd
# <int> <int> <dbl> <dbl>
# 1 1 100 -0.01757949 0.9400179
# 2 2 100 -0.01757949 0.9400179
# 3 3 100 -0.01757949 0.9400179
How would you change the function to get the desired result with a small number of modifications?
2 Answers
#1
7
You need tidyeval:
describe = function(.data, variable){
  evalue = enquo(variable)
  summarise(.data,
            'n' = length(!!evalue),
            'mean' = mean(!!evalue),
            'sd' = sd(!!evalue))
}
df %>% group_by(g) %>% describe(x1)
# A tibble: 3 x 4
g n mean sd
<int> <int> <dbl> <dbl>
1 1 27 -0.23852862 1.0597510
2 2 38 0.11327236 0.8470885
3 3 35 0.01079926 0.9351509
The dplyr vignette 'Programming with dplyr' has a thorough description of using enquo and !!.
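As a side note to the enquo / !! pattern above, here is a minimal sketch of the same function written with the embrace operator `{{ }}`, which combines the quote and unquote steps. It assumes a reasonably recent dplyr and rlang (the operator arrived in rlang 0.4.0); the use of `n()` instead of `length()` and the reduced two-column `df` are choices made for this illustration, not part of the original answer.

```r
library(dplyr)

# Same idea as enquo() + !!, written with the embrace operator.
# Assumes rlang >= 0.4.0, where {{ }} was introduced.
describe <- function(.data, variable) {
  summarise(.data,
            n    = n(),                   # rows in the current group
            mean = mean({{ variable }}),  # evaluated per group
            sd   = sd({{ variable }}))
}

# Illustrative data in the same shape as the question's df
set.seed(1)
df <- data.frame(g = sample(1:3, 100, replace = TRUE),
                 x1 = rnorm(100))

res <- df %>% group_by(g) %>% describe(x1)
res
```

Because the column is only resolved inside summarise(), the grouping set up by group_by is respected and each row of `res` describes one group.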
Edit:
In response to Axeman's comment: I'm not 100% sure why the combination of group_by and describe does not work here. However, running debugonce on the function in its original form
debugonce(describe)
df %>% group_by(g) %>% describe(x1)
one can see that evalue is not grouped and is just a numeric vector of length 100.
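The observation can also be reproduced without the debugger. The sketch below is only an illustration under the assumption that the original function's `eval(args$variable, .data)` behaves like `eval(quote(x1), grouped)`: the column is pulled out of the grouped data frame as an ordinary vector before summarise() ever runs, so the grouping has no chance to take effect. The names `grouped`, `evalue` and `sizes` are introduced here for the example.

```r
library(dplyr)

set.seed(1)
df <- data.frame(g = sample(1:3, 100, replace = TRUE),
                 x1 = rnorm(100))
grouped <- group_by(df, g)

# eval() treats the grouped data frame as a plain list of columns,
# so it returns the full column and ignores the grouping.
evalue <- eval(quote(x1), grouped)
length(evalue)      # 100, the whole column
is.numeric(evalue)  # TRUE, a plain numeric vector

# For comparison, the true per-group sizes computed inside summarise()
sizes <- summarise(grouped, n = n())
sizes
```

Since `evalue` is already a fixed length-100 vector, `length(evalue)`, `mean(evalue)` and `sd(evalue)` give the same values for every group, which matches the repeated rows in the question's output.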
#2
0
Base NSE appears to work, too:
describe <- function(data, var){
  var_q <- substitute(var)
  data %>%
    summarise(n = n(),
              mean = mean(eval(var_q)),
              sd = sd(eval(var_q)))
}
df %>% describe(x1)
n mean sd
1 100 -0.1266289 1.006795
df %>% group_by(g) %>% describe(x1)
# A tibble: 3 x 4
g n mean sd
<int> <int> <dbl> <dbl>
1 1 33 -0.1379206 1.107412
2 2 29 -0.4869704 0.748735
3 3 38 0.1581745 1.020831