How to create a histogram with a fixed bin width and a percentage scale that reflects each facet?

Time: 2022-05-05 13:31:13

I would like to create a histogram where the y-axis shows the percentage per facet in ggplot2. I have seen several similar questions but some answers seem outdated or they show the percentage of all observations rather than per facet.

I tried this:

library(ggplot2)
library(scales)

ggplot(mtcars, aes(mpg))+
    facet_grid(cyl ~ am)+
    stat_count(aes(y=..prop..)) +
    theme_bw()+
    scale_y_continuous(labels = percent_format())

This seems to work, except that the bin width is not fixed: facets with few observations have large bars.

How can I fix the bin width?

EDIT: Solution adapted from ACNB. I overlooked that before, and I just saw that Andrey Kolyadin was quicker to provide a more concise solution.

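library(dplyr)  # provides %>%, group_by(), mutate(), summarise()

binwidth <- 1
mtcars.stats <- mtcars %>%
    group_by(cyl, am) %>%
    # Assign each car to a fixed-width bin labelled by its midpoint,
    # and record the number of observations in its facet
    mutate(bin = cut(mpg, breaks=seq(0,35, binwidth), 
                     labels = seq(0 + binwidth, 35, binwidth)-(binwidth/2)),
           n = n()) %>%
    group_by(cyl, am, bin) %>%
    # Share of the facet's observations that fall in each bin
    summarise(p = n()/n[1]) %>%
    ungroup() %>%
    # Convert the factor labels back to numeric bin midpoints
    mutate(bin = as.numeric(as.character(bin)))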

ggplot(mtcars.stats, aes(x = bin, y= p)) + 
    geom_col() + 
    facet_grid(cyl~am)+
    theme_bw()+
    scale_y_continuous(labels = percent_format())

2 Answers

#1


3  

As always, I advise not relying on the statistics layer of ggplot2 and instead calculating the necessary statistics before plotting:

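library('zoo')        # rollmean()
library('tidyverse')
library('scales')     # percent_format(), used in the plot

# Selecting breaks: 19 equally spaced breaks give 18 bins of equal width
breaks <- seq.int(min(mtcars$mpg), max(mtcars$mpg), length.out = 19)

# Calculating per-facet proportions
mt_hist <- mtcars %>% 
  group_by(cyl, am) %>% 
  # x holds the bin midpoints, count the number of cars per bin
  summarise(x = list(rollmean(breaks, 2)),
            count = list(hist(mpg, breaks = breaks, plot = FALSE)$counts)) %>% 
  unnest(cols = c(x, count)) %>% 
  group_by(cyl, am) %>% 
  # Divide by the facet total so each facet sums to 100%
  mutate(count = count/sum(count))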

And the plot itself:

ggplot(mt_hist)+
  aes(x = x,
      y = count)+
  geom_col()+
  facet_grid(cyl ~ am)+
  theme_bw()+
  scale_y_continuous(labels = percent_format())

#2


1  

Have you tried adding geom_histogram with the stat argument? Something like:

p <- ggplot(mtcars, aes(mpg))
p <- p + geom_histogram(stat = 'bin')
p <- p + facet_grid(cyl ~ am)
p <- p + stat_count(aes(y=..prop..))
p <- p +   theme_bw()
p <- p +   scale_y_continuous(labels = percent_format())
p
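
For comparison, here is a minimal sketch of how the geom_histogram idea might be made to normalise per facet while keeping a fixed bin width. It is not taken from either answer. It assumes ggplot2 3.3.0 or later (for after_stat()) and relies on the PANEL column that ggplot2 attaches to the computed bin data; binwidth = 2 and boundary = 0 are arbitrary values picked for illustration.

```r
library(ggplot2)
library(scales)

# Fixed-width bins via stat_bin; each bin count is divided by the total
# count of its own panel (facet), so the bars in every facet sum to 100%.
# binwidth and boundary are illustrative assumptions, not from the question.
p <- ggplot(mtcars, aes(mpg)) +
  geom_histogram(
    aes(y = after_stat(count / ave(count, PANEL, FUN = sum))),
    binwidth = 2,
    boundary = 0
  ) +
  facet_grid(cyl ~ am) +
  theme_bw() +
  scale_y_continuous(labels = percent_format())

p
```

Because the normalisation happens inside the stat layer, no pre-aggregated data frame is needed. Pre-computing the proportions, as in answer #1, may still be easier to inspect and debug.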
