How to add custom labels from a dataset on top of bars using ggplot/geom_bar in R?

Time: 2021-11-14 13:30:30

I have the attached datasets and use this R code to plot the data:

library(ggplot2)
library(reshape2) # provides melt()
plotData <- read.csv("plotdata.csv")
ix <- 1:nrow(plotData)
long <- melt(transform(plotData, id = ix), id = "id") # add id col; melt to long form
ggp2 <- ggplot(long, aes(id, value, fill = variable))+geom_bar(stat = "identity", position = "dodge")+
    scale_x_continuous(breaks = ix) +
    labs(y='Throughput (Mbps)',x='Nodes') +
    scale_fill_discrete(name="Legend",
                        labels=c("Inside Firewall (Dest)",
                                 "Inside Firewall (Source)",
                                 "Outside Firewall (Dest)",
                                 "Outside Firewall (Source)")) +
    theme(legend.position="right") +  # The position of the legend
    theme(legend.title = element_text(colour="blue", size=14, face="bold")) + # Title appearance
    theme(legend.text = element_text(colour="blue", size = 12, face = "bold")) # Label appearance
plot(ggp2)

The resulting plot is attached as well.

Now I need to add numbers from different datasets on top of each bar. For example:

  1. on top of "Inside Firewall (Dest)" should be the numbers from sampleNumIFdest.csv
  2. on top of "Inside Firewall (Source)" should be the numbers from sampleNumIFsource.csv
  3. on top of "Outside Firewall (Dest)" should be the numbers from sampleNumOFdest.csv
  4. on top of "Outside Firewall (Source)" should be the numbers from sampleNumOFsource.csv

I have tried to use geom_text(), but I do not know how to read the numbers from the different datasets. Please note that the datasets have different numbers of rows (which causes additional problems for me). Any suggestion is highly appreciated.

The attached files are here.

Sorry, I had to zip all my files as I am not allowed to add more than 2 URLs in my post.

1 solution

#1


I think the best solution is to combine all the datasets into one:

# load the required packages
library(ggplot2)
library(reshape2)

# load the different datasets
plotData <- read.csv("plotData.csv")
IFdest <- read.table("sampleNumIFdest.csv", sep="\t", header=TRUE, strip.white=TRUE)
IFsource <- read.table("sampleNumIFsource.csv", sep="\t", header=TRUE, strip.white=TRUE)
OFdest <- read.table("sampleNumOFdest.csv", sep="\t", header=TRUE, strip.white=TRUE)
OFsource <- read.table("sampleNumOFsource.csv", sep="\t", header=TRUE, strip.white=TRUE)

# add an id column and move it to the front
ix <- 1:nrow(plotData)
plotData$id <- ix
plotData <- plotData[, c(5, 1, 2, 3, 4)]

# append the label columns; the shorter datasets are padded with NA
# so that every column has nrow(plotData) entries
plotData$IFdest <- c(IFdest$Freq, NA)
plotData$IFsource <- c(IFsource$Freq, NA, NA)
plotData$OFdest <- OFdest$Freq
plotData$OFsource <- c(OFsource$Freq, NA, NA)

# reshape to long form: one melt for the bar heights, one for the labels
long <- cbind(
  melt(plotData, id.vars = "id", measure.vars = 2:5,
       variable.name = "type", value.name = "value"),
  melt(plotData, id.vars = "id", measure.vars = 6:9,
       variable.name = "name", value.name = "numbers")
)
long <- long[, -c(4, 5)] # drop the duplicated id column and the unneeded name column
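
The NA padding above is hard-coded for the row counts of these particular files. A minimal sketch of a more general padding step follows, assuming (as the code above does) that the missing rows are always the last nodes. pad_to is a hypothetical helper and the numbers are made up for illustration; they are not taken from the attached files.

```r
# Hypothetical helper: extend a vector with NA up to length n.
# Assumption: the missing entries belong to the trailing rows.
pad_to <- function(x, n) {
  stopifnot(length(x) <= n)
  length(x) <- n  # base R fills the new positions with NA
  x
}

# Made-up frequencies for three nodes, padded to five rows
freq <- c(12L, 7L, 30L)
padded <- pad_to(freq, 5)
```

With such a helper, a line like plotData$IFdest <- pad_to(IFdest$Freq, nrow(plotData)) would not need to change if the files gained or lost rows. If the missing nodes are not the last ones, the datasets would have to be matched on a node identifier instead.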

When you have done that, you can use geom_text to plot the numbers on top of the bars:

# create your plot
ggplot(long, aes(x = id, y = value, fill = type)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = numbers), vjust=-1, position = position_dodge(0.9), size = 3) +
  scale_x_continuous(breaks = ix) +
  labs(x = "Nodes", y = "Throughput (Mbps)") +
  scale_fill_discrete(name="Legend",
                      labels=c("Inside Firewall (Dest)",
                               "Inside Firewall (Source)",
                               "Outside Firewall (Dest)",
                               "Outside Firewall (Source)")) +
  theme_bw() +
  theme(legend.position="right") +
  theme(legend.title = element_text(colour="blue", size=14, face="bold")) + 
  theme(legend.text = element_text(colour="blue", size=12, face="bold"))
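
The labels line up with the bars because geom_text is dodged by the same width as the bars (bars are 0.9 wide by default, and text has no width of its own, so the dodge width has to be given explicitly). A minimal sketch on made-up data, not the attached files, that also passes na.rm = TRUE so the NA labels are dropped without a warning:

```r
library(ggplot2)

# Synthetic data for illustration only: 2 nodes x 2 series
demo <- data.frame(
  id      = c(1, 1, 2, 2),
  type    = c("A", "B", "A", "B"),
  value   = c(10, 20, 15, 5),
  numbers = c(3, 8, NA, 4)   # one label is missing on purpose
)

p <- ggplot(demo, aes(x = id, y = value, fill = type)) +
  # geom_col() is equivalent to geom_bar(stat = "identity")
  geom_col(position = position_dodge(width = 0.9)) +
  # same dodge width as the bars, so each label sits above its own bar
  geom_text(aes(label = numbers),
            position = position_dodge(width = 0.9),
            vjust = -0.5, size = 3, na.rm = TRUE)
```

Printing p draws the chart. Using a dodge width other than 0.9 for the text shifts the labels away from the centres of their bars.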

The result: [image]

As you can see, the text labels sometimes overlap. You can reduce the overlap by decreasing the text size, but then the labels may become hard to read. You might therefore consider using facets by adding facet_grid(type ~ .) (or facet_wrap(~ type)) to the plotting code:

ggplot(long, aes(x = id, y = value, fill = type)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = numbers), vjust=-0.5, position = position_dodge(0.9), size = 3) +
  scale_x_continuous("Nodes", breaks = ix) +
  scale_y_continuous("Throughput (Mbps)", limits = c(0,1000)) +
  scale_fill_discrete(name="Legend",
                      labels=c("Inside Firewall (Dest)",
                               "Inside Firewall (Source)",
                               "Outside Firewall (Dest)",
                               "Outside Firewall (Source)")) +
  theme_bw() +
  theme(legend.position="right") +
  theme(legend.title = element_text(colour="blue", size=14, face="bold")) + 
  theme(legend.text = element_text(colour="blue", size=12, face="bold")) +
  facet_grid(type ~ .)

which results in the following plot: [image]
