ggplot2: Remove weekend and holiday gaps from the x-axis dates

Time: 2021-08-10 08:56:17

I'm running into issues while plotting stock data in ggplot2: the x-axis contains gaps for weekends and holidays. This post has been very helpful, but I run into a variety of issues when trying to use ordered factors.

library(xts)
library(grid)
library(dplyr)
library(scales)
library(bdscale)
library(ggplot2)
library(quantmod)

getSymbols("SPY", from = Sys.Date() - 1460, to = Sys.Date(), adjust = TRUE, auto.assign = TRUE)

input <- data.frame(SPY["2015/"])
names(input) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")

# i've tried changing rownames() to index(), and the plot looks good, but the x-axis is inaccurate
# i've also tried as.factor()
xaxis <- as.Date(rownames(input)) 
input$xaxis <- xaxis

p <- ggplot(input)
p <- p + geom_segment(aes(x = xaxis, xend = xaxis, y = Low, yend = High), size = 0.50)           # body
p <- p + geom_segment(aes(x = xaxis - 0.4, xend = xaxis, y = Open, yend = Open), size = 0.90)    # open
p <- p + geom_segment(aes(x = xaxis, xend = xaxis + 0.4, y = Close, yend = Close), size = 0.90)  # close
p <- p + scale_y_log10()
p + ggtitle("SPY: 2015")

The plot above (sans red boxes) is generated by the code above. The following charts show some of the issues I hit when attempting solutions. First, if I use the data frame's index, I get a nice-looking graph, but the x-axis is inaccurate: the data currently ends in October, but in the plot below it ends in July:

xaxis <- as.Date(index(input))
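
A possible explanation for the July end date, offered as an assumption rather than something checked against the original session: index() on a plain data frame appears to fall back to row positions 1..n, and converting those integers to dates counts days from 1970-01-01. With the 188 observations mentioned in answer #3, the last position lands in early July. The sketch below reproduces that arithmetic in base R only.

```r
# Assumption: the x values are row positions 1..188, not real dates.
positions <- seq_len(188)

# Treating the positions as "days since the epoch" gives dates in 1970.
as_dates <- as.Date(positions, origin = "1970-01-01")

first_date <- format(as_dates[1])
last_date  <- format(as_dates[length(as_dates)])

print(c(first_date, last_date))  # the last date falls in July 1970
```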

Second, if I coerce the rownames to an ordered factor, I lose the horizontal ticks (representing the open and the close):

xaxis <- factor(rownames(input), ordered = TRUE) 
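
One likely reason the open and close ticks disappear (an assumption, not confirmed by the original poster): the plot code computes xaxis - 0.4 and xaxis + 0.4, and arithmetic is not defined for factors, so those positions become NA. A base R sketch with made-up dates:

```r
# Made-up trading dates, coerced to an ordered factor as in the question.
xaxis <- factor(c("2015-01-02", "2015-01-05", "2015-01-06"), ordered = TRUE)

# Subtraction is not meaningful for factors: R warns and returns NA.
shifted <- suppressWarnings(xaxis - 0.4)
print(shifted)

# Integer positions of the factor levels do support arithmetic.
shifted_pos <- as.integer(xaxis) - 0.4
print(shifted_pos)
```

The vertical body survives because it uses xaxis unchanged; only the two segments that offset x are affected.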

The same loss of the horizontal ticks happens if I use the package bdscale, though the gridlines are cleaner:

p <- p + scale_x_bd(business.dates = xaxis)

5 answers

#1


2  

You'll probably need to treat the dates as discrete values rather than continuous ones. With a slightly simplified version of your code, this approach might look like:

getSymbols("SPY", from = Sys.Date() - 1460, to = Sys.Date(), adjust = TRUE, auto.assign = TRUE)
SPY <- SPY["2015/"]
colnames(SPY) <- sub("SPY.","", colnames(SPY))
month_brks <- c(1,endpoints(SPY, "months")[-1])

p <- ggplot(data.frame(xaxis=seq(nrow(SPY)), SPY))
p <- p + geom_linerange(aes(x=xaxis, ymin=Low, ymax=High), size=.5)
p <- p + geom_text(aes(x = xaxis,  y = Open), size = 4., label="-", hjust=.7, vjust=0)  # Open
p <- p + geom_text(aes(x = xaxis,  y = Close), size = 4., label="-", hjust=-.1, vjust=0)  # close
p <- p + scale_x_continuous(breaks=month_brks, labels=format(index(SPY)[month_brks], "%d %b %Y"))
p <- p + labs(title="SPY: 2015", x="Date", y="Price")

UPDATE

Updated treatment of axis labels.
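
To see what the break positions look like without downloading data, here is a rough base R sketch of the same idea. The dates are synthetic (weekdays only, no holiday calendar), and the last-row-of-each-month lookup is a stand-in for endpoints(SPY, "months"), not the xts implementation.

```r
# Synthetic business days for Q1 2015: weekdays only, holidays not removed.
all_days <- seq(as.Date("2015-01-01"), as.Date("2015-03-31"), by = "day")
dates <- all_days[!format(all_days, "%u") %in% c("6", "7")]  # %u: 1 = Mon ... 7 = Sun

# x positions are simply 1..n, so non-trading days take up no room.
xaxis <- seq_along(dates)

# Last row of each month, roughly what endpoints(x, "months")[-1] returns.
month_key  <- format(dates, "%Y-%m")
month_ends <- which(!duplicated(month_key, fromLast = TRUE))

# First row plus each month end, labelled with the real dates.
month_brks <- c(1L, month_ends)
month_labs <- format(dates[month_brks], "%Y-%m-%d")

print(data.frame(position = month_brks, label = month_labs))
```

These two vectors could then be handed to scale_x_continuous(breaks = month_brks, labels = month_labs).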

#2


1  

If you'd like to use bdscale for this, just tell it to use more gridlines:

ggplot(input) + 
  geom_segment(aes(x = xaxis, xend = xaxis, y = Low, yend = High), size = 0.50) +           # body 
  geom_segment(aes(x = xaxis - 0.4, xend = xaxis, y = Open, yend = Open), size = 0.90) +    # open
  geom_segment(aes(x = xaxis, xend = xaxis + 0.4, y = Close, yend = Close), size = 0.90)  + # close
  ggtitle("SPY: 2015") +
  xlab('') + ylab('') +
  scale_x_bd(business.dates=xaxis, max.major.breaks=10, labels=date_format("%b '%y")) # <==== !!!!

It should put October on the axis there, but it's not that smart. Womp womp. Pull requests welcome!

#3


1  

Well, you can tweak it manually, but it's kind of hacky. First, use index so that your observations are numbered 1 to 188:

input$xaxis <- index(as.Date(rownames(input)))

Then your own plot code:

p <- ggplot(input)
p <- p + geom_segment(aes(x = xaxis, xend = xaxis, y = Low, yend = High), size = 0.50)           # body
p <- p + geom_segment(aes(x = xaxis - 0.4, xend = xaxis, y = Open, yend = Open), size = 0.90)    # open
p <- p + geom_segment(aes(x = xaxis, xend = xaxis + 0.4, y = Close, yend = Close), size = 0.90)  # close
p <- p + scale_y_log10() + ggtitle("SPY: 2015")

Finally, I looked in input to see where the breaks should go, and supplied the labels manually:

p + scale_x_continuous(breaks=input$xaxis[c(1,62,125,188)], labels=c("jan","apr","jul","oct"))

Note that I was lazy and just took the closest trading dates to 1 Jan, 1 Apr, 1 Jul and 1 Oct. Because 1 Jan is a holiday, the label "jan" sits below 2 Jan, and I put the label "oct" below 30 Sep, the last entry in input. You can of course adjust this as you wish.

Of course, you could automate the labels: add a label field with the date and extract the month.
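
One way this automation might look, as a sketch with synthetic weekday-only dates (the real input has holidays removed too, so its positions differ from the ones here): take the first row of each month as the break and derive the label from the month number.

```r
# Synthetic trading dates: weekdays in Q1 2015 (holidays not removed).
all_days <- seq(as.Date("2015-01-01"), as.Date("2015-03-31"), by = "day")
dates <- all_days[!format(all_days, "%u") %in% c("6", "7")]

# Row positions play the role of input$xaxis.
xaxis <- seq_along(dates)

# The first row seen for each year-month becomes a break.
is_month_start <- !duplicated(format(dates, "%Y-%m"))
brks <- xaxis[is_month_start]

# month.abb is locale-independent, unlike format(dates, "%b").
labs <- tolower(month.abb[as.integer(format(dates[is_month_start], "%m"))])

print(setNames(brks, labs))
```

The result could replace the hand-picked vectors, for example scale_x_continuous(breaks = brks, labels = labs).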

#4


1  

The method below uses faceting to remove the spaces left by missing dates, then removes the white space between facets to recover the look of an unfaceted plot.

First, we create a grouping variable that increments each time there's a break in the dates (code adapted from this SO answer). We'll use this later for faceting.

input$group = c(0, cumsum(diff(input$xaxis) > 1))
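
For a quick feel for what this line produces, here is a small sketch on made-up dates spanning two weekends.

```r
# Made-up trading days: a Friday, the following full week, then a Monday.
dates <- as.Date(c("2015-01-02", "2015-01-05", "2015-01-06", "2015-01-07",
                   "2015-01-08", "2015-01-09", "2015-01-12"))

# diff() gives the gap in days; a gap above 1 starts a new group.
group <- c(0, cumsum(diff(dates) > 1))

print(data.frame(date = dates, group = group))
```

Each run of consecutive trading days shares one group value, which is what the faceting step relies on.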

Now we add the following code to your plot. facet_grid creates a new facet wherever the date sequence breaks because of a weekend or holiday. scale_x_date adds major tick marks once per week and minor grid lines for each day, but you can adjust this. The theme call gets rid of the facet strip labels and the white space between facets:

p + facet_grid(. ~ group, space="free_x", scales="free_x") +
  scale_x_date(breaks=seq(as.Date("2015-01-01"),max(input$xaxis), "1 week"), 
               minor_breaks="1 day",
               labels=date_format("%b %d, %Y")) +
  theme(axis.text.x=element_text(angle=-90, hjust=0.5, vjust=0.5, size=11),
        panel.spacing = unit(-0.05, "lines"),  # named panel.margin before ggplot2 2.2.0
        strip.text=element_text(size=0),
        strip.background=element_rect(fill=NA)) +
  ggtitle("SPY: 2015")

Here's the resulting plot. The spaces for weekends and holidays are gone. The major breaks mark each week. I set the weeks in the scale_x_date breaks argument to start on a Thursday, since none of the holidays fell on a Thursday and therefore each facet has a major tick mark for the date. (In contrast, the default breaks would fall on a Monday. Since holidays often fall on a Monday, weeks with Monday holidays would not have a major tick mark with the default breaks.) Note, however, that the spacing between the major breaks inherently varies with how many days the market was open that week.
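
The Thursday alignment can be checked with a short sketch: 2015-01-01 was a Thursday, so weekly steps from it stay on Thursdays. The end date below is an arbitrary choice for illustration.

```r
# Weekly breaks starting from 2015-01-01; end date chosen arbitrarily.
brks <- seq(as.Date("2015-01-01"), as.Date("2015-03-31"), by = "1 week")

# %u is the ISO weekday number: 1 = Monday, ..., 4 = Thursday.
weekday_no <- format(brks, "%u")

print(table(weekday_no))
```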

#5


1  

I haven't been able to get the OHLC chart to work; I think you'd need a custom geom.

I know it isn't exactly what you asked for, but may I tempt you with a delicious candle chart instead?

library(dplyr)
library(bdscale)
library(ggplot2)
library(quantmod)
library(magrittr)
library(scales)

getSymbols("SPY", from = Sys.Date() - 1460, to = Sys.Date(), adjust = TRUE, auto.assign = TRUE)

input <- data.frame(SPY["2015/"]) %>% 
  set_names(c("open", "high", "low", "close", "volume", "adjusted")) %>%
  mutate(date=as.Date(rownames(.)))

input %>% ggplot(aes(x=date, ymin=low, ymax=high, lower=pmin(open,close), upper=pmax(open,close), 
                     fill=open<close, group=date, middle=pmin(open,close))) + 
  geom_boxplot(stat='identity') +
  ggtitle("SPY: 2015") +
  xlab('') + ylab('') + theme(legend.position='none') +
  scale_x_bd(business.dates=input$date, max.major.breaks=10, labels=date_format("%b '%y"))
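
The aesthetics in that call boil down to a little arithmetic per row. The sketch below, on invented prices, shows the values geom_boxplot(stat = 'identity') would receive: the box spans open to close, the whiskers reach low and high, and the fill flags up days.

```r
# Invented OHLC rows, for illustration only.
input <- data.frame(
  open  = c(100, 103, 101),
  high  = c(104, 105, 102),
  low   = c( 99, 100,  98),
  close = c(103, 101, 101)
)

# Candle body: from the smaller to the larger of open and close.
input$lower <- pmin(input$open, input$close)
input$upper <- pmax(input$open, input$close)

# TRUE on days that closed above the open (used for the fill colour).
input$up <- input$open < input$close

print(input)
```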
