Aggregate daily data to month/year intervals

Time: 2022-11-21 16:57:09

I don't often have to work with dates in R, but I imagine this is fairly easy. I have a column that represents a date in a dataframe. I simply want to create a new dataframe that summarizes a 2nd column by Month/Year using the date. What is the best approach?

I want a second dataframe so I can feed it to a plot.

Any help you can provide will be greatly appreciated!

EDIT: For reference:

> str(temp)
'data.frame':   215746 obs. of  2 variables:
 $ date  : POSIXct, format: "2011-02-01" "2011-02-01" "2011-02-01" ...
 $ amount: num  1.67 83.55 24.4 21.99 98.88 ...

> head(temp)
        date amount
1 2011-02-01  1.670
2 2011-02-01 83.550
3 2011-02-01 24.400
4 2011-02-01 21.990
5 2011-02-03 98.882
6 2011-02-03 24.900

8 solutions

#1


30  

There is probably a more elegant solution, but splitting into months and years with strftime() and then aggregate()ing should do it. Then reassemble the date for plotting.

x <- as.POSIXct(c("2011-02-01", "2011-02-01", "2011-02-01"))
mo <- strftime(x, "%m")
yr <- strftime(x, "%Y")
amt <- runif(3)
dd <- data.frame(mo, yr, amt)

dd.agg <- aggregate(amt ~ mo + yr, dd, FUN = sum)
dd.agg$date <- as.POSIXct(paste(dd.agg$yr, dd.agg$mo, "01", sep = "-"))

#2


44  

I'd do it with lubridate and plyr, rounding dates down to the nearest month to make them easier to plot:

library(lubridate)
df <- data.frame(
  date = today() + days(1:300),
  x = runif(300)
)
df$my <- floor_date(df$date, "month")

library(plyr)
ddply(df, "my", summarise, x = mean(x))

#3


12  

A bit late to the game, but another option would be using data.table:

library(data.table)
setDT(temp)[, .(mn_amt = mean(amount)), by = .(yr = year(date), mon = month(date))]

# or if you want to apply the 'mean' function to several columns:
# setDT(temp)[, lapply(.SD, mean), by=.(year(date), month(date))]

this gives:

     yr mon mn_amt
1: 2011   2 42.610
2: 2011   3 23.195
3: 2011   4 61.891

If you want names instead of numbers for the months, you can use:

setDT(temp)[, date := as.IDate(date)
            ][, .(mn_amt = mean(amount)), by = .(yr = year(date), mon = months(date))]

this gives:

     yr      mon mn_amt
1: 2011 februari 42.610
2: 2011    maart 23.195
3: 2011    april 61.891

As you can see, this gives the month names in your system language (Dutch in my case).

Or using a combination of lubridate and dplyr:

library(dplyr)
library(lubridate)
temp %>% 
  group_by(yr = year(date), mon = month(date)) %>% 
  summarise(mn_amt = mean(amount))

Used data:

# example data (modified the OP's data a bit)
temp <- structure(list(date = structure(1:6, .Label = c("2011-02-01", "2011-02-02", "2011-03-03", "2011-03-04", "2011-04-05", "2011-04-06"), class = "factor"), 
                       amount = c(1.67, 83.55, 24.4, 21.99, 98.882, 24.9)), 
                  .Names = c("date", "amount"), class = c("data.frame"), row.names = c(NA, -6L))

#4


8  

Just use xts package for this.

library(xts)
ts <- xts(temp$amount, as.Date(temp$date, "%Y-%m-%d"))

# convert daily data
ts_m = apply.monthly(ts, FUN)
ts_y = apply.yearly(ts, FUN)
ts_q = apply.quarterly(ts, FUN)

where FUN is the function you want to aggregate the data with (for example, sum).
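
For illustration, here is a minimal sketch of the same idea with FUN set to sum. The sample dates and amounts are made up (only shaped like the question's data), and the names daily, monthly_sum and monthly_df are hypothetical.

```r
library(xts)  # also attaches zoo, which provides index() and coredata()

# Hypothetical sample data shaped like the question's `temp`
dates   <- as.Date(c("2011-02-01", "2011-02-03", "2011-03-04",
                     "2011-03-10", "2011-04-05"))
amounts <- c(1.67, 83.55, 24.4, 21.99, 98.882)

# Build a daily series, then sum it within each calendar month
daily       <- xts(amounts, order.by = dates)
monthly_sum <- apply.monthly(daily, sum)

# Turn the result back into a plain data frame for plotting
monthly_df <- data.frame(
  date   = index(monthly_sum),
  amount = as.numeric(coredata(monthly_sum))
)
```

Note that apply.monthly labels each result with the last observation in that month, so the dates in monthly_df are the last observed day of each month rather than the first of the month.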

#5


4  

You can do it like this:

short.date = strftime(temp$date, "%Y/%m")
aggr.stat = aggregate(temp$amount ~ short.date, FUN = sum)

#6


2  

I have a function monyr that I use for this kind of stuff:

monyr <- function(x)
{
    x <- as.POSIXlt(x)
    x$mday <- 1
    as.Date(x)
}

n <- as.Date(1:500, "1970-01-01")
nn <- monyr(n)

You can change the as.Date at the end to as.POSIXct to match the date format in your data. Summarising by month is then just a matter of using aggregate/by/etc.
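
As a rough sketch of that last step, assuming a data frame shaped like the question's temp (the sample values below are made up, and the column name month and the result name monthly are hypothetical):

```r
# Hypothetical sample data shaped like the question's `temp`
temp <- data.frame(
  date   = as.POSIXct(c("2011-02-01", "2011-02-03", "2011-03-04",
                        "2011-03-10", "2011-04-05"), tz = "UTC"),
  amount = c(1.67, 83.55, 24.4, 21.99, 98.882)
)

# Same helper as above: snap every date to the first day of its month
monyr <- function(x)
{
    x <- as.POSIXlt(x)
    x$mday <- 1
    as.Date(x)
}

# Add the month key, then sum the amounts within each month
temp$month <- monyr(temp$date)
monthly <- aggregate(amount ~ month, data = temp, FUN = sum)
```

The resulting monthly data frame has one row per month and can be passed straight to a plotting function.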

#7


1  

Also, given that your time series seem to be in xts format, you can aggregate your daily time series to a monthly time series using the mean function like this:

d2m <- function(x) {
  aggregate(x, format(as.Date(zoo::index(x)), "%Y-%m"), FUN=mean)
}

#8


0  

One more solution:

 rowsum(temp$amount, format(temp$date,"%Y-%m"))

To plot it, you could use barplot:

barplot(t(rowsum(temp$amount, format(temp$date,"%Y-%m"))), las=2)
